Introduction to Open Policy Agent (OPA) Rego Language

Jacob Martin - Sep 16 '22 - Dev Community

Policy as Code has been a very hot topic recently. It allows you to codify your rules and decision-making to execute them in an automated way. This lets products expose a programmatic, composable language for policies instead of having to build very complex purpose-specific UIs with all options available. One of the most – if not the most – popular Policy as Code engines is Open Policy Agent, used in many projects, like Kubernetes and Envoy, and also extensively used in Spacelift. It’s open source and uses the Rego language for policy authoring.

Rego, however, works very differently from most languages and can be quite unintuitive at first glance. It’s actually more similar to SQL than to common imperative languages like Python, which means the learning curve can be quite steep. Moreover, copy-paste development will very often not help you understand Rego, or author complicated policies, any better.

This is precisely why I wrote this article. It’s meant to guide you through some of the fundamental constructs and mechanisms of Rego, especially those that we’ve seen used a lot in the wild, so that you can get a better intuitive understanding of how it all fits together and how to author larger, more advanced policies. It doesn’t aim to be a complete (or even sizeable) reference, nor is it a Spacelift-specific guide. If you want to follow along and play with the examples, the Open Policy Agent playground is the best place to do so.

Decisions

When talking about Rego, I think it’s best to start with decisions. Decisions in Rego are used as the output of policies but also as temporary variables all over the place. They don’t have to be true/false either – which is one of the common misconceptions! Decisions can be strings, arrays, objects, sets, etc. You can have as many decisions in your policy as you want.

For example, you can have a simple boolean decision with a constant,

allow = true

or reference a different variable.

allow = is_admin

So far, this is basically normal variable (decision) assignment.
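Decisions of other types are assigned in the same way. The following is a minimal sketch, not taken from any real policy: the package name and all four decision names are made up for illustration.

```rego
package play

# Each of these is a complete decision, and none of them is a boolean.
home_region := "eu-west-1"                  # string
open_ports := [80, 443]                     # array
limits := {"cpu": 2, "memory_gb": 4}        # object
trusted_teams := {"platform", "security"}   # set
```

Evaluating the package returns all four decisions side by side, each with its own type.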

Rego also allows you to use block notation for assignments:

allow {
    accessible_by_admin := startswith(path, "/admin/")
    accessible_by_admin
    is_admin
}

where each line can be an assignment or true/false expression. The result of the whole block will be true only if the expression in each line evaluates to true, letting you build complex rules that use many other decisions and variables inside of them. We can check a bunch of predicates, and only if all checks succeed does the decision come out to be true.
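To make the "every line must hold" behaviour concrete, here is a self-contained sketch that can be pasted into the playground. The package name and the input shape (a path plus a user.admin flag) are assumptions made for this example. It uses the same block syntax as the rest of this article; OPA 1.0 and later expect the if keyword between the rule name and the block.

```rego
package play

# Assumed input: {"path": "/admin/users", "user": {"admin": true}}
allow {
    startswith(input.path, "/admin/")   # check 1: the path is an admin path
    input.user.admin                    # check 2: the caller is an admin
}
```

With the assumed input, both lines hold and allow is true. Change admin to false and the second line fails, so the block as a whole yields nothing.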

You can also have a decision that’s an OR of two other decisions, like here:

is_admin {
    input.user.admin
}

is_endpoint_public {
    startswith(path, "/public/")
}

allow {
    is_admin
}
allow {
    is_endpoint_public
}

As you can see, a block for a decision can be repeated multiple times, and the decision will be true if any of the blocks evaluate to true.
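The same OR can be written without the helper decisions. This compact sketch assumes an input document with path and user.admin fields, which is an illustrative shape rather than anything prescribed by OPA.

```rego
package play

# Either block is enough to make allow true.
allow { input.user.admin }
allow { startswith(input.path, "/public/") }

# {"path": "/public/index.html", "user": {"admin": false}} -> allow is true (second block)
# {"path": "/admin/users",       "user": {"admin": true}}  -> allow is true (first block)
# {"path": "/admin/users",       "user": {"admin": false}} -> neither block succeeds
```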

Some

The above block is very simple and linear, but we can also do something slightly more complicated. Let’s say we have a request path and an array allowed_path_prefixes and want to check if the path matches any prefix. In that case, we can specify an additional variable for the index:

allow {
    some i
    startswith(request.path, allowed_path_prefixes[i])
}

Intuitively, Rego will choose an i, move to the second line, and if it fails, go back, try another i, and so on until it finds an i that makes every line true. If no such i is found, allow will have no value at all: it will be undefined, rather than false or null.

In human terms, you can read this as “set allow to true if, for some value of i, the request path starts with the i’th allowed path prefix,” or, in even more human terms, “set allow to true if the request path starts with any of the allowed path prefixes.”
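A runnable version of this rule might look like the sketch below. The prefix list and the input.request.path field are assumptions for illustration. The default line is an optional extra: it gives allow a concrete false value when no i matches, instead of leaving it undefined.

```rego
package play

# Hypothetical prefix list; a real policy might load this from data instead.
allowed_path_prefixes := ["/public/", "/docs/"]

# Fallback value used when no block below succeeds.
default allow = false

allow {
    some i
    startswith(input.request.path, allowed_path_prefixes[i])
}

# {"request": {"path": "/docs/intro"}} -> allow is true (i = 1 matches)
# {"request": {"path": "/admin"}}      -> allow is false (from the default)
```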

If you have more such variables, you can add them separated by commas next to the i:

some i, j, k, l, m, n, o, p

Additionally, if this i is only needed in a single place, like in the above example, you can substitute it with an underscore instead:

allow {
    startswith(request.path, allowed_path_prefixes[_])
}

We can also take a look at an example that’s closer to Spacelift, where we might want to decide if a commit is worth a Terraform execution or whether it should just be ignored. We get a list of files changed by the commit, and we also have a couple of paths that are of interest to Terraform. Here we’ll check if any of the paths changed starts with one of the interesting paths:

interesting_path_prefixes := [
  "/src/terraform",
  "/src/modules"
]
track {
    startswith(input.affected_files[_], interesting_path_prefixes[_])
}

We’re using two underscores in a single line to mean “is there any pair of (affected file, interesting path prefix) such that the file starts with the prefix.”
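Each underscore is its own anonymous variable, so the rule can be spelled out with two named indices instead. A sketch of that equivalent form, with an assumed affected_files input:

```rego
package play

interesting_path_prefixes := ["/src/terraform", "/src/modules"]

# Same meaning as the two-underscore version: i and j vary independently.
track {
    some i, j
    startswith(input.affected_files[i], interesting_path_prefixes[j])
}

# {"affected_files": ["/README.md", "/src/modules/vpc/main.tf"]}
#   -> track is true (i = 1, j = 1)
```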

But wait, there’s no allow rule here? Is that a valid policy? Yes, it is! Policies can have arbitrary sets of decisions, with those decisions being of arbitrary types. Allow-based policies are just one of the most obvious use cases; the power of the Rego language extends much further and can be used for all kinds of decisions, as exemplified by the rich selection of policies available in Spacelift.

Sets

Decisions can also be specified as sets:

allowed_users := ["papaya", "potato"]
allow["papaya"] {
    "papaya" == allowed_users[_]
}

which means “Papaya should be in the allow set if it belongs to the allowed_users list.”

Moreover, with this block notation, you can specify the element as a variable reference from the block itself:

allow[user] {
    user := input.user
    user == allowed_users[_]
}

Using just a single block, you can even specify multiple elements of the set, by having multiple evaluation paths that successfully reach the end of the block and each path having a different user variable.

allow[user] {
    user := input.users[_]
    user == allowed_users[_]
}
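As a worked example of several evaluation paths feeding one set, the sketch below adds an assumed input.users list and a second, hypothetical decision (papaya_allowed) that reads from the resulting set. It keeps this article's syntax; newer OPA versions write the set rule as allow contains user if { ... }.

```rego
package play

allowed_users := ["papaya", "potato"]

# One element is added for every (input user, allowed user) pair that matches.
allow[user] {
    user := input.users[_]
    user == allowed_users[_]
}

# Hypothetical follow-up decision: look a single element up in the set.
papaya_allowed {
    allow["papaya"]
}

# {"users": ["papaya", "tomato", "potato"]}
#   -> allow is {"papaya", "potato"} and papaya_allowed is true
```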

The value in the square brackets can actually be an arbitrary expression referencing the block’s variables, so if we have a policy whose decisions are warnings based on resources changed, we could do the following:

forbidden := {"expensive_resource", "expensive_resource2"}
warn[sprintf("You shall not use %s", [resource_name])] {
    resource_name := input.resources_changed[_].name
    forbidden[resource_name]
}

which checks for forbidden resources and displays a pretty warning message if one of them is changed. In this example, you can also see the usage of a set containment check in the last line of the block.
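The same policy can also build the message inside the block and put a plain variable in the brackets, which some people find easier to read. This is a sketch with an assumed resources_changed input; note that sprintf takes its values as an array, even when there is only one.

```rego
package play

forbidden := {"expensive_resource", "expensive_resource2"}

warn[msg] {
    resource_name := input.resources_changed[_].name
    forbidden[resource_name]   # set lookup: succeeds only if the name is a member
    msg := sprintf("You shall not use %s", [resource_name])
}

# {"resources_changed": [{"name": "cheap_resource"}, {"name": "expensive_resource"}]}
#   -> warn is {"You shall not use expensive_resource"}
```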

Functions

A more advanced feature of Rego is that it lets you define custom functions you can use as helpers in your policy. Writing functions is very similar to writing block decisions, but with some minor differences:

plus_custom(a, b) := c {
    c := a + b
}
out := plus_custom(42, 43)

You can see we’re specifying a list of arguments, the variable that should be used as the output, and then have a normal block body. The output of the function will be c as long as the function successfully reaches the end of its body.
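The "as long as it reaches the end of its body" part matters in practice: a function whose body fails has no output at all. The helper below is a hypothetical illustration of that, built only from the startswith, count and substring built-ins.

```rego
package play

# Hypothetical helper: returns the part of path after prefix.
# If path does not start with prefix, the first line fails and there is no output.
strip_prefix(path, prefix) := rest {
    startswith(path, prefix)
    rest := substring(path, count(prefix), -1)   # -1 means "to the end of the string"
}

relative := strip_prefix("/src/modules/vpc", "/src/")   # "modules/vpc"
missing := strip_prefix("/tmp/x", "/src/")              # undefined, so absent from the result
```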

However, instead of that output variable, we could also, again, have an arbitrary expression, a constant, for instance:

bucket_is_secure(bucket) := true {
    not bucket.public
    bucket.encrypted
}
bucket := {"public": false, "encrypted": false}
out := bucket_is_secure(bucket)

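In this case, the function evaluates to true if the bucket is not public and is encrypted. The sample bucket above is not encrypted, so the function body fails and out is left undefined rather than false.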
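A helper like this becomes more useful when combined with the set decisions from earlier. The sketch below collects the names of buckets that fail the check; the input.buckets shape and the insecure decision are assumptions made for this example.

```rego
package play

bucket_is_secure(bucket) := true {
    not bucket.public
    bucket.encrypted
}

# Add a bucket's name to the set whenever the helper does not return true for it.
insecure[name] {
    b := input.buckets[_]
    not bucket_is_secure(b)
    name := b.name
}

# {"buckets": [{"name": "logs", "public": false, "encrypted": true},
#              {"name": "site", "public": true,  "encrypted": true}]}
#   -> insecure is {"site"}
```

Because an undefined function call is treated as a failed check, not bucket_is_secure(b) succeeds for every bucket that is public, unencrypted, or both.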

Summary

All of this has just been a quick overview of the parts of Rego we’ve seen most commonly used. With these building blocks, you should be much better equipped to begin authoring your own policies, whichever project you’re using them with. If you want to learn more, the whole Rego policy language is much bigger, and the best place to dive in is the Official Open Policy Agent Policy Language documentation.

If you think Rego is cool and would like to play with a product where it’s put front-and-center, we’d love you to take Spacelift for a spin!
