One of the most common questions I get is “How do I build a multi-tenant application with AppSync and Cognito?”.
If you google this topic, you will no doubt come across many different opinions. It’s a topic that we’ll soon explore in the AppSync Masterclass, but I want to take this opportunity to explain my thoughts on it.
You see, a common requirement in these multi-tenant applications is to support roles within each tenant. These are usually well-defined roles in your application, and a user would fall into one of these roles within their tenant.
So you not only have to isolate data access by tenant but also restrict access to certain operations by role. Users at Tenant 1 can only access Tenant 1’s data. Furthermore, a ReadOnly user at Tenant 1 cannot modify any of its data, and only SuperUser and Admin users are allowed to manage Tenant 1’s users.
My preferred way of accomplishing all this is to:
- Model the roles as Cognito groups.
- Model the tenants as Cognito attributes.
- Never accept tenantId as an argument in the GraphQL schema.
Let me explain.
Model tenant as Cognito attributes
To scope every request to a particular tenant, you need to get the tenant ID from somewhere.
Assuming that you’re using AppSync with Cognito, a good place to capture the tenant ID is a Cognito custom attribute. This way, the tenant ID is available in the $context.identity.claims object, which both VTL templates and Lambda resolvers can read.
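To make this concrete, here is a minimal sketch of how a Lambda resolver might read the tenant ID from the claims. It is written in Python for illustration. The attribute name custom:tenantId and the helper get_tenant_id are my assumptions, not something Cognito gives you out of the box, and it assumes the client authenticates with a token that actually carries the custom attribute (by default that is the ID token).

```python
# Hypothetical custom attribute name; Cognito prefixes custom attributes with "custom:".
TENANT_CLAIM = "custom:tenantId"


def get_tenant_id(event: dict) -> str:
    """Return the caller's tenant ID from the Cognito claims on an AppSync Lambda event."""
    identity = event.get("identity") or {}
    claims = identity.get("claims") or {}
    tenant_id = claims.get(TENANT_CLAIM)
    if not tenant_id:
        # Fail closed: a request without a tenant claim must not touch any data.
        raise PermissionError("caller has no tenant ID claim")
    return tenant_id
```

Failing closed when the claim is missing means a misconfigured user can't accidentally fall through to an unscoped query.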
Since the tenant ID comes from Cognito, you can trust that it hasn’t been tampered with. However, you still need to ensure the correct value is set in the first place. For instance, a malicious user from tenant A must not be able to register a new user with tenant B’s tenant ID.
To protect yourself against this attack vector, you can set AllowAdminCreateUserOnly to true.
UserPool:
  Type: AWS::Cognito::UserPool
  Properties:
    AdminCreateUserConfig:
      AllowAdminCreateUserOnly: true
    ...
This way, when a new tenant is created, your backend (maybe a Lambda function?) would also create an admin user for this tenant and set the user’s tenant ID accordingly. From then on, this admin user can register other users by talking to your AppSync API instead of calling Cognito directly.
For example, you might have an addUser mutation like this:
type Mutation {
  addUser(name: String!, email: AWSEmail!, role: Role!): User
  ...
}
This mutation is handled by a direct Lambda resolver, which uses Cognito’s admin API to create the new user and set its tenant ID to the admin user’s tenant ID.
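As a rough sketch, the resolver behind addUser could look something like this in Python. The cognito parameter is expected to behave like a boto3 "cognito-idp" client, and admin_create_user and admin_add_user_to_group are real operations on that client. Everything else is an assumption for illustration: the custom:tenantId attribute name, using the email as the username, the shape of the returned User, and the add_user function itself.

```python
TENANT_ATTRIBUTE = "custom:tenantId"  # hypothetical custom attribute name


def add_user(event: dict, cognito, user_pool_id: str) -> dict:
    """Create a Cognito user in the caller's tenant.

    `cognito` is expected to behave like a boto3 "cognito-idp" client.
    """
    # The tenant ID comes from the caller's claims, never from the arguments.
    tenant_id = event["identity"]["claims"][TENANT_ATTRIBUTE]
    args = event["arguments"]

    cognito.admin_create_user(
        UserPoolId=user_pool_id,
        Username=args["email"],  # assumes the pool accepts email as the username
        UserAttributes=[
            {"Name": "name", "Value": args["name"]},
            {"Name": "email", "Value": args["email"]},
            {"Name": TENANT_ATTRIBUTE, "Value": tenant_id},
        ],
    )
    # Roles are modelled as Cognito groups, so the role becomes a group membership.
    cognito.admin_add_user_to_group(
        UserPoolId=user_pool_id,
        Username=args["email"],
        GroupName=args["role"],
    )
    # Assumed shape for the User type; adjust to match your schema.
    return {"name": args["name"], "email": args["email"], "role": args["role"]}
```

Passing the client in as a parameter keeps the function easy to test with a fake. In a real handler you would likely create the client once with boto3.client("cognito-idp") outside the handler and pass it in.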
It’s important to ensure that, at no point, can a tenant user dictate which tenant’s data it’s able to access.
This is why you should never take the tenant ID as a request argument. The tenant ID used in all your data access operations (e.g. DynamoDB reads and writes) needs to come from Cognito.
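Here is one way that might look for a DynamoDB read, sketched in Python. It assumes a table whose partition key is called tenantId and a custom:tenantId claim. Both names, and the build_tenant_query helper, are hypothetical. The point is only that the key condition is built from the claims and that anything the client sends in the arguments is ignored.

```python
def build_tenant_query(event: dict, table_name: str) -> dict:
    """Build keyword arguments for a DynamoDB Query scoped to the caller's tenant."""
    # Deliberately ignore anything called tenantId in event["arguments"].
    tenant_id = event["identity"]["claims"]["custom:tenantId"]
    return {
        "TableName": table_name,
        # Assumes "tenantId" is the table's partition key.
        "KeyConditionExpression": "tenantId = :tenantId",
        "ExpressionAttributeValues": {":tenantId": {"S": tenant_id}},
    }
```

The returned dictionary uses the low-level attribute value format, so it could be passed to a boto3 DynamoDB client as client.query(**build_tenant_query(event, table_name)).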
“Hey, Yan, how do I make sure that only Admin and SuperUser roles can create new users?”
That’s where roles and Cognito groups come in.
Model roles as Cognito groups
AppSync has an awesome integration with Cognito groups, which lets you specify which users are allowed to perform which GraphQL operations.
If only Admin and SuperUser users can manage a tenant’s users, then you can restrict access to the addUser mutation using the @aws_auth directive.
type Mutation {
  addUser(name: String!, email: AWSEmail!, role: Role!): User
    @aws_auth(cognito_groups: ["Admin", "SuperUser"])
  ...
}
This makes Cognito groups a natural way to model roles within your application and use them to restrict access to certain operations.
“But how do I prevent privilege escalation? Like, if a SuperUser decides to give himself admin permissions by creating a new admin user…”
Great question! Sometimes using the @aws_auth directive alone just isn’t enough. You sometimes need to do additional validation in the request VTL template or in the Lambda resolver’s body.
In this particular case, since the operation involves calling Cognito’s admin API, you’re probably using a direct Lambda resolver. In that case, you can see which groups the caller belongs to in event.identity.groups and return an error if a SuperUser user attempts to create an Admin user.
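A minimal sketch of that check in Python might look like this. The only rule taken from the scenario above is that a SuperUser cannot create an Admin. The rest of the GRANTABLE_ROLES table, and the assert_can_grant helper, are assumptions I made up for illustration, so adapt them to your own role model.

```python
# Assumed policy: which roles each role may hand out. Only the rule that a
# SuperUser cannot create an Admin comes from the scenario; the rest is illustrative.
GRANTABLE_ROLES = {
    "Admin": {"Admin", "SuperUser", "ReadOnly"},
    "SuperUser": {"SuperUser", "ReadOnly"},
}


def assert_can_grant(event: dict, requested_role: str) -> None:
    """Raise PermissionError unless one of the caller's groups may grant the role."""
    # groups can be missing or null when the caller belongs to no Cognito group.
    caller_groups = event["identity"].get("groups") or []
    allowed = set()
    for group in caller_groups:
        allowed |= GRANTABLE_ROLES.get(group, set())
    if requested_role not in allowed:
        raise PermissionError(f"not allowed to create a {requested_role} user")
```

You would call this at the top of the addUser resolver, before touching Cognito, with the role taken from the mutation arguments.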
Tools like Lumigo make it easy for you to see the invocation event for your Lambda functions without having to litter your code with trace logging. Your functions are auto-instrumented, so you can quickly debug any issues that come up and develop your application faster.
Wrap up
As I mentioned at the start, my preferred way of building multi-tenant applications with AppSync and Cognito is to:
- Model the roles as Cognito groups.
- Model the tenants as Cognito attributes.
- Never accept tenantId as an argument in the GraphQL schema.
This approach is simple and has worked for me time and time again.
But sometimes, you have more complex use cases that this approach cannot accommodate. For example, I recently implemented a custom IAM system to cater for an app where users can have different roles at multiple organizations within a hierarchy. The access a user has depends on both the roles it holds at those organizations and the roles it inherits from the hierarchy.
From the above example, this user will have the following permissions:
- ReadOnly at Parent Org, Child Org 2 and Grandchild Org
- Admin at Child Org 1
If you’re interested in reading about how we built this system (and why!) then come back next week for my next update.
And if you want to learn more about AppSync and GraphQL, then check out my video course – the AppSync Masterclass – and save 30% while we’re still in early access!