The video for this tutorial is also available on YouTube
In GraphQL and Graph Databases, Paul Wilton gives an overview of the benefits of combining a Graph database with GraphQL, and how you might align your GraphQL schemas with the same ontology models that describe the data in a graph represented by a graph database.
William Lyon has also championed the idea of the GRANDstack, which combines technologies like React and Apollo with GraphQL and the Neo4j graph database.
As an AWS person, I became really interested in how I might take advantage of an AppSync GraphQL API backed by a graph database. There are many great options to choose from, including Neo4j and ArangoDB, which I hope to try out sometime soon, but for this build I chose to use Amazon Neptune.
Neptune is a fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Amazon Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latency.
In this tutorial, I'll walk through how to build a Neptune-backed AppSync GraphQL API with AWS CDK, AWS AppSync, and AWS Lambda.
We will be using AWS Lambda direct resolvers to write the business logic for the API, which will interact with Amazon Neptune over WebSockets using Gremlin.
The code for this project is located here
Prerequisites
- AWS Account
- CDK configured on your local machine
Getting started
To get started, we'll be using the CDK CLI to initialize a new project. To do so, create a new empty folder and initialize a new CDK project in TypeScript:
cdk init --language=typescript
Next, open tsconfig.json and set noImplicitAny to false:
"noImplicitAny": false,
Now let's install the dependencies we'll need to create the infrastructure using either npm or yarn:
yarn add @aws-cdk/aws-appsync @aws-cdk/aws-lambda @aws-cdk/aws-ec2 @aws-cdk/aws-neptune
Next, we'll be working in the lib/your-project-name-stack.ts file to create our stack.
To get started, first go ahead and import the CDK classes and constructs that we'll be using:
// lib/your-project-name-stack.ts
import * as cdk from '@aws-cdk/core'
import * as appsync from '@aws-cdk/aws-appsync'
import * as lambda from '@aws-cdk/aws-lambda'
import * as ec2 from '@aws-cdk/aws-ec2'
import * as neptune from '@aws-cdk/aws-neptune'
Note - We are only importing ec2 to create a VPC in which we will place both our function and our Neptune instance.
Creating the GraphQL API
Let's now create the GraphQL API. To do so, add the following lines of code to the stack, below the call to super:
const api = new appsync.GraphqlApi(this, 'Api', {
  name: 'NeptuneAPI',
  schema: appsync.Schema.fromAsset('graphql/schema.graphql'),
  authorizationConfig: {
    defaultAuthorization: {
      authorizationType: appsync.AuthorizationType.API_KEY
    },
  },
})
Here, we've created an API named NeptuneAPI and have set some basic configuration, including the location of the GraphQL schema at graphql/schema.graphql.
Next, go ahead and create a new folder named graphql at the root directory and add a file within it named schema.graphql. Here, add the following schema:
type Post {
  id: ID!
  title: String!
  content: String!
}

input PostInput {
  title: String!
  content: String!
}

type Query {
  listPosts: [Post]
}

type Mutation {
  createPost(post: PostInput!): Post
}

type Subscription {
  onCreatePost: Post
    @aws_subscribe(mutations: ["createPost"])
}
The API will be a pretty basic Blog API, allowing us to create and query for Posts.
Lambda and VPC
Next, below the API code, create the VPC and the Lambda function:
const vpc = new ec2.Vpc(this, 'NeptuneVPC')

const lambdaFn = new lambda.Function(this, 'Lambda Function', {
  runtime: lambda.Runtime.NODEJS_14_X,
  handler: 'main.handler',
  code: lambda.Code.fromAsset('lambda-fns'),
  memorySize: 1024,
  vpc
})
We've set some basic configuration for the Lambda function, including the runtime, the memory size, the location of the code (the lambda-fns directory), and the handler (the handler export of main).
Next, we will add the Lambda function as a datasource to the AppSync API and create the resolvers for the query and mutation we defined in the GraphQL schema. Add the following code below the Lambda function definition:
const lambdaDs = api.addLambdaDataSource('lambdaDatasource', lambdaFn);

lambdaDs.createResolver({
  typeName: "Query",
  fieldName: "listPosts"
})

lambdaDs.createResolver({
  typeName: "Mutation",
  fieldName: "createPost"
})
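Because these resolvers have no mapping templates, AppSync hands the resolver context to the function as the Lambda event. Below is a minimal sketch of the part of that payload this tutorial relies on. The ResolverEvent type and the describeField helper are illustrative names invented for this example, and the real event carries more fields than are shown here.

```typescript
// Illustrative subset of the event a direct Lambda resolver receives.
// The type name is an assumption made for this sketch.
type ResolverEvent = {
  arguments: Record<string, unknown>
  info: {
    parentTypeName: string // e.g. "Query" or "Mutation"
    fieldName: string // e.g. "listPosts" or "createPost"
  }
}

// Hypothetical helper: returns "Type.field", handy for logging or routing.
function describeField(resolverEvent: ResolverEvent): string {
  return `${resolverEvent.info.parentTypeName}.${resolverEvent.info.fieldName}`
}

// Example payload shaped like a createPost mutation call.
const sampleEvent: ResolverEvent = {
  arguments: { post: { title: 'Hello', content: 'World' } },
  info: { parentTypeName: 'Mutation', fieldName: 'createPost' },
}

console.log(describeField(sampleEvent))
```

The two resolvers above share one data source, so the function has to inspect info.fieldName to decide what to do.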
Creating the Neptune database
The last thing we need to do is create the Neptune database. We will also be getting a reference to the read and write endpoints to make them available as environment variables so we can reference them in our Lambda function:
const cluster = new neptune.DatabaseCluster(this, 'NeptuneCluster', {
  vpc,
  instanceType: neptune.InstanceType.R5_LARGE
})

cluster.connections.allowDefaultPortFromAnyIpv4('Open to the world')

const writeAddress = cluster.clusterEndpoint.socketAddress
const readAddress = cluster.clusterReadEndpoint.socketAddress

lambdaFn.addEnvironment('WRITER', writeAddress)
lambdaFn.addEnvironment('READER', readAddress)

// The next two outputs are not required; they just log the endpoints to your terminal for reference
new cdk.CfnOutput(this, 'readaddress', {
  value: readAddress
})
new cdk.CfnOutput(this, 'writeaddress', {
  value: writeAddress
})
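The socketAddress of an endpoint is the hostname and port joined as host:port, which is the form the Lambda code later places inside a wss:// URL. The sketch below shows how that URL could be built with a guard for a missing variable. The gremlinUrl helper and the example address are hypothetical. The port 8182 is Neptune's default.

```typescript
// Hypothetical helper: turns a "host:port" socket address into a Gremlin URL.
function gremlinUrl(socketAddress: string | undefined): string {
  if (!socketAddress) {
    // Fail early instead of connecting to "wss://undefined/gremlin"
    throw new Error('Neptune endpoint is not configured')
  }
  return `wss://${socketAddress}/gremlin`
}

// Placeholder address for illustration only.
const exampleAddress = 'my-cluster.cluster-abc123.us-east-1.neptune.amazonaws.com:8182'
console.log(gremlinUrl(exampleAddress))
```

In the function itself, the argument would be process.env.WRITER or process.env.READER.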
Adding the Lambda function code
When we created the Lambda function, we referenced code located in the lambda-fns directory but we have not written that code just yet.
To get started doing so, create the directory, initialize a new package.json file, and install gremlin.
mkdir lambda-fns
cd lambda-fns
npm init -y
yarn add gremlin
cd ..
Next, create the following 4 files in the lambda-fns directory:
- Post.ts
- main.ts
- createPost.ts
- listPosts.ts
Let's create the code for each of these files:
Post.ts
type Post = {
  id: string;
  title: string;
  content: string;
}
export default Post
createPost.ts
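const gremlin = require('gremlin')
import Post from './Post'

const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection
const Graph = gremlin.structure.Graph
// Writer endpoint (host:port), set as an environment variable by the CDK stack
const uri = process.env.WRITER

async function createPost(post: Post) {
  const dc = new DriverRemoteConnection(`wss://${uri}/gremlin`, {})
  const graph = new Graph()
  const g = graph.traversal().withRemote(dc)
  try {
    // Add a vertex labelled 'posts' with the title and content as properties
    const data = await g.addV('posts').property('title', post.title).property('content', post.content).next()
    // Use the id of the new vertex as the post id
    post.id = data.value.id
    return post
  } finally {
    // Close the connection even if the traversal throws
    await dc.close()
  }
}
export default createPost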
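Both Lambda modules open a connection, run a traversal, and close the connection again. One way to make sure the close always happens, whether or not the traversal throws, is to factor that pattern into a small helper built on try/finally. This is only a sketch: withConnection, Closable, and FakeConnection are names invented here, and the fake connection stands in for a Gremlin DriverRemoteConnection so the example runs without a database.

```typescript
// Minimal shape of anything with an async close().
// The interface name is an assumption made for this sketch.
interface Closable {
  close(): Promise<void>
}

// Hypothetical helper: runs `work`, then always closes the connection.
async function withConnection<C extends Closable, T>(
  open: () => C,
  work: (conn: C) => Promise<T>
): Promise<T> {
  const conn = open()
  try {
    return await work(conn)
  } finally {
    // Runs on both the success path and the error path
    await conn.close()
  }
}

// Stand-in connection that only records whether it was closed.
class FakeConnection implements Closable {
  closed = false
  async close(): Promise<void> {
    this.closed = true
  }
}

const fakeConn = new FakeConnection()
const outcome = withConnection(() => fakeConn, async () => 'done')
outcome.then((value) => console.log(value, fakeConn.closed))
```

In the real functions, open would construct the DriverRemoteConnection and work would build the traversal source and run the query.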
listPosts.ts
const gremlin = require('gremlin')

const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection
const Graph = gremlin.structure.Graph
// Reader endpoint (host:port), set as an environment variable by the CDK stack
const uri = process.env.READER

const listPosts = async () => {
  const dc = new DriverRemoteConnection(`wss://${uri}/gremlin`, {})
  const graph = new Graph()
  const g = graph.traversal().withRemote(dc)
  try {
    // Fetch every vertex labelled 'posts'
    const data = await g.V().hasLabel('posts').toList()
    const posts: any[] = []
    for (const v of data) {
      // Load the properties of this vertex and fold them into a plain object
      const _properties = await g.V(v.id).properties().toList()
      const post = _properties.reduce((acc, next) => {
        acc[next.label] = next.value
        return acc
      }, {})
      post.id = v.id
      posts.push(post)
    }
    return posts
  } catch (err) {
    console.log('ERROR', err)
    return null
  } finally {
    // Close the connection on both the success and error paths
    await dc.close()
  }
}
export default listPosts
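The reduce call in listPosts turns a list of vertex properties, each with a label and a value, into one object keyed by label, and the vertex id is then attached. The same step is shown below in isolation with hand-written sample data. VertexProperty and foldProperties are names assumed for this sketch, and the sample values are made up.

```typescript
// The two fields of a vertex property that listPosts reads (assumed shape).
type VertexProperty = { label: string; value: unknown }

// Folds a list of properties into a plain object and attaches the vertex id.
function foldProperties(id: unknown, properties: VertexProperty[]): Record<string, unknown> {
  const folded = properties.reduce<Record<string, unknown>>((acc, next) => {
    acc[next.label] = next.value
    return acc
  }, {})
  // The id is written last, so it wins over any property that is also named "id"
  return { ...folded, id }
}

const sampleProperties: VertexProperty[] = [
  { label: 'title', value: 'My first post!!' },
  { label: 'content', value: 'Hello world' },
]

console.log(foldProperties('abc-123', sampleProperties))
```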
main.ts
import createPost from './createPost';
import listPosts from './listPosts';
import Post from './Post';

type AppSyncEvent = {
  info: {
    fieldName: string
  },
  arguments: {
    post: Post
  }
}

exports.handler = async (event:AppSyncEvent) => {
  switch (event.info.fieldName) {
    case "createPost":
      return await createPost(event.arguments.post);
    case "listPosts":
      return await listPosts();
    default:
      return null;
  }
}
In main.ts we switch on event.info.fieldName, which is the name of the GraphQL query or mutation that triggered the function, and invoke the matching function.
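As the number of fields grows, the same dispatch can be expressed as a lookup table instead of a switch. This is a sketch of that alternative, not the approach used in this project: the route function, the RouterEvent type, and the stub resolvers are all hypothetical and return canned data so the example runs on its own.

```typescript
type PostInput = { title: string; content: string }
type RouterEvent = {
  info: { fieldName: string }
  arguments: { post?: PostInput }
}
type Resolver = (routerEvent: RouterEvent) => Promise<unknown>

// Stub resolvers standing in for the real createPost and listPosts.
// A Map is used so that only explicitly registered field names match.
const resolvers = new Map<string, Resolver>([
  ['createPost', async (e) => ({ id: 'stub-id', ...e.arguments.post })],
  ['listPosts', async () => []],
])

// Looks up the resolver for the field, or returns null for unknown fields.
async function route(routerEvent: RouterEvent): Promise<unknown> {
  const resolver = resolvers.get(routerEvent.info.fieldName)
  return resolver ? resolver(routerEvent) : null
}

route({
  info: { fieldName: 'createPost' },
  arguments: { post: { title: 'Hello', content: 'World' } },
}).then((created) => console.log(created))
```

Unknown field names still resolve to null, which matches the default branch of the switch.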
Deploying and testing
Now that we have finished writing the code, we are ready to deploy and test it out. To do so, run a build and then deploy:
npm run build && cdk deploy
Once the deployment is complete, you should be able to test it out. To do so, visit the AppSync Console and click on Queries in the left-hand menu.
Execute the following queries to create and query for data from Neptune:
query listPosts {
  listPosts {
    id
    title
    content
  }
}

mutation createPost {
  createPost(post: {
    content: "Hello world"
    title: "My first post!!"
  }) {
    id
    title
    content
  }
}
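Outside the console, the same operations can be sent over HTTPS. AppSync accepts a POST request with a JSON body and, for API_KEY authorization, expects the key in an x-api-key header. The sketch below only builds such a request and does not send it. The buildGraphqlRequest helper is hypothetical, and the URL and key are placeholders to be replaced with the values for your own API.

```typescript
type GraphqlRequest = {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Hypothetical helper: packages a GraphQL operation as an HTTP POST.
function buildGraphqlRequest(
  url: string,
  apiKey: string,
  query: string,
  variables: Record<string, unknown> = {}
): GraphqlRequest {
  return {
    url,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', 'x-api-key': apiKey },
      body: JSON.stringify({ query, variables }),
    },
  }
}

const listPostsQuery = 'query listPosts { listPosts { id title content } }'

// Placeholder endpoint and key, for illustration only.
const listRequest = buildGraphqlRequest(
  'https://example.invalid/graphql',
  'placeholder-api-key',
  listPostsQuery
)

console.log(listRequest.init.body)
```

In a runtime that provides fetch, the request could then be sent with fetch(listRequest.url, listRequest.init).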
Conclusion
This introduction is not meant to be a deep dive into querying and traversing Neptune, or a guide to working with data in a graph database. Instead, it shows how to put all of the pieces together so that the infrastructure and API are set up and you can get started with this stack.
To learn more, I recommend diving deeper into the documentation for Gremlin and Neptune, as well as general guides on working with graph databases.