Cruddur - The not so great twitter clone using Ruby Sinatra + React + GCP CloudRun + MongoDB Atlas + Terraform

Andrew Brown 🇨🇦 - Dec 8 '22 - - Dev Community

Hackathon Entry

Hey, it's Andrew Brown 👋 and I'm entering the MongoDB Atlas hackathon!

  • Will I win? I doubt it
  • Have I learned something new? Quite a bit

The repo exists here. I built everything as I went:
https://github.com/omenking/mongodb-atlas-gcp-microblog

I did attempt to provide step-by-step tutorial instructions in the /docs directory, but it got a bit squirrelly near the end.

What does the app do?

Not a whole lot.

You can:

  • Write a microblog post that will show up on main feed
  • View a home feed
  • View a feed of a specific user's microblog
  • Search microblogs using the search (but only through the API not the UI lol)

You can't:

  • Login or Signup
  • Use mentions or hashtags
  • Reshare with comment
  • Reply
  • Follow other users

So in terms of being a useful app, it's not that great... 😂

I was more interested in the cloud infrastructure and in showing folks how to deploy an app to GCP Cloud Run with a database backed by MongoDB Atlas.

What was the tech stack?

I did all the development using Gitpod.


Time and Effort

I built this application over seven non-consecutive days.

My team (ExamPro) helped me out in a few areas:

  • solving React syntax
  • debugging networking issues on GCP
  • debugging CORS issues
  • debugging container env var issues
  • evaluating the mongodb driver
  • figuring out MongoDB Atlas Search (it's not fully documented for Ruby yet)
  • navigating the MongoDB Atlas UI for specific access controls

If I had to break down the time spent, it would look something like this:

Hours  Task
4      Building the Sinatra app
6      Building the React app
2      Figuring out Docker containerization
1      Pushing containers to Artifact Registry
2      Deploying containers to Cloud Run via Terraform
5      Troubleshooting and configuring CORS
6      Troubleshooting Custom Domain with Google-managed certificate
5      Troubleshooting connectivity to the container through GCP Load Balancer (Classic)
9      Troubleshooting Terraform GCP Load Balancer Module
1      Writing MongoDB integration
41     Total hours

Why did you use Ruby and Sinatra?

I love Ruby, full stop.

I chose Sinatra because I wanted a very simple framework so I can share this project with beginners.

A more complex framework like Grape or Rails has a lot of magic going on, so a beginner might not be confident about what is actually happening.

I think a lightweight framework is better suited to the future technical path of containerized applications built with microservices and event-driven architecture (EDA) in mind.

Why did you choose React as the frontend framework?

I absolutely hate React. I would have much preferred to use mithril.js or maybe Vue, but I thought I should use a popular framework.

Next.js did cross my mind, but that framework is considerably opinionated.

So the React implementation I used:

  • functional components (because classes scare some folks)
  • plain JS (since TypeScript scares some folks)
  • no Redux (since I don't want a headache)
  • create-react-app because it made setting up the boilerplate code fast
  • React Router v6 because I assume this is the most popular.

Why did you choose GCP and GCP Cloud Run?

I am a Google Cloud Champion Innovator for the Modern Architecture category, so it was a great opportunity to create some modern application content.

For compute it could have been App Engine, Cloud Functions or Google Kubernetes Engine (GKE), but the reason I chose Cloud Run was because it has built-in CI/CD.

I never did use Cloud Run's CI/CD functionality in this project, since the time ran out troubleshooting various issues, which I will discuss.

Why did you choose Terraform?

Well I did use the Google Cloud CLI (gcloud) to provision repositories in the Artifact Registry, and did Click Ops in the console for managing the domain name in Cloud DNS.

For everything else I used Terraform. GCP has its own IaC called Cloud Deployment Manager which has its own template language via YAML files but as far as I know, nobody likes using it, and GCP supports Terraform more than their own tool in their documentation.

  • With AWS I normally use CloudFormation
  • With Azure I normally use Azure Bicep

But yeah, with GCP just go with Terraform.

Why did you use Gitpod for your developer environment?

Gitpod is a Cloud Developer Environment (CDE), which is basically VSCode in your browser.

Using a CDE made it really easy for my team members to jump in and help me in places. All they had to do was press a button, and they could get to work.

There are other options out there like AWS Cloud9 and GitHub Codespaces, but these options utilize a Virtual Machine (VM) as the attached environment.

Gitpod uses Docker (which is managed by K8s), so it's much faster to launch an environment, you don't have to "rebuild" an environment, and it's easy to discard state and get back to a clean working version in case you jank the environment.

I'm also a Gitpod Community Hero (because I really like Gitpod).

Why use MongoDB Ruby Driver instead of Mongoid?

Mongoid is the official Ruby Object Mapper for MongoDB. It's been around for a long time and makes working with MongoDB really easy.

We didn't use it because I know from using Dynamoid, the DynamoDB gem, that these libraries, while convenient, get in the way of using advanced features or obscure the fine-tuning of queries.

Alex DeBrie, in his DynamoDB Book, strongly advises against using any kind of ORM or object mapper for DynamoDB, and the same applies here to MongoDB.

I've known ORMs to get in the way even when using Postgres, so I end up opting to just write simple plain Ruby objects along with raw SQL.

Since we wanted to use MongoDB Atlas Search and were thinking about using Change Streams, Mongoid wasn't going to support these.

Using the MongoDB Ruby Driver is quite straightforward.
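
To give a feel for it, here is a minimal sketch (not the code from the repo) of writing a post and reading a feed through the driver's insert_one and find calls. The helper names, the posts collection name and the handle and created_at fields are hypothetical; only the message field is mentioned elsewhere in this post.

```ruby
# Sketch only: helper names and the handle/created_at fields are hypothetical.
# Both helpers accept any object that behaves like a Mongo::Collection.

# Inserts one microblog post and returns the document that was written.
def create_post(collection, handle:, message:)
  doc = { handle: handle, message: message, created_at: Time.now.utc }
  collection.insert_one(doc)
  doc
end

# Returns the newest posts first, optionally filtered to a single user.
def feed(collection, handle: nil, limit: 20)
  filter = handle ? { handle: handle } : {}
  collection.find(filter).sort(created_at: -1).limit(limit).to_a
end

# Wiring this to Atlas with the official driver (gem 'mongo') could look like:
#   require 'mongo'
#   client = Mongo::Client.new(ENV.fetch('MONGO_ATLAS_URL'))
#   posts  = client[:posts]   # collection name is an assumption
#   create_post(posts, handle: 'andrewbrown', message: 'Hello Cruddur')
#   feed(posts, limit: 10)
```

Keeping these as plain methods that take a collection, rather than going through an object mapper, leaves the raw query visible and easy to tune.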

Installing GCloud

GCloud is a very well built CLI. Installing it is a headache lol, mostly due to how the docs are written.

With other providers you can just copy a block of text and go, but with GCP you have to walk through all the instructions and pick out your scenario.

To make life easier, these are all the steps you generally need for Ubuntu/Debian.

sudo apt-get install apt-transport-https ca-certificates gnupg -y
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | sudo tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key --keyring /usr/share/keyrings/cloud.google.gpg add -
sudo apt-get update && sudo apt-get install google-cloud-cli -y

GCloud also takes longer to install than other cloud CLIs, so I could not create a .gitpod.yml file to just install it in my environment on launch, because the long install time would cause a timeout.

So I ended up installing it again and again on fresh environments.

Commands I ended up memorizing since I ran them so many times were:

Logging into Google Cloud

gcloud init

Login required for deploying to Google Cloud Run

gcloud auth application-default login

Authenticating to push container images to Artifact Registry

gcloud auth configure-docker <region>-docker.pkg.dev

Pushing to GCP Artifact Registry

Other providers like AWS and Azure give you one-click copy-and-paste instructions to push to their respective managed container repository services.

Not GCP lol. You have to figure out their docs, and it's not a simple copy and paste.

First you need to authenticate so you can push to Artifact Registry:

gcloud auth configure-docker us-east1-docker.pkg.dev

Then you need to tag your built image with the Artifact Registry repo address:

docker tag cruddur-app us-east1-docker.pkg.dev/cruddur/backend-sinatra/backend-sinatra:latest

Then you push:

docker push us-east1-docker.pkg.dev/cruddur/backend-sinatra/backend-sinatra:latest
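
The part that is easy to get wrong is the address itself, which follows the pattern region-docker.pkg.dev/project/repository/image:tag. As a rough sketch, a small Ruby helper can assemble it and print the two docker commands. The helper name artifact_image_uri is hypothetical and not part of any Google SDK; the values are the ones used in the commands above.

```ruby
# Builds an Artifact Registry image address:
#   <region>-docker.pkg.dev/<project>/<repository>/<image>:<tag>
# artifact_image_uri is a hypothetical helper, not part of any Google SDK.
def artifact_image_uri(region:, project:, repository:, image:, tag: 'latest')
  "#{region}-docker.pkg.dev/#{project}/#{repository}/#{image}:#{tag}"
end

uri = artifact_image_uri(
  region: 'us-east1',
  project: 'cruddur',
  repository: 'backend-sinatra',
  image: 'backend-sinatra'
)

# Print the tag and push commands instead of running them.
puts "docker tag cruddur-app #{uri}"
puts "docker push #{uri}"
```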

Renaming Build Image for Docker Compose

Instead of building images individually, I would build them via docker compose:

docker compose build

However, docker compose will use your folder name (basically your project repo name) as the prepended name.

My repo name is mongodb-atlas-gcp-microblog and a service name in the docker-compose.yml file is frontend-react, so I'd end up with the image name:

mongodb-atlas-gcp-microblog-frontend-react

which is really long. If you use the -p flag you can override the prepended name.

So the following:

docker compose -p cruddur build

This will then produce the following container image name:

cruddur-frontend-react

Google Cloud Run Port requirements

So I (re)discovered that Cloud Run expects your container to listen on port 8080.

So in the Dockerfile I had to figure out how to pass an environment variable through to the command that starts the server:

FROM ruby:3.1.0

# sets the default port (it can still be overridden)
ENV PORT=4567

ADD . /app
WORKDIR /app
RUN bundle install
EXPOSE ${PORT}
CMD [ "sh", "-c", "bundle exec rackup --host 0.0.0.0 -p $PORT"]

We could achieve that with ${PORT}.

Notice I set ENV instead of ARG. ARG only works during the build, and I wanted this environment variable to persist when the container was running.

Also, I had to wrap the command in CMD [ "sh", "-c", ... ], otherwise the environment variable would not have been interpreted in the CMD command. It would just show up blank.

Notice I am doing this:

"bundle exec rackup --host 0.0.0.0 -p $PORT"

And not this:

"bundle exec rackup --host 0.0.0.0 -p ${PORT}"

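Both forms are expanded by the shell at runtime since the command runs through sh -c, so they behave the same here. I use the plain $PORT to signal that I'm just reading the env var from the container's environment, whereas the ${PORT} on the EXPOSE line is substituted by Docker when the image is built.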
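
The same idea can be expressed on the Ruby side. This is only a sketch, assuming the app wanted to read PORT itself rather than receive it through rackup's -p flag; resolve_port is a hypothetical helper and 4567 simply mirrors the default in the Dockerfile above.

```ruby
# Resolves the port the web server should bind to.
# Cloud Run sets PORT at runtime; 4567 mirrors the Dockerfile default.
# resolve_port is a hypothetical helper for illustration.
def resolve_port(env = ENV, default: 4567)
  Integer(env.fetch('PORT', default.to_s), 10)
end

puts "binding to 0.0.0.0:#{resolve_port}"
```

Using Integer instead of to_i means a malformed PORT value raises at boot rather than silently becoming 0.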

CORS, CORS, CORS!

Once I attempted to have the frontend and backend talking to each other I ran into CORS, as always.

So I installed sinatra-cors. There were a few different CORS gems to choose from, but this one was dead simple.

Wildcarding on part of the domain, e.g. .gitpod.io, was not working, so I had to pass the full domains for both services to resolve CORS.

When you deploy to Google Cloud Run, they generate an endpoint URL so you can access the site.

I thought I could get this endpoint URL via an environment variable that Cloud Run might set by default, so that I could whitelist these endpoints for CORS.

This is not the case. I thought maybe there could be some very convoluted way to get the endpoint URL dynamically using the Google SDK, but there was a race condition in collecting all the needed endpoints, since the services start at different times.

Again, I could not wildcard on part of the domain, but honestly that's a bad practice anyway, since I don't need every Google Cloud Run endpoint URL allowed to bypass CORS.

So that meant I would need a custom domain, and so we'd need a GCP Load Balancer.
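
To make the "full domains only" approach concrete, here is a sketch of exact-match origin checking in plain Ruby. It is not the sinatra-cors API, just an illustration of the logic; the domains are placeholders and cors_headers is a hypothetical helper.

```ruby
require 'set'

# Exact-match CORS allowlist. The domains are hypothetical placeholders.
ALLOWED_ORIGINS = Set[
  'https://app.example.com',
  'https://api.example.com'
].freeze

# Returns the CORS response headers for a request's Origin header,
# or an empty hash when the origin is not on the allowlist.
def cors_headers(origin, allowed = ALLOWED_ORIGINS)
  return {} unless origin && allowed.include?(origin)

  {
    'Access-Control-Allow-Origin' => origin,
    'Vary' => 'Origin'
  }
end

p cors_headers('https://app.example.com')
p cors_headers('https://anything.a.run.app') # not on the list
```

With a stable custom domain the list stays small and fixed, which is the reason the generated Cloud Run URLs were a poor fit.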

Custom Domain with Load Balancer

So I already had my domain registered on Amazon Route53.

Google Cloud can generate an SSL certificate for you with Google-managed SSL. While I attempted to point an A record to the GCP Load Balancer with the hosted zone still managed by Amazon Route53, the SSL certificate was failing to provision.

I don't know if this is what solved it, but I updated the domain's name servers to Google's and then used Cloud DNS to point an A record at the GCP Load Balancer, and this worked to generate a Google-managed SSL certificate.

I found that it took 30 minutes for the SSL certificate to generate, but you have to remember to point the A record to the load balancer or it will say it failed. You don't need to restart the certificate process; it will figure it out after a few minutes.
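
Before waiting on the certificate, it can help to confirm that the A record really resolves to the load balancer. A minimal sketch using Resolv from Ruby's standard library follows; the helper name, the domain and the IP are placeholders, and the resolver argument is only there so the lookup can be stubbed.

```ruby
require 'resolv'

# Returns true when the domain's addresses include the load balancer IP.
# The resolver argument exists so the lookup can be stubbed; by default it
# uses Resolv.getaddresses from Ruby's standard library.
def points_at_load_balancer?(domain, lb_ip, resolver: Resolv.method(:getaddresses))
  resolver.call(domain).include?(lb_ip)
end

# Example with placeholder values (203.0.113.10 is a documentation address):
#   points_at_load_balancer?('api.example.com', '203.0.113.10')
```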

Provisioning the GCP Load Balancer with Terraform

There is a GCP Terraform module for setting up a load balancer and all of its components.

There are examples on the Terraform Registry website, but there were different versions, so I had to find an example for version 5.1 that covered multiple backends with path routing.

I don't know if this module is managed by Terraform or GCP. It could use more documentation, but I could figure things out for the most part by just looking at various examples.

I forgot this option when piecing different examples together:

  create_url_map    = false

And so this caused there to be two load balancers, one pointing to just my API and another with both my backends but no frontends.

Phantom CORS issues

I thought I had the CORS issues behind me, but they started happening again.

We eventually discovered that the MONGO_ATLAS_URL environment variable was not being set. Why this threw a CORS issue, I don't know, but once we ensured the env var was being set and passed to the container, there were no more CORS issues.
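
One plausible explanation (an assumption, not something verified in this project) is that the request failed server-side before any CORS headers were attached, which a browser reports as a CORS error. Either way, failing fast at boot makes a missing variable obvious. Here is a sketch; require_env! is a hypothetical helper, while MONGO_ATLAS_URL is the variable named above.

```ruby
# Fails at boot when a required environment variable is missing or blank,
# instead of surfacing later as a confusing error in the browser.
# require_env! is a hypothetical helper for illustration.
def require_env!(names, env = ENV)
  missing = names.select { |name| env[name].to_s.strip.empty? }
  raise KeyError, "missing env vars: #{missing.join(', ')}" unless missing.empty?

  names.to_h { |name| [name, env[name]] }
end

# At the top of the app's entry point one might call:
#   config = require_env!(%w[MONGO_ATLAS_URL])
```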

MongoDB Local Development Skipped

I was thinking of running a local MongoDB container in my docker-compose.yml before using MongoDB Atlas directly.

However I realized we would have to use a direct connection to a MongoDB Atlas database because (as far as I know) there is no local container that runs these more exotic features of MongoDB Atlas such as search.

MongoDB Atlas UI

It was straightforward.

I did have to hunt down in the UI, Database Access to reset the database user password.

For Network Access, I wasn't sure if I had to whitelist the GCP Load Balancer (would the Cloud Run containers' IP address be the GCP Load Balancer IP address?), so instead I just chose Allow Access from Anywhere.

MongoDB Atlas Search

Before you can create an index for search you need to populate data in your database.

To set up the index I just clicked straight through all the configuration options.

The documentation for MongoDB Atlas Search for Ruby was incomplete (according to Bayko, my co-founder), so I guess he had to look at the API specification or dig through the MongoDB driver.

He had to use the $search stage along with the aggregate method:

      # Runs an Atlas Search text query against the 'message' field
      # and returns up to 10 matching documents as an array.
      def search_document(collection, query)
        pipeline = [{
          '$search': {
            'index': 'default',
            'text': {
              'query': query,
              'path': 'message'
            }
          }
        },{
          '$limit': 10
        }]
        collection.aggregate(pipeline).to_a
      end

Data Modelling for MongoDB

Uhh... We didn't have to do anything special?

With DynamoDB even for the simplest of tables I have to plan GSI, LSI, the partition key and sort key.

With MongoDB we just dumped data in, and it worked. No thought or plan involved.

Maybe if we had replies, followerships, shares, things like that then Data Modeling would have been something we would have had to consider more.

GCP Load Balancer "Classic"

Using the GCP Terraform Module deploys the Classic version of the GCP Load Balancer. I couldn't figure out the difference between Classic and the current LB.

I don't think I want to be using Classic but I didn't want to spend the time setting up all the individual resources in the terraform file.

  • Classic in AWS for Load Balancers is something not recommended for use anymore
  • Classic in Azure for Load Balancers is just a lighter alternative to Azure Front Door

In GCP I don't know, but GCP does like to sunset their products, so I don't really want to be on Classic lol.

Conclusion

The easiest part of this entire project was MongoDB Atlas and the MongoDB API.

So much time was taken up just deploying and troubleshooting multiple containers and the Cloud Service Provider's (CSP's) cloud infrastructure.

But honestly that's the story for any CSP, whether it's AWS, Azure or GCP.

GCP Cloud Run by far is the easiest serverless container offering from a 1st tier CSP.

GCP is really great, their docs just need a bit of work, and their Terraform modules need a bit of love.

MongoDB gets DX right, as always.
