AI for Web Devs: Deploying Your AI App to Production

Austin Gil - Feb 7 - - Dev Community

Welcome back to the series where we have been building an application with Qwik that incorporates AI tooling from OpenAI. So far we’ve created a pretty cool app that uses AI to generate text and images.

  1. Intro & Setup

  2. Your First AI Prompt

  3. Streaming Responses

  4. How Does AI Work

  5. Prompt Engineering

  6. AI-Generated Images

  7. Security & Reliability

  8. Deploying

Now, there’s just one more thing to do. It’s launch time!

I’ll be deploying to Akamai‘s cloud computing services (formerly Linode), but these steps should work with any VPS provider. If you don’t already have a hosting provider, you can sign up at linode.com/austingil to get $100 in cloud computing credits.

Let’s do this!

Setup Runtime Adapter

There are a couple of things we need to get out of the way first: deciding where we are going to run our app, what runtime it will run in, and how the deployment pipeline should look.

As I mentioned before, I’ll be deploying to a VPS in Akamai’s connected cloud, but any other VPS should work. For the runtime, I’ll be using Node.js, and I’ll keep the deployment simple by using Git.

Qwik is cool because it’s designed to run in multiple JavaScript runtimes. That’s handy, but it also means that our code isn’t ready to run in production as is. Qwik needs to be aware of its runtime environment, which we can do with adapters.

We can see and install the available adapters with the command npm run qwik add. This will prompt us with several options for adapters, integrations, and plugins.

The resulting screen from the `npm run qwik add` command, showing the list of integrations.

For my case, I’ll go down and select the Fastify adapter. It works well on a VPS running Node.js. You can select a different target if you prefer.

Once you select your integration, the terminal will show you the changes it’s about to make and prompt you to confirm. You’ll see that it wants to modify some files, create some new ones, install dependencies, and add some new NPM scripts. Make sure you’re comfortable with these changes before confirming.

Qwik CLI screen showing all the proposed changes and asking for approval

Once these changes are installed, your app will have what it needs to run in production. You can test this by building the production assets and running the serve command. (Note: For some reason, npm run build always hangs for me, so I run the client and server build scripts separately).

npm run build.client && npm run build.server && npm run serve

This will build out our production assets and start the production server listening for requests at http://localhost:3000. If all goes well, you should be able to open that URL in your browser and see your app there. It won’t actually work because it’s missing the OpenAI API keys, but we’ll sort that part out on the production server.

Push Changes to Git Repo

As mentioned above, this deployment process is going to be focused on simplicity, not automation. So rather than introducing more complex tooling like Docker containers or Kubernetes, we’ll stick to a simpler, but more manual process: using Git to deploy our code.

I’ll assume you already have some familiarity with Git, and a remote repo you can push to. If not, please go make one now.

You’ll need to commit your changes and push them to your repo.

git commit -am "ready to commit 💍" && git push origin main

Prepare Production Server

If you already have a VPS ready, feel free to skip this section. I’ll be deploying to an Akamai VPS. If you don’t already have an account, feel free to sign up at linode.com/austingil for $100 in free credits.

I won’t walk through the step-by-step process for setting up a server, but in case you’re interested, I chose the Nanode 1 GB shared CPU plan for $5/month with the following specs:

  • Operating system: Ubuntu 22.04 LTS

  • Location: Seattle, WA

  • CPU: 1

  • RAM: 1 GB

  • Storage: 25 GB

  • Transfer: 1 TB

Choosing different specs shouldn’t make a difference when it comes to running your app, although some of the commands to install any dependencies may be different. If you’ve never done this before, then try to match what I have above. You can even use a different provider, as long as you’re deploying to a server to which you have SSH access.

Once you have your server provisioned and running, you should have a public IP address that looks something like 172.100.100.200. You can log into the server from your terminal with the following command:

ssh root@172.100.100.200

You’ll have to provide the root password if you have not already set up an authorized key.

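We’ll use Git as a convenient tool to get our code from our repo into our server, so that will need to be installed. But before we do that, I always recommend refreshing the package lists so we install the latest available version. We can do the refresh and the installation with the following command. (Note that apt update only refreshes the lists; to upgrade the software that is already installed, you would also run sudo apt upgrade.)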

sudo apt update && sudo apt install git -y

Our server also needs Node.js to run our app. We could install the binary directly, but I prefer to use a tool called NVM, which allows us to easily manage Node versions. We can install it with this command:

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash

And once NVM is installed, you can install the latest version of Node with:

nvm install node

Note that the terminal may say that NVM is not installed. If you exit the server and sign back in, it should work.
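
If you would rather not log out and back in, a minimal sketch like the one below loads NVM into the current session. It assumes NVM was installed to its default location, ~/.nvm; if your install went somewhere else, adjust the path.

```shell
# Assumption: NVM lives in the default install folder, ~/.nvm
export NVM_DIR="$HOME/.nvm"

# Load nvm into this shell only if the script is actually there
if [ -s "$NVM_DIR/nvm.sh" ]; then
  . "$NVM_DIR/nvm.sh"
fi
```

After that, nvm install node should work in the same terminal.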

Upload, Build, & Run App

With our server set up, it’s time to get our code installed. With Git, it’s relatively easy. We can copy our code into our server using the clone command. You’ll want to use your own repo, but it should look something like this:

git clone https://github.com/AustinGil/versus.git

Our source code is now on the server, but it’s still not quite ready to run. We still need to install the NPM dependencies, build the production assets, and provide any environment variables.

Let’s do it!

First, navigate to the folder where you just cloned the project. I used:

cd versus

The install is easy enough:

npm install

The build command is:

npm run build

However, if you have any type-checking or linting errors, it will hang there. You can either fix the errors (which you probably should) or bypass them and build anyway with this:

npm run build.client && npm run build.server

The latest version of the project source code has working types if you want to check that.

The last step is a bit tricky. As we saw above, environment variables will not be injected from the .env file when running the production app. Instead, we can provide them at runtime right before the serve command like this:

OPENAI_API_KEY=your_api_key npm run serve

You’ll want to provide your own API key there in order for the OpenAI requests to work.
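
If typing the key inline every time gets tedious, one option is to keep it in a file on the server and load that file into the shell before starting the app. This is only a sketch: load_env_file is a name I made up for illustration, and it assumes the file contains plain KEY=value lines that the shell can read as-is (no spaces around the equals sign).

```shell
# Hypothetical helper: export every KEY=value line from a file
load_env_file() {
  env_file="${1:-./.env}"   # assumed default location of the file
  if [ ! -f "$env_file" ]; then
    echo "No env file found at $env_file" >&2
    return 1
  fi
  set -a          # export every variable assigned from here on
  . "$env_file"   # run the file's KEY=value lines in this shell
  set +a          # stop auto-exporting
}
```

You could then run load_env_file ./.env && npm run serve from the project folder. Since the file holds secrets, keep it out of your Git repo.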

Also, for Node.js deployments, there’s an extra, necessary step. You must also set an ORIGIN variable assigned to the full URL where the app will be running. Qwik needs this information to properly configure its CSRF protection.

If you don’t know the URL, you can disable this feature in the /src/entry.preview.tsx file by setting the createQwikCity options property checkOrigin to false:

export default createQwikCity({
  render,
  qwikCityPlan,
  checkOrigin: false
});

This process is outlined in more detail in the docs, but disabling the check is not recommended, as CSRF attacks can be quite dangerous. You’ll need a URL to deploy the app anyway, so it’s better to just set the ORIGIN environment variable. Note that if you make this change, you’ll want to redeploy and rerun the build and serve commands.
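
To catch a missing variable before the server starts, you could put a small guard in front of the serve command. The require_env helper below is hypothetical (it is not part of Qwik or npm); it just checks that each variable you name is set to something.

```shell
# Hypothetical helper: fail early if a required variable is missing or empty
require_env() {
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required environment variable: $name" >&2
      return 1
    fi
  done
}
```

With your own key and URL in place of the placeholders, usage might look like export OPENAI_API_KEY=your_api_key ORIGIN=https://example.com followed by require_env OPENAI_API_KEY ORIGIN && npm run serve.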

If everything is configured correctly and running, you should start seeing the logs from Fastify in the terminal, confirming that the app is up and running.

{"level":30,"time":1703810454465,"pid":23834,"hostname":"localhost","msg":"Server listening at http://[::1]:3000"}

Unfortunately, accessing the app via IP address and port number doesn’t show the app (at least not for me). The log above offers a clue: the server is listening on [::1], a loopback address that only accepts connections from the server itself. Either way, this will be solved in the next section, where we run our app at the root domain.

The Missing Steps

Technically, the app is deployed, built, and running, but in my opinion, it leaves a lot to be desired before we can call it “production ready.” Some tutorials would assume you know how to do the rest, but I don’t want to do you like that. We’re going to cover:

  • Running the app in background mode

  • Restarting the app if the server crashes

  • Accessing the app at the root domain

  • Setting up an SSL certificate

One thing you will need to do for yourself is buy the domain name. There are lots of good places. I’ve been a fan of Porkbun and Namesilo. I don’t think there’s a huge difference for which registrar you use, but I like these because they offer WHOIS privacy and email forwarding at no extra charge on top of their already low prices.

Before we do anything else on the server, it’ll be a good idea to point your domain name’s A record (@) to the server’s IP address. Doing this sooner can help with propagation times.
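
To see whether the record has propagated, you can compare what DNS returns with your server’s IP. Here is a rough sketch using dig, which may need to be installed separately on some systems. The dns_points_to function name is made up, and the domain and IP address you pass in are your own.

```shell
# Hypothetical helper: succeed if the domain's A record matches the expected IP
dns_points_to() {
  domain="$1"
  expected_ip="$2"
  # dig +short prints one address per line; grep -x requires a whole-line match
  dig +short A "$domain" | grep -qxF "$expected_ip"
}
```

For example, dns_points_to example.com 172.100.100.200 && echo "DNS is ready" would print the message only once the record resolves to that address.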

Now, back in the server, there’s one glaring issue we need to deal with first. When we run the npm run serve command, our app will run as long as we keep the terminal open. Obviously, it would be nice to exit out of the server, close our terminal, and walk away from our computer to go eat pizza without the app crashing. So we’ll want to run that command in the background.

There are plenty of ways to accomplish this: Docker, Kubernetes, Pulumi, etc., but I don’t like to add too much complexity. So for a basic app, I like to use PM2, a Node.js process manager with great features, including the ability to run our app in the background.

From inside your server, run this command to install PM2 as a global NPM module:

npm install -g pm2

Once it’s installed, we can tell PM2 what command to run with the “start” command:

pm2 start "npm run serve"

PM2 has a lot of really nice features in addition to running our apps in the background. One thing you’ll want to be aware of is the command to view logs from your app:

pm2 logs

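In addition to running our app in the background, PM2 can also be configured to start our processes again when the server reboots after a crash. This is super helpful to avoid downtime. You can set that up with the command below. PM2 may print a follow-up command for you to copy and run, and you’ll also want to run pm2 save afterwards so the current process list is the one that gets restored.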

pm2 startup

Ok, our app is now running, and will continue to run after a server restart. Great!

But we still can’t get to it. Lol!

My preferred solution is using Caddy. It will resolve the networking issues, work as a great reverse proxy, and take care of the whole SSL process for us. We can follow the install instructions from their documentation and run these five commands:

sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update
sudo apt install caddy

Once that’s done, you can go to your server’s IP address and you should see the default Caddy welcome page:

Default Caddy homepage with instructions on how to get started.

Progress!

In addition to showing us something is working, this page also gives us some handy information on how to work with Caddy.

Ideally, you’ve already pointed your domain name to the server’s IP address. Next, we’ll want to modify the Caddyfile:

sudo nano /etc/caddy/Caddyfile

As their instructions suggest, we’ll want to replace the :80 line with our domain (or subdomain), but instead of uploading static files or changing the site root, I want to remove (or comment out) the root line and enable the reverse_proxy line, pointing the reverse proxy to my Node.js app running at port 3000.

versus.austingil.com {
        reverse_proxy localhost:3000
}

After saving the file and reloading Caddy (systemctl reload caddy), the new Caddyfile changes should take effect. Note that it may take a few moments before the app is fully up and running. This is because one of Caddy’s features is to provision a new SSL certificate for the domain. It also sets up the automatic redirect from HTTP to HTTPS.
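
A typo in the Caddyfile can keep the reload from succeeding, so it can be worth validating the file first. Here is a sketch, assuming the default Caddyfile path used above; reload_caddy is just a name I picked.

```shell
# Hypothetical helper: only reload Caddy if the config file parses cleanly
reload_caddy() {
  caddyfile="${1:-/etc/caddy/Caddyfile}"   # assumed default path
  caddy validate --config "$caddyfile" --adapter caddyfile &&
    sudo systemctl reload caddy
}
```

Running reload_caddy with no arguments checks /etc/caddy/Caddyfile and reloads the service only if validation passes.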

So now if you go to your domain (or subdomain), you should be redirected to the HTTPS version, with a reverse proxy running in front of your generative AI application, which is resilient to server crashes.

How awesome is that!?

Using PM2 we can also enable some load-balancing in case you’re running a server with multiple cores. The full PM2 command including environment variables and load-balancing might look something like this:

OPENAI_API_KEY=your_api_key ORIGIN=https://example.com pm2 start "npm run serve" -i max

Note that you may need to remove the current instance from PM2 and rerun the start command. You don’t have to restart the Caddy process unless you change the Caddyfile. Any changes to the Node.js source code will require a rebuild before running it again.
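
Since every code change means pulling, rebuilding, and restarting, it may help to wrap those steps in one function. This is a sketch under a few assumptions: the project was cloned to ~/versus, the branch is main, and the app is the only process PM2 is managing (hence restart all). The redeploy name is made up.

```shell
# Hypothetical helper: pull the latest code, rebuild, and restart the app
redeploy() {
  app_dir="${1:-$HOME/versus}"   # assumed clone location
  cd "$app_dir" || return 1
  # Each step runs only if the previous one succeeded
  git pull origin main &&
    npm install &&
    npm run build.client &&
    npm run build.server &&
    pm2 restart all
}
```

Calling redeploy on the server after pushing to your repo should bring the running app up to date. If a build step fails, the restart is skipped and the old version keeps running.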

Hell yeah! We did it!

Alright, that’s it for this blog post, and this series. I sincerely hope you enjoyed both and learned some cool things. Today, we covered a lot of things you need to know to deploy an AI-powered application:

  • Runtime adapters

  • Building for production

  • Environment variables

  • Process managers

  • Reverse-proxies

  • SSL certificates

If you missed any of the previous posts, be sure to go back and check them out.

  1. Intro & Setup

  2. Your First AI Prompt

  3. Streaming Responses

  4. How Does AI Work

  5. Prompt Engineering

  6. AI-Generated Images

  7. Security & Reliability

  8. Deploying

I’d love to know what you thought about the whole series. If you want, you can play with the app I built at versus.austingil.com. Let me know if you deployed your own app, and I’ll link to it from here. Also, if you have ideas for topics you’d like me to discuss in the future I’d love to hear them :)

UPDATE: If you liked this project and are curious to see what it might look like as a SvelteKit app, check out this blog post by Tim Smith where he converts this existing app over.

Thank you so much for reading. If you liked this article, and want to support me, the best ways to do so are to share it, sign up for my newsletter, and follow me on Twitter.


Originally published on austingil.com.
