AI experiences are spreading across the web with an adoption curve unlike any technology we’ve seen in years. AI-driven experiences have raised the bar for UX expectations and are opening new frontiers for serving customers. Traditional static designs and rigid input experiences are becoming a sign of an outdated website. As we help our millions of developers embrace these new capabilities, this guide will start your journey of building AI experiences on Netlify.
TL;DR
We’re going to discuss the high-level approaches to building AI experiences on Netlify. You’ll walk away knowing how to start your AI journey, what parts of AI workloads Netlify can support, and what to consider to keep things fast and secure. Along the way, we’ll step through some examples that you can code along with to get started.
I got 99 problems and AI should solve some
The first consideration in this space is to ensure you have a problem that’s right for AI to solve. Of course, if you’re just exploring the tech and APIs, this isn’t necessary. But if you’re building for your business/community, it’s important to start with the problem. AI is part of the solution (to a lot of problems), but starting from the solution instead of the right problem makes everything harder. Starting with the right problem allows you to narrow in on which tools to use, provides clearer ways to evaluate the solutions, and reduces the risk of forcing a solution onto the wrong problem.
Here are some usecases/problems our customers solve with AI on Netlify today.
- AI-driven chat agents to help their customer workloads
- Generative UI to provide more personalized experiences
- AI Agents/Assistants trained to solve key tasks on behalf of users
- Publishing/hosting websites/artifacts generated by AI tools
- Semantic search of knowledgebases to make it easier to get insights and solve issues
- Storing and processing the data needed to create embeddings and push to vector databases
- Generating or modifying images, content, etc. to personalize their customer experiences
Did you know?
Cognition Labs, creators of the AI Software Engineer, Devin, needed a way to run, host, and validate the web experiences created by their AI engineer. The team chose Netlify to manage all of the web infrastructure that Devin produces during its development.
Composable AI stacks
With a usecase and concept in mind, you’ll want to get to building on your stack as quickly as possible. Netlify’s composable platform lets your team decide the right combination of tools to use to achieve your goals.
UI / Framework stack
You probably already have a framework of choice and a website you’re building on. If you don’t, here are some templates to get you started with a site on Netlify.
- Remix
- Next.js (with App Router)
- … check out our docs for your UI framework of choice
For most usecases, AI solutions tend to be a server/backend concern. It’s often treated as an API layer separate from your frontend stack or integrated into it. That is to say, like your site’s other server concerns, your AI layer will usually look the same and sit next to the rest.
For example, if you’re building Generative UI via React Server Components, you are already integrating your server logic next to your components. Most websites today use a more trigger-oriented flow: the user interacts with the UI, an API call happens, and the UI is updated to reflect the information. Both approaches are supported on Netlify, so you can use the approach that fits your current site.
Any Model. Any AI Pattern. Any AI Framework.
Within your website architecture, you can decide which model and provider you would like to use. Netlify’s serverless runtime allows you to start working with AI without worrying about the infrastructure that powers it.
Access any AI model from a provider of your choosing. Here are some that might fit your usecases.
- OpenAI - powerful proprietary models
- Groq - extremely fast LLM inference
- HuggingFace - extensive library of open models
- Anthropic - exceptional proprietary models
- OctoAI - containerized access to models and inference stack
- Google’s Vertex AI - powerful proprietary and open model access
- countless more!
Depending on your usecases, you’ll employ various AI patterns, whether that’s basic direct inference, RAG (Retrieval Augmented Generation), or something more complex such as knowledge graphs. These patterns combine compute, data, and inference to achieve results, and Netlify is the glue and runtime to make this happen.
As your usecases get more complex, it’s a good idea to start reaching for a purpose-built AI framework to simplify what you need to understand.
- LangChain - A complete framework for orchestrating different patterns, normalizing provider differences, and making it easy to build and maintain AI systems without needing to know the nuances.
- AI SDK - Normalizing the responses from AI tools, common utilities, and prebuilt usecase methods.
- LlamaIndex - A data framework for orchestrating the data workloads used in your AI systems.
If you need a vector database for more advanced patterns, Netlify supports connecting to all of the major providers such as Pinecone, PlanetScale, Supabase, MongoDB, Neo4j, etc. Just select the tools that fit your needs.
Using Anthropic for product descriptions
Let’s apply some of this approach to see how simple it is to build AI experiences on Netlify. If we needed an API that generates descriptions of products, how would we build that? In this example, we will create an AI endpoint that runs on the edge and uses Anthropic’s Claude model to generate product descriptions.
Step 1
Set up an Anthropic account and create an API key.
Step 2
In your site settings, create an environment variable named ANTHROPIC_API_KEY.
Step 3
In your codebase, create a Netlify Edge Function that will be your endpoint. In our case, the API path will be /product-description and we can send a productSummary query parameter.
import Anthropic from "https://esm.sh/@anthropic-ai/sdk@0.20.4";

export default async (request: Request) => {
  // Read the product summary from the query string
  const productSummary = new URL(request.url).searchParams.get("productSummary");
  if (!productSummary) {
    // A missing summary is a client error, so don't report success
    return new Response("Missing productSummary query parameter", { status: 400 });
  }
  const client = new Anthropic({ apiKey: Netlify.env.get("ANTHROPIC_API_KEY") });
  const chatCompletion = await client.messages.create({
    model: "claude-3-opus-20240229",
    max_tokens: 1024,
    system:
      "You're an ecommerce store. I need you to make the best description possible with the information I provide.",
    messages: [{ role: "user", content: productSummary }],
  });
  // The generated text is in the first content block of the response
  return new Response(chatCompletion?.content?.[0]?.text);
};

export const config = {
  path: "/product-description",
};
Step 4
Your API is ready. Now we need to call it. Here’s some basic HTML/JS to see this working.
<input type="text" placeholder="Info about the product" id="summary" />
<button id="generate-button">Generate Product Description</button>
<div id="desc-target"></div>
<script>
  const summary = document.getElementById("summary");
  const generateBtn = document.getElementById("generate-button");
  const description = document.getElementById("desc-target");
  generateBtn.addEventListener("click", async () => {
    description.innerText = "generating description...";
    // Encode the input so characters like "&" and "#" survive the query string
    const descResp = await fetch(`/product-description?productSummary=${encodeURIComponent(summary.value)}`);
    const desc = await descResp.text();
    description.innerText = desc;
  });
</script>
Done! The more information and specs you provide about the product, the better the results.
🛹 Prompt: “Eric Koston Girl Basic Logo Skatebord”
Generated: “Ride in style with the Eric Koston Girl Basic Logo Skateboard. This high-quality skateboard features a sleek and minimalist design, perfect for showcasing your love for the iconic Girl Skateboards brand. The deck is made from durable 7-ply maple wood, providing a smooth and responsive ride for skaters of all skill levels.
The top of the deck features Girl's classic logo in bold, black text against a clean white background. The understated design is a testament to Eric Koston's timeless style and the brand's commitment to simplicity and functionality.
The skateboard measures 8.25 inches wide and 31.85 inches long, making it suitable for a wide range of skating styles and techniques. The medium concave shape offers excellent control and stability, while the slightly tapered tail and nose provide ample pop for ollies and other tricks.
The Eric Koston Girl Basic Logo Skateboard comes fully assembled and ready to ride, with high-quality trucks, wheels, and bearings that ensure a smooth and responsive experience. Whether you're cruising the streets or hitting the skate park, this board is built to perform.
As a signature model from one of the most influential skaters of all time, the Eric Koston Girl Basic Logo Skateboard is a must-have for any serious skateboarder or collector. Don't miss your chance to own a piece of skateboarding history and elevate your riding style with this iconic deck.”
Using OpenAI with tool/function calling
Let’s try something a little more advanced using a different tool. In this example, we want to provide an experience to help our customers who are traveling. One way to help is suggesting what clothes to pack for a given travel location and expected weather. Inference models do not have up-to-date information so they need a way to get this information to continue the request.
To achieve this, we will create an endpoint that defines a “tool” that must be called to determine what to do next. With this feature, OpenAI’s model is informed that, when certain questions are asked, it should consult the list of tools for what to do next. In this case, it will call this function when it needs to dynamically look up the weather.
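If the tool-calling flow feels abstract, here is a rough picture of the loop a tool runner performs on your behalf. This is a conceptual sketch only, not how the OpenAI SDK is implemented: the types, the `runToolLoop` helper, and the stub model are all hypothetical, and the loop is synchronous to keep it short (real inference calls are async).

```typescript
// Conceptual sketch: hypothetical types, not part of the OpenAI SDK.
type ToolCall = { kind: "tool_call"; name: string; args: Record<string, string> };
type FinalAnswer = { kind: "final"; content: string };
type ModelTurn = ToolCall | FinalAnswer;

type Tool = (args: Record<string, string>) => Record<string, string>;
type Model = (transcript: string[]) => ModelTurn;

// Ask the model, run any tool it requests, feed the result back, repeat.
function runToolLoop(
  model: Model,
  tools: Record<string, Tool>,
  prompt: string,
  maxTurns = 5,
): string {
  const transcript: string[] = [`user: ${prompt}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = model(transcript);
    if (reply.kind === "final") return reply.content;
    const tool = tools[reply.name];
    if (!tool) throw new Error(`Model requested unknown tool "${reply.name}"`);
    // Append the tool result so the model can use it on the next turn
    transcript.push(`tool ${reply.name}: ${JSON.stringify(tool(reply.args))}`);
  }
  throw new Error("Tool loop did not finish within maxTurns");
}

// Stub model: requests the weather first, then answers once it has a result.
const stubModel: Model = (transcript) => {
  const prefix = "tool getWeather: ";
  const weather = transcript.find((line) => line.startsWith(prefix));
  if (!weather) {
    return { kind: "tool_call", name: "getWeather", args: { location: "Atlanta" } };
  }
  return {
    kind: "final",
    content: `Pack light clothes. Weather data: ${weather.slice(prefix.length)}`,
  };
};

const stubTools: Record<string, Tool> = {
  getWeather: () => ({ temperature: "98F", precipitation: "humid" }),
};

const answer = runToolLoop(stubModel, stubTools, "I'm traveling to Atlanta");
console.log(answer);
```

The `maxTurns` guard is a design choice worth keeping in any real version of this loop, since it stops a confused model from requesting tools forever.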
Step 1
Set up an OpenAI account and create an API key.
Step 2
In your site settings, create an environment variable named OPENAI_API_KEY.
Step 3
Create your function. Below we use OpenAI’s runTools function, which processes the prompt, selects the right tool, and uses the result to complete the prompting.
import OpenAI from "https://esm.sh/openai@4.33.0";

export default async (request: Request) => {
  const locationInput = new URL(request.url).searchParams.get("location");
  if (!locationInput) {
    return new Response("Where are you traveling to?");
  }
  const client = new OpenAI({ apiKey: Netlify.env.get("OPENAI_API_KEY") });
  const toolRunner = await client.beta.chat.completions.runTools({
    messages: [
      {
        role: "system",
        content:
          "You're a helpful travel agent informing the user what clothes they should pack. When deciding what clothes to offer, you will be informed of the travel destination. Ensure that the clothes are appropriate for the weather.",
      },
      { role: "user", content: `I'm traveling to "${locationInput}"` },
    ],
    tools: [
      {
        type: "function",
        function: {
          function: getWeather,
          parse: JSON.parse,
          parameters: {
            type: "object",
            properties: {
              location: { type: "string" },
            },
          },
        },
      },
    ],
    model: "gpt-4-turbo",
  });
  const suggestion = await toolRunner.finalContent();
  return new Response(suggestion);
};

// called automatically when the model believes it needs to get the weather
async function getWeather(args: { location: string }) {
  const { location } = args;
  // call weather API
  return { temperature: "98F", precipitation: "humid" }; // it's hot everywhere :)
}

export const config = {
  path: "/packing-suggestions",
};
Step 4
Let’s start using the API from the website.
<input type="text" placeholder="Location of travel" id="location" />
<button id="generate-button">Suggest clothes for travel</button>
<div id="suggestion-target"></div>
<script>
  const locationInput = document.getElementById("location");
  const generateBtn = document.getElementById("generate-button");
  const suggestionTarget = document.getElementById("suggestion-target");
  generateBtn.addEventListener("click", async () => {
    suggestionTarget.innerText = "generating packing suggestions...";
    // Encode the input so characters like "&" and "#" survive the query string
    const resp = await fetch(`/packing-suggestions?location=${encodeURIComponent(locationInput.value)}`);
    const suggestion = await resp.text();
    suggestionTarget.innerText = suggestion;
  });
</script>
Now you have a simple endpoint for helping your users decide how to prepare for their trip.
☀️ “For your trip to Atlanta, Georgia, where the current temperature is 98°F and it's quite humid, it would be best to pack lightweight and breathable clothing. Here are some suggestions:
Light Cotton Shirts/Tops: Help you stay cool.
Shorts or Loose Pants: Opt for breathable materials.
A Wide-Brimmed Hat: To protect from the sun.
Sunglasses and Sunscreen: Essential for sun protection.
Comfortable Sandals or Breathable Shoes: Keep your feet cool.
A Lightweight Jacket or Sweater: For cooler indoor environments due to air conditioning.
Make sure you stay hydrated and take breaks in the shade or indoors as it seems quite hot and humid out there!”
Advanced workflows
These examples are just the beginning of what can be achieved on Netlify. Customers are building out very complex background data processing workflows on Netlify with ease thanks to the automatic scaling and managed infrastructure. Retrieval augmented generation (RAG) is as easy as selecting a retrieval data store (e.g. Pinecone), using async functions to populate the database, and making the retrieval and inference calls within your endpoints. Combine this with framework techniques like generative UI and you have some amazing opportunities!
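To make the retrieval step of RAG more concrete, here is a minimal in-memory sketch. It assumes embeddings have already been computed; in a real system they would come from an embedding model and live in a vector database such as Pinecone. The `Chunk` shape, the helper names, and the toy three-dimensional vectors are all made up for illustration.

```typescript
// Hypothetical shape of a stored chunk: text plus its precomputed embedding.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: 1 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embeddings must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Retrieval: rank every chunk against the query and keep the top k.
function retrieve(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Augmentation: place the retrieved text in the prompt sent for inference.
function buildPrompt(question: string, context: Chunk[]): string {
  const sources = context.map((c) => `[${c.id}] ${c.text}`).join("\n");
  return `Answer using only the sources below.\n\nSources:\n${sources}\n\nQuestion: ${question}`;
}

// Toy data for illustration only.
const knowledgeBase: Chunk[] = [
  { id: "returns", text: "Items can be returned within 30 days.", embedding: [0.9, 0.1, 0.0] },
  { id: "shipping", text: "Orders ship within 2 business days.", embedding: [0.1, 0.9, 0.0] },
  { id: "warranty", text: "Decks carry a 1-year warranty.", embedding: [0.0, 0.2, 0.9] },
];
const queryEmbedding = [0.8, 0.2, 0.1]; // stands in for the embedded user question
const topChunks = retrieve(queryEmbedding, knowledgeBase, 2);
const ragPrompt = buildPrompt("What is the return policy?", topChunks);
console.log(ragPrompt);
```

The resulting prompt would then be passed to an inference call like the ones in the examples above. A vector database replaces the linear scan in `retrieve` once the knowledge base grows.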
Securing AI endpoints
As you build out your web experiences, you need to ensure bad actors are not able to abuse your endpoints causing outages or increased provider billing.
Here are some basic initial steps to ensure you’re building in a secure manner:
- Store API keys as environment variables. Marking them as secrets will add more protections. Use the Netlify UI, API, or CLI to add these environment variables and avoid committing them to your source code.
- Create different API keys for your different deploy contexts. Production should have a different API key than development or pre-production. This reduces the chance of production API keys being exposed through development environments on your local machines.
- Treat data sent to your endpoints and generated data as unsafe input. Ensure the expected data is being provided to your endpoints and reject anything else. Generated content from model inference can be erroneous or unsafe. If you’re using generated content elsewhere in your system (such as putting it in a DB, using it in queries, presenting it on the UI), you should treat it the same way as user content and protect your system from dangerous values.
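As a rough sketch of the last point, the helpers below validate the query string for the product description endpoint and escape generated text before it is placed into HTML. The function names and the 500-character limit are assumptions for illustration, not part of any Netlify API.

```typescript
const MAX_SUMMARY_LENGTH = 500; // assumed limit; tune it for your product data

type Validation =
  | { ok: true; value: string }
  | { ok: false; status: number; error: string };

// Accept only the one parameter the endpoint expects and reject everything else.
function validateProductSummary(requestUrl: string): Validation {
  const params = new URL(requestUrl).searchParams;
  const unexpected: string[] = [];
  params.forEach((_value, key) => {
    if (key !== "productSummary") unexpected.push(key);
  });
  if (unexpected.length > 0) {
    return { ok: false, status: 400, error: `Unexpected parameter: ${unexpected[0]}` };
  }
  const value = (params.get("productSummary") ?? "").trim();
  if (value.length === 0) {
    return { ok: false, status: 400, error: "productSummary is required" };
  }
  if (value.length > MAX_SUMMARY_LENGTH) {
    return { ok: false, status: 400, error: "productSummary is too long" };
  }
  return { ok: true, value };
}

// Escape model output before inserting it into HTML, exactly as you would
// for user-generated content.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const checked = validateProductSummary(
  "https://example.com/product-description?productSummary=Girl%20deck",
);
console.log(checked);
console.log(escapeHtml('<img src=x onerror="alert(1)">'));
```

The front-end examples in this guide write results with innerText, which already avoids HTML injection. Escaping matters as soon as generated text is rendered into markup, for example in a server-rendered template.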
Controlling Access to your endpoints
Netlify has DDoS protections in place for all customers that constantly protect against global attacks. All sites are different and you know your users and usecases better than Netlify, so you can leverage Netlify’s capabilities to add more protections for who can access the site itself and to what degree they should be using it. Here are some ways of achieving that:
- Ensure the expected clients are calling your endpoints. This is the basic step of confirming the client is who you expect them to be, for example by checking that the Origin header matches your site. If you only expect authenticated users, you should check the session information as well.
- Enable access control protection for your site. When you create PRs and code branches, you’re usually creating preview environments to verify changes. This creates a URL for your team to access and test changes. These are extremely useful, but they rarely need to be publicly accessible. If you want to lock down who has access to these environments, you can enable password protections, IP requirements, or even require the end user to be someone on your Netlify team account to access the site.
- Add rate limits that match your expected traffic patterns. Netlify has very powerful rate limiting capabilities that can target broad sets of users or very granular areas of your site. With these capabilities, you can assess your AI endpoints and enforce rules based on your expectations of usage. For example, we do not expect our endpoint that suggests clothes to pack to be called many times by the same user within a short period of time. A rate limiting rule could enforce that no single device calls that endpoint more than 5 times a minute. Whether the cause is abuse or a bug in the site’s code, you’ll be protected from these endpoints being called too many times.
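Here is a small sketch of two of these ideas for the packing suggestions endpoint. The allow-list origin and the helper name are hypothetical. The `rateLimit` object reflects our understanding of Netlify’s code-based rate limiting rules, so treat the field names as an assumption and verify them against the current docs before relying on them.

```typescript
// Hypothetical allow-list; replace with the origins your site is served from.
const ALLOWED_ORIGINS = new Set(["https://example.netlify.app"]);

// Basic client check. Headers can be forged by non-browser clients, so this
// filters out casual misuse and does not replace authentication.
function isExpectedClient(headers: { get(name: string): string | null }): boolean {
  const origin = headers.get("origin");
  if (origin !== null) return ALLOWED_ORIGINS.has(origin);
  // Browsers commonly omit Origin on same-origin GET requests, so fall back
  // to the Fetch Metadata header that modern browsers send.
  return headers.get("sec-fetch-site") === "same-origin";
}

// Assumed config shape (verify against Netlify's rate limiting docs):
// at most 5 requests per 60-second window, counted per IP and domain.
// In a function file this object would be exported as `config`.
const rateLimitedConfig = {
  path: "/packing-suggestions",
  rateLimit: {
    windowLimit: 5,
    windowSize: 60,
    aggregateBy: ["ip", "domain"],
  },
};

// Inside a handler, the check could be used like this:
//   if (!isExpectedClient(request.headers)) {
//     return new Response("Forbidden", { status: 403 });
//   }
console.log(rateLimitedConfig.rateLimit);
```

If your endpoints are only for signed-in users, a session check would sit next to `isExpectedClient` in the handler.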
Performance optimization
A key challenge in building with AI tools is performance. If you’re expecting a model inference call to be as fast as a traditional database read, you’ll have a harsh awakening. Over time, we expect that inference providers are going to get faster but until then here are some opportunities to consider when creating more performant AI experiences.
- Caching results. The de facto initial perf consideration is deciding if a given response from your endpoint can be cached on the CDN or the browser for some period of time. Netlify has industry leading controls for explicit caching/purging rules for the CDN.
- Blob storage of results. This is another form of caching but, unlike browser/CDN cache, this storage would be global. Let’s say you built an AI experience that processes many large documents to compute a summary. If your end users are teams that could ask to process the same large documents, you would want to reuse the results of the initial processing. This is where Netlify Blobs is very powerful. They can be easily used to store/fetch data saving time and money for intensive AI calculations. In addition to performance, this approach is also useful for consistency of results. AI tools tend to generate slightly or significantly different results each time they are called with the same information - that can cause confusion when your users expect the same information.
- Pick the right model, then pick the right provider. As mentioned, Netlify’s platform allows teams to pick the right tool for the job. As you evaluate models, you might decide on a model that’s well supported across many platforms. If that’s the case, it’s a good opportunity to assess if you’ll have a faster experience with a different provider. For example, if you decide to use the open Llama 2 7B model from Meta, you’ll find it’s supported across many providers. Choosing a provider like Groq might provide substantial performance benefits because they are focused on high performance inference that includes that model.
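The first two ideas can be sketched together. Below, `cdnCacheHeaders` builds response headers that let the CDN hold a result while browsers revalidate, and `cachedSummary` is a cache-aside helper that reuses an earlier result for the same document. The `ResultStore` interface is a hypothetical stand-in; a Netlify Blobs store could back it, but confirm the method shapes in the Blobs docs. The hash is a simple 32-bit FNV-1a to keep the sketch dependency-free, and a production version should prefer a stronger hash such as SHA-256 to avoid key collisions.

```typescript
// Headers for a cacheable endpoint response: browsers always revalidate,
// while the CDN may keep the response for `cdnSeconds`.
function cdnCacheHeaders(cdnSeconds: number): Record<string, string> {
  return {
    "Cache-Control": "public, max-age=0, must-revalidate",
    "Netlify-CDN-Cache-Control": `public, max-age=${cdnSeconds}, must-revalidate`,
  };
}

// 32-bit FNV-1a over UTF-16 code units, so identical input gives an identical key.
// Sketch only: 32 bits can collide, so use a stronger hash in production.
function cacheKey(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return `summary-${hash.toString(16).padStart(8, "0")}`;
}

// Hypothetical minimal store interface; any key/value store fits.
interface ResultStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

// Cache-aside: return the stored summary if one exists, otherwise compute
// it once, store it, and return it.
async function cachedSummary(
  documentText: string,
  store: ResultStore,
  summarize: (text: string) => Promise<string>,
): Promise<string> {
  const key = cacheKey(documentText);
  const cached = await store.get(key);
  if (cached !== null) return cached;
  const fresh = await summarize(documentText);
  await store.set(key, fresh);
  return fresh;
}

console.log(cacheKey("quarterly-report contents"));
console.log(cdnCacheHeaders(3600));
```

Because the stored result is returned verbatim on later requests, this also gives users the consistency described above: the same document yields the same summary until you decide to invalidate it.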
Wrapping up
The key takeaways for building AI on Netlify:
- Start with a problem that AI should solve.
- Decide your composable AI stack. With Netlify, you can evolve this stack over time.
- Start building by leveraging Netlify’s infrastructure - you focus on the feature, we will handle the infra.
- As you advance in complexity, Netlify will be there to let you choose the right tools and support the intensive workloads and patterns you’re building.
- Secure your endpoints using protections in your code and granular access controls.
- Improve the performance and consistency of AI experiences by leveraging caching, data stores, and the right providers.