13 Must-know Open-source Software to Build Production-ready AI Apps 🧙‍♂️🪄✨

Sunil Kumar Dash - Aug 15 - Dev Community

I've been developing both AI and non-AI applications for some time now. While creating a prototype can be relatively straightforward, building AI systems that are truly ready for the real world is a much more challenging task.

The software needs to be:

  • Reliable and well-maintained.
  • Compliant with security standards (SOC2, ISO, GDPR, etc.).
  • Scalable, performant, fail-safe, and so on.

Despite all the buzz around AI, the ecosystem for developing production-ready AI applications is still in its early stages.


However, considerable progress has been made recently, thanks to advancements in open-source software.

So, I have compiled a list of open-source software to help you build production-ready AI applications.

Click on the emojis to visit the section.

  1. Composio 👑 - Seamless Integration of Tools with LLMs 🔗
  2. Weaviate - The AI-native Database for AI Apps 🧠
  3. Haystack - Framework for Building Efficient RAG 🛠️
  4. LitGPT - Pretrain, Fine-tune, Deploy Models At Scale 🚀
  5. DSPy - Framework for Programming LLMs 💻
  6. Portkey’s Gateway - Reliably Route to 200+ LLMs with 1 Fast & Friendly API 🌐
  7. Airbyte - Reliable and Extensible Open-source Data Pipeline 🔄
  8. AgentOps - Agents Observability and Monitoring 🕵️‍♂️
  9. ArizeAI’s Phoenix - LLM Observability and Evaluation 🔥
  10. vLLM - Easy, Fast, and Cheap LLM Serving for Everyone 💨
  11. Vercel AI SDK - Easily Build AI-powered Products
  12. LangGraph - Building Language Agents as Graphs 🧩
  13. Taipy - Build Python Data & AI web applications 💫

Feel free to star and contribute to the repositories.


1. Composio 👑: Seamless Integration of Tools with LLMs 🔗

I have built my own tools for LLM tool calling and have used tools from LangChain and LlamaHub, but I was never satisfied with the accuracy, and integrations for many applications were unavailable.

However, this was not the case with Composio. It has over 100 tools and integrations, including but not limited to Gmail, Google Calendar, GitHub, Slack, Jira, etc.

It handles user authentication and authorization for integrations on your users' behalf. So you can build your AI applications in peace. And it’s SOC2 certified.

So, here’s how you can get started with it.

Python



pip install composio-core



Add a GitHub integration.



composio add github



Composio handles user authentication and authorization on your behalf.

Here is how you can use the GitHub integration to star a repository.



from openai import OpenAI
from composio_openai import ComposioToolSet, Action

openai_client = OpenAI(api_key="******OPENAIKEY******")

# Initialise the Composio Tool Set
composio_toolset = ComposioToolSet(api_key="******COMPOSIO_API_KEY******")

# Get GitHub tools that are pre-configured
actions = composio_toolset.get_actions(actions=[Action.GITHUB_ACTIVITY_STAR_REPO_FOR_AUTHENTICATED_USER])

my_task = "Star a repo ComposioHQ/composio on GitHub"

# Create a chat completion request to decide on the action
response = openai_client.chat.completions.create(
    model="gpt-4-turbo",
    tools=actions,  # Passing actions we fetched earlier.
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": my_task},
    ],
)

# Execute the tool call the model chose and print the result
result = composio_toolset.handle_tool_calls(response)
print(result)



Run this Python script to execute the given instruction using the agent.

Javascript

You can install it using npm, yarn, or pnpm.



npm install composio-core



Define a method to let the user connect their GitHub account.



import { OpenAI } from "openai";
import { OpenAIToolSet } from "composio-core";

const toolset = new OpenAIToolSet({
  apiKey: process.env.COMPOSIO_API_KEY,
});

async function setupUserConnectionIfNotExists(entityId) {
  const entity = await toolset.client.getEntity(entityId);
  const connection = await entity.getConnection('github');

  if (!connection) {
      // This entity/user hasn't connected the account yet, so start the login flow
      const newConnection = await entity.initiateConnection('github');
      console.log("Log in via: ", newConnection.redirectUrl);
      return newConnection.waitUntilActive(60);
  }

  return connection;
}



Add the required tools to the OpenAI SDK and pass the entity name on to the executeAgent function.



async function executeAgent(entityName) {
  const entity = await toolset.client.getEntity(entityName)
  await setupUserConnectionIfNotExists(entity.id);

  const tools = await toolset.get_actions({ actions: ["github_activity_star_repo_for_authenticated_user"] }, entity.id);
  const instruction = "Star a repo ComposioHQ/composio on GitHub"

  const client = new OpenAI({ apiKey: process.env.OPEN_AI_API_KEY })
  const response = await client.chat.completions.create({
      model: "gpt-4-turbo",
      messages: [{
          role: "user",
          content: instruction,
      }],
      tools: tools,
      tool_choice: "auto",
  })

  console.log(response.choices[0].message.tool_calls);
  await toolset.handle_tool_call(response, entity.id);
}

executeAgent("joey")



Execute the code and let the agent do the work for you.

Composio works with popular frameworks like LangChain, LlamaIndex, CrewAI, etc.

For more information, visit the official docs, and for even more complex examples, see the repository's example sections.


Star the Composio repository ⭐


2. Weaviate: The AI-native Database for AI Apps 🧠

If you want to build AI applications that depend on semantic retrieval, you need a vector database. Unlike traditional databases, a vector database can manage high-dimensional vector embeddings efficiently. It becomes essential for RAG-based applications.
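To make "semantic retrieval" concrete, here is a minimal sketch of what a vector database does at its core: rank stored embeddings by similarity to a query embedding. This is plain Python, not Weaviate's API. The three-dimensional vectors and the names `toy_store` and `nearest` are made up for illustration, and a real database would use approximate indexes instead of a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, store, limit=2):
    """Return the `limit` keys whose vectors are most similar to `query`."""
    ranked = sorted(
        store,
        key=lambda key: cosine_similarity(query, store[key]),
        reverse=True,
    )
    return ranked[:limit]

# Hypothetical 3-dimensional embeddings; real models emit hundreds of dimensions.
toy_store = {
    "DNA": [0.9, 0.1, 0.0],
    "enzyme": [0.8, 0.3, 0.1],
    "volcano": [0.1, 0.9, 0.2],
}
biology_query = [1.0, 0.2, 0.0]

print(nearest(biology_query, toy_store))
```

The brute-force scan is fine for a handful of vectors; the point of a dedicated database is doing the same ranking quickly over millions of them.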

Weaviate is one of the leading AI-native vector databases. It is fast, efficient, scalable, and has a rapidly growing community of developers.

It has SDKs for multiple programming languages, including Go, Python, JS/TS, and Java.

To get started with Weaviate, install the Python client.



pip install -U weaviate-client



Import the modules and create a Weaviate client.



import weaviate
import weaviate.classes as wvc
import os
import requests
import json

# Best practice: store your credentials in environment variables
wcd_url = os.environ["WCD_DEMO_URL"]
wcd_api_key = os.environ["WCD_DEMO_RO_KEY"]
openai_api_key = os.environ["OPENAI_APIKEY"]

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=wcd_url,                                    # Replace with your Weaviate Cloud URL
    auth_credentials=wvc.init.Auth.api_key(wcd_api_key),    # Replace with your Weaviate Cloud key
    headers={"X-OpenAI-Api-Key": openai_api_key}            # Replace with appropriate header key/value pair for the required API
)



Create a collection, load the data, and close the client.



try:
    # ===== define collection =====
    questions = client.collections.create(
        name="Question",
        vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),  # If set to "none", you must always provide vectors yourself. It could be any other "text2vec-*" also.
        generative_config=wvc.config.Configure.Generative.openai()  # Ensure the `generative-openai` module is used for generative queries.
    )

    # ===== import data =====
    resp = requests.get('https://raw.githubusercontent.com/weaviate-tutorials/quickstart/main/data/jeopardy_tiny.json')
    data = json.loads(resp.text)  # Load data

    question_objs = list()
    for i, d in enumerate(data):
        question_objs.append({
            "answer": d["Answer"],
            "question": d["Question"],
            "category": d["Category"],
        })

    questions = client.collections.get("Question")
    questions.data.insert_many(question_objs)

finally:
    client.close()  # Close client gracefully



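Now, query the collection. If you run this in the same session, reconnect first, because the previous block closed the client.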



try:
    # Replace with your code. Close the client gracefully in the final block.
    questions = client.collections.get("Question")

    response = questions.query.near_text(
        query="biology",
        limit=2
    )

    print(response.objects[0].properties)  # Inspect the first object

finally:
    client.close()  # Close client gracefully



For more information and implementation for other clients, refer to Weaviate’s documentation.


Star the Weaviate repository ⭐


3. Haystack: Framework for Building Efficient RAG 🛠️

If I were building a real-world RAG application, Haystack would be my choice. It is an orchestration framework for efficiently building RAG pipelines. It’s easy to use and reliable, with all the bells and whistles, such as re-rankers, document loaders, vector db support, RAG evaluators, etc.

Whether it's RAG, Q&A, or semantic searches, Haystack's highly composable pipelines make development, maintenance, and deployment a breeze.

Haystack is a Python-only framework; you can install it using pip.



pip install haystack-ai




Now, build your first RAG Pipeline with Haystack components.



import os

from haystack import Pipeline, PredefinedPipeline
import urllib.request

os.environ["OPENAI_API_KEY"] = "Your OpenAI API Key"
urllib.request.urlretrieve("https://www.gutenberg.org/cache/epub/7785/pg7785.txt", "davinci.txt")

indexing_pipeline =  Pipeline.from_template(PredefinedPipeline.INDEXING)
indexing_pipeline.run(data={"sources": ["davinci.txt"]})

rag_pipeline =  Pipeline.from_template(PredefinedPipeline.RAG)

query = "How old was he when he died?"
result = rag_pipeline.run(data={"prompt_builder": {"query":query}, "text_embedder": {"text": query}})
print(result["llm"]["replies"][0])




For more tutorials and concepts, check out their documentation.


Star the Haystack repository ⭐


4. LitGPT: Pretrain, Fine-tune, Deploy Models At Scale 🚀

In many cases, proprietary LLMs may not be sufficient, whether because of cost or the need for hyper-personalization. You will need LLMs tailored to your particular use cases. This is where model fine-tuning comes into the picture.

LitGPT from Lightning AI is arguably the best repository out there for fine-tuning LLMs on custom data. It offers 20+ LLMs to pre-train, fine-tune, and deploy at scale.

Every LLM is implemented from scratch with no abstractions and full control, making it blazing fast, minimal, and performant at the enterprise scale.

It supports popular recipes such as LoRA, QLoRA, FSDP, DPO, and PPO to fine-tune LLMs. It also has free Jupyter Notebooks with Nvidia Tesla T4 and Nvidia A10 GPUs in the Lightning Studio to play around with.
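To see why a recipe like LoRA is popular, a bit of arithmetic helps. LoRA freezes a weight matrix and trains two small low-rank factors instead. The sketch below only counts parameters; it is not LitGPT code, and the layer size and rank are illustrative assumptions.

```python
def full_params(d_out, d_in):
    """Parameters in one dense weight matrix of shape (d_out x d_in)."""
    return d_out * d_in

def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)

# Assumed values for illustration: a 4096 x 4096 projection and rank 8.
d = 4096
rank = 8

full = full_params(d, d)
lora = lora_trainable_params(d, d, rank)

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"LoRA trains {lora / full:.4%} of the layer")
```

With these numbers, the adapter trains 1/256 of what full fine-tuning would for that layer, which is where most of the memory savings come from.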

Install LitGPT



pip install 'litgpt[all]'



Load and use any of the 20+ LLMs:



from litgpt import LLM

llm = LLM.load("microsoft/phi-2")
text = llm.generate("Fix the spelling: Every fall, the familly goes to the mountains.")
print(text)
# Corrected Sentence: Every fall, the family goes to the mountains.



Check out the repository for more implementation examples.


Star the LitGPT repository ⭐


5. DSPy: Framework for Programming LLMs 💻

One of the main challenges hindering the widespread integration of LLMs into existing software systems is their stochastic nature. While these models generate the most probable outcomes for a given task, this contrasts with the deterministic nature of traditional software development.

Working around this requires tedious prompt engineering, hacking, and fine-tuning. DSPy bridges the gap by offering a systematic way of working with LLMs.

DSPy from Stanford simplifies this by doing two key things:

  1. Separating Program Flow from Parameters: This feature keeps your program's flow (the steps you take) separate from the details of how each step is done (the LM prompts and weights). This makes it easier to manage and update your system.
  2. Introducing New Optimizers: DSPy uses advanced algorithms that automatically fine-tune the LM prompts and weights based on your goals, such as improving accuracy or reducing errors.
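The two ideas above can be sketched in a few lines of plain Python. This is not DSPy's API, just the shape of the idea: `run_program` holds the fixed flow, the prompt template is a parameter, and a tiny "optimizer" picks the template that scores best on labelled examples. The `fake_lm` function is a hypothetical stand-in for a real model call.

```python
def run_program(question, prompt_template, lm):
    """Program flow: fixed steps, independent of the prompt wording."""
    prompt = prompt_template.format(question=question)
    return lm(prompt).strip()

def optimize_prompt(candidates, trainset, lm, metric):
    """Pick the template that scores best on the labelled examples."""
    def score(template):
        return sum(metric(run_program(q, template, lm), gold) for q, gold in trainset)
    return max(candidates, key=score)

def fake_lm(prompt):
    # Hypothetical stand-in for a model call: it adds the integers in the
    # question and only answers tersely when asked for "only the number".
    question = prompt.splitlines()[-1]
    total = sum(int(tok) for tok in question.replace("?", "").split() if tok.isdigit())
    return str(total) if "only the number" in prompt else f"The answer is {total}."

def exact_match(prediction, gold):
    return int(prediction == gold)

candidates = [
    "Answer the question.\n{question}",
    "Reply with only the number.\n{question}",
]
trainset = [("What is 2 + 3?", "5"), ("What is 10 + 4?", "14")]

best = optimize_prompt(candidates, trainset, fake_lm, exact_match)
print(repr(best))
print(run_program("What is 1 + 1?", best, fake_lm))
```

Notice that `run_program` never changes when the prompt does. That separation is what lets an optimizer search over prompts (or weights) without touching the program.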

Check out this Getting Started Notebook for more on how to work with DSPy.


Star the DSPy repository ⭐


6. Portkey’s Gateway: Reliably Route to over 200 LLMs with a unified API 🌐

While building AI applications, we usually depend on hosted LLMs from different providers. What if a model goes down? Downtime can be very costly to businesses, so you need an efficient model router that redirects requests from one provider to another.

Portkey does exactly that and more. It provides a unified API for more than 200 LLMs. It supports caching, load balancing, routing, and retries, and can be edge-deployed for minimum latency.
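If you are wondering what "routing with retries" boils down to, here is a rough sketch in plain Python. It is not the Gateway's implementation or API. The provider functions, `ProviderError`, and `route_with_fallback` are hypothetical names for illustration.

```python
import time

class ProviderError(Exception):
    """Raised by a provider callable when a request fails."""

def route_with_fallback(prompt, providers, retries=2, backoff_seconds=0.0):
    """Try each (name, callable) in order; retry, then fall through to the next."""
    errors = []
    for name, call in providers:
        for attempt in range(1, retries + 1):
            try:
                return name, call(prompt)
            except ProviderError as exc:
                errors.append(f"{name} attempt {attempt}: {exc}")
                time.sleep(backoff_seconds * attempt)  # simple linear backoff
    raise ProviderError("all providers failed: " + "; ".join(errors))

def flaky_primary(prompt):
    # Hypothetical provider that is currently down.
    raise ProviderError("503 service unavailable")

def healthy_backup(prompt):
    # Hypothetical provider that works.
    return f"echo: {prompt}"

used, reply = route_with_fallback(
    "What's a fractal?",
    [("primary", flaky_primary), ("backup", healthy_backup)],
)
print(used, reply)
```

A gateway does this for you outside your application code, so your app keeps calling one endpoint while the fallback logic lives in configuration.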

This is an essential piece in building fault-tolerant, robust AI systems. It is available in Python, Go, Rust, Java, Ruby, and Javascript.

Get started with Gateway by installing it.



pip install -qU portkey-ai openai



For OpenAI models,



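import os

from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

# Read both keys from environment variables (the variable names are an assumption)
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
PORTKEY_API_KEY = os.environ["PORTKEY_API_KEY"]

client = OpenAI(
    api_key=OPENAI_API_KEY,
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        provider="openai",
        api_key=PORTKEY_API_KEY
    )
)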

chat_complete = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": "What's a fractal?"}],
)

print(chat_complete.choices[0].message.content)



For Anthropic models,



import os

from openai import OpenAI
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

# Assumes both keys are set as environment variables
client = OpenAI(
    api_key=os.environ["ANTHROPIC_API_KEY"],
    base_url=PORTKEY_GATEWAY_URL,
    default_headers=createHeaders(
        provider="anthropic",
        api_key=os.environ["PORTKEY_API_KEY"],
    ),
)

response = client.chat.completions.create(
    model="claude-3-opus-20240229",
    messages=[{"role": "user",
               "content": "What's a fractal?"}],
    max_tokens=512,
)

print(response.choices[0].message.content)



For more information, visit the official repository.


Star the Gateway repository ⭐


7. Airbyte: Reliable and Extensible Open-source Data Pipeline 🔄

Data is critical for building AI applications, and in production, you need to handle large amounts of data from multiple sources. This is where Airbyte shines.

Airbyte provides a catalog of 300+ connectors for APIs, databases, data warehouses, and data lakes.

Airbyte also has a Python library called PyAirbyte. It supports popular frameworks like LangChain and LlamaIndex, allowing you to move data from multiple sources to your GenAI applications conveniently.

Check out this notebook for details on the implementation of PyAirbyte with LangChain.

For more information, check out the documentation.


Star the Airbyte repository ⭐


8. AgentOps: Agents Observability and Monitoring 🕵️‍♂️

Like traditional software systems, AI agents also require constant observation, monitoring, and improvements for optimum performance.

AgentOps is a leading solution in this space. It helps developers build, evaluate, and monitor AI agents, with tools that take agents from prototype to production.

It offers tools for replay analytics, LLM cost management, agent benchmarking, and compliance and security, and it integrates natively with frameworks like CrewAI, AutoGen, and LangChain.
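To get a feel for what monitoring an agent involves, here is a minimal sketch of the underlying idea: wrap each action, then record how long it took and whether it succeeded. This is plain Python and not the AgentOps SDK. The names `track_call` and `EVENTS` are hypothetical.

```python
import functools
import time

EVENTS = []  # In a real system these records would be shipped to a backend.

def track_call(name):
    """Decorator that logs the duration and outcome of each call under `name`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            status = "error"  # Assume failure until the call returns normally.
            try:
                result = func(*args, **kwargs)
                status = "ok"
                return result
            finally:
                EVENTS.append({
                    "action": name,
                    "status": status,
                    "seconds": time.perf_counter() - started,
                })
        return wrapper
    return decorator

@track_call("summarize")
def summarize(text):
    # Hypothetical agent step: keep only the first ten characters.
    return text[:10]

@track_call("explode")
def explode():
    # Hypothetical agent step that always fails.
    raise ValueError("boom")

summarize("hello world, this is a long input")
print(EVENTS[-1]["action"], EVENTS[-1]["status"])
```

An observability platform adds the parts that are tedious to build yourself on top of records like these: session replay, cost tracking per LLM call, and dashboards.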

Get started with AgentOps by installing it through pip.



pip install agentops



Initialize the AgentOps client and automatically get analytics on every LLM call.



import agentops

# Beginning of program's code (i.e. main.py, __init__.py)
agentops.init("<INSERT YOUR API KEY HERE>")

...

# (optional: record specific functions)
@agentops.record_action('sample function being recorded')
def sample_function(*args, **kwargs):
    ...

# End of program
agentops.end_session('Success')
# Woohoo You're done 🎉



Refer to their documentation for more.


Star the AgentOps repository ⭐


9. ArizeAI’s Phoenix: LLM Observability and Evaluation 🔥

As your AI app grows, keeping track of prompts, retrieval accuracy, and LLM responses becomes difficult.

Phoenix is an open-source AI observability platform for experimentation, evaluation, and troubleshooting. It offers tracing with OpenTelemetry, performance benchmarking, versioned datasets, experiment tracking, and inference analysis through visual tools.

Phoenix is vendor and language-agnostic, supporting frameworks like LlamaIndex, LangChain, DSPy, and LLM providers like OpenAI and Bedrock.

It can run in various environments, including Jupyter notebooks, local machines, containers, or the cloud.

It is easy to get started with Phoenix.



pip install arize-phoenix



To get started, launch the Phoenix app.



import phoenix as px
session = px.launch_app()



This will start the Phoenix server.

Now that Phoenix is up and running, you can set up tracing for your AI application and debug it as the traces stream in.

To use the one-click LlamaIndex integration, you must first install it:



pip install 'llama-index>=0.10.44'




import phoenix as px
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
import os
from gcsfs import GCSFileSystem
from llama_index.core import (
    Settings,
    VectorStoreIndex,
    StorageContext,
    set_global_handler,
    load_index_from_storage
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
import llama_index

# To view traces in Phoenix, you will first have to start a Phoenix server. You can do this by running the following:
session = px.launch_app()

# Initialize LlamaIndex auto-instrumentation
LlamaIndexInstrumentor().instrument()

os.environ["OPENAI_API_KEY"] = "<ENTER_YOUR_OPENAI_API_KEY_HERE>"

# LlamaIndex application initialization may vary 
# depending on your application
Settings.llm = OpenAI(model="gpt-4-turbo-preview")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

# Load your data and create an index. Here we've provided an example of our documentation
file_system = GCSFileSystem(project="public-assets-275721")
index_path = "arize-phoenix-assets/datasets/unstructured/llm/llama-index/arize-docs/index/"
storage_context = StorageContext.from_defaults(
    fs=file_system,
    persist_dir=index_path,
)

index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()

# Query your LlamaIndex application
query_engine.query("What is the meaning of life?")
query_engine.query("How can I deploy Arize?")

# View the traces in the Phoenix UI
print(px.active_session().url)



Once you've executed a sufficient number of queries (or chats) for your application, you can view the traces in the UI by refreshing the browser page.

Refer to their documentation for more tracing, dataset versioning, and evaluation examples.


Star the Phoenix repository ⭐


10. vLLM: Easy, Fast, and Cheap LLM Serving for Everyone 💨

There are many cases where you want to use open-source AI models, whether for privacy, convenience, or to avoid vendor lock-in. You will need software that lets you host any AI model anywhere and run inference on it.

vLLM is a fast and easy-to-use library for LLM inference and serving. It supports hardware from major vendors, including NVIDIA, AMD, Intel, and Google, with state-of-the-art inference throughput.

It supports model quantization, tensor parallelism for distributed inference, continuous batching of incoming requests, and more.
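Continuous batching is easier to appreciate with a toy simulation. The sketch below counts decoding steps for requests of different output lengths under two policies. It is a simplified model of my own, not vLLM's scheduler: it ignores prefill, memory limits, and everything else a real engine handles, and the request lengths are made up.

```python
def static_batching_steps(lengths, batch_size):
    """Each batch runs until its longest request finishes."""
    return sum(
        max(lengths[i:i + batch_size])
        for i in range(0, len(lengths), batch_size)
    )

def continuous_batching_steps(lengths, batch_size):
    """A freed slot is refilled from the queue before the next decoding step."""
    queue = list(lengths)
    running = []  # tokens still to generate for each in-flight request
    steps = 0
    while queue or running:
        # Admit waiting requests into any free slots.
        while queue and len(running) < batch_size:
            running.append(queue.pop(0))
        # One decoding step produces one token for every in-flight request.
        running = [remaining - 1 for remaining in running]
        # Finished requests leave the batch immediately.
        running = [remaining for remaining in running if remaining > 0]
        steps += 1
    return steps

# Hypothetical output lengths (in tokens) for four requests, two slots.
request_lengths = [2, 8, 3, 7]

print("static:", static_batching_steps(request_lengths, batch_size=2))
print("continuous:", continuous_batching_steps(request_lengths, batch_size=2))
```

In the static policy, a short request waits for the longest one in its batch, so its slot sits idle. Refilling slots as they free up is what recovers that wasted capacity.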

You can start with vLLM easily by installing it from pip.



pip install vllm



For offline batched inference:



from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="facebook/opt-125m")

outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")



You can find more in the documentation.


Star the vLLM repository ⭐


11. Vercel AI SDK: Easily Build AI-powered Products ⚡

If I were to build a full-stack AI-powered application right now, I would pick Vercel AI SDK in a heartbeat.

It’s a toolkit designed to let developers build AI web apps with React, Vue, Next.js, SvelteKit, etc.

Vercel AI SDK abstracts LLM providers, eliminates boilerplate code for building chatbots, and provides interactive visualization components for a rich customer experience.

It has three parts:

  • AI SDK Core: A single API for generating text, structured data, and tool interactions with LLMs.
  • AI SDK UI: Framework-independent hooks for quickly building chat and generative UIs.
  • AI SDK RSC: A library for streaming generative UIs with React Server Components (RSC).

To get started, install the library.



npm install ai



Install the model provider of your choice.



npm install @ai-sdk/openai



Call the OpenAI API.



import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai'; // Ensure OPENAI_API_KEY environment variable is set

async function main() {
  const { text } = await generateText({
    model: openai('gpt-4-turbo'),
    system: 'You are a friendly assistant!',
    prompt: 'Why is the sky blue?',
  });

  console.log(text);
}

main();



For more on Vercel AI SDK, visit their documentation.


Star the Vercel AI SDK repository ⭐


12. LangGraph: Building Language Agents as Graphs 🧩

LangGraph is easily one of the most capable frameworks for building efficient and reliable AI agents. As the name suggests, it models an agent as a graph of nodes and edges, and the graph may contain cycles.
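Here is a minimal sketch of that idea in plain Python, to show why cycles matter for agents. It does not use LangGraph's API. The node functions, the `run_graph` helper, and the "END" marker are hypothetical names for illustration.

```python
def run_graph(nodes, edges, start, state, max_steps=20):
    """Walk the graph: run a node, then ask its edge function where to go next."""
    current = start
    for _ in range(max_steps):
        if current == "END":
            return state
        state = nodes[current](state)
        current = edges[current](state)
    raise RuntimeError("graph did not reach END within max_steps")

def draft(state):
    # Hypothetical "agent" node: records one more attempt in the state.
    return {**state, "attempts": state["attempts"] + 1}

def review(state):
    # Hypothetical "critic" node: approves once three attempts exist.
    return {**state, "approved": state["attempts"] >= 3}

nodes = {"draft": draft, "review": review}
edges = {
    "draft": lambda state: "review",
    # Conditional edge: loop back to "draft" until the review passes.
    "review": lambda state: "END" if state["approved"] else "draft",
}

final_state = run_graph(nodes, edges, "draft", {"attempts": 0, "approved": False})
print(final_state)
```

The loop from "review" back to "draft" is the part a plain linear chain cannot express, and it is what lets an agent retry, reflect, or call tools until a condition is met.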

It is an extension of LangChain, so it has a huge community of AI developers building on it.

Get started with it using pip.



pip install -U langgraph



If you want to build agents/bots with LangGraph, check out our detailed blog on building a Gmail and Calendar assistant.

For more on LangGraph, visit the documentation.


Star the LangGraph repository ⭐


13. Taipy: Build AI apps in Python 💫

Taipy is Python-based open-source software for building AI web apps in a production environment. It takes Streamlit and Gradio demos to the next level by allowing you to build production-ready data and AI apps.

Taipy is designed for data scientists and machine learning engineers who build data and AI web applications. It lets you:

  1. Build production-ready web applications.
  2. Work entirely in Python, with no new languages to learn.
  3. Concentrate on data and AI algorithms without development and deployment complexities.

Easily get started with it using pip.



pip install taipy



This Taipy app demonstrates a basic film recommendation system. It filters a film dataset by genre and displays the top seven by popularity. Here's the complete frontend and backend code.



import taipy as tp
import pandas as pd
from taipy import Config, Scope, Gui

# Taipy Scenario & Data Management

# Filtering function - task
def filter_genre(initial_dataset: pd.DataFrame, selected_genre):
    filtered_dataset = initial_dataset[initial_dataset["genres"].str.contains(selected_genre)]
    filtered_data = filtered_dataset.nlargest(7, "Popularity %")
    return filtered_data

# Load the configuration made with Taipy Studio
Config.load("config.toml")
scenario_cfg = Config.scenarios["scenario"]

# Start Taipy Core service
tp.Core().run()

# Create a scenario
scenario = tp.create_scenario(scenario_cfg)

# Taipy User Interface
# Let's add a GUI to our Scenario Management for a full application

# Callback definition - submits scenario with genre selection
def on_genre_selected(state):
    scenario.selected_genre_node.write(state.selected_genre)
    tp.submit(scenario)
    state.df = scenario.filtered_data.read()

# Get the list of genres
genres = [
    "Action", "Adventure", "Animation", "Children", "Comedy", "Fantasy", "IMAX",
    "Romance", "Sci-FI", "Western", "Crime", "Mystery", "Drama", "Horror", "Thriller", "Film-Noir", "War", "Musical", "Documentary"
    ]

# Initialization of variables
df = pd.DataFrame(columns=["Title", "Popularity %"])
selected_genre = "Action"

## Set initial value to Action
def on_init(state):
    on_genre_selected(state)

# User interface definition
my_page = """
# Film recommendation

## Choose your favourite genre
<|{selected_genre}|selector|lov={genres}|on_change=on_genre_selected|dropdown|>

## Here are the top seven picks by popularity
<|{df}|chart|x=Title|y=Popularity %|type=bar|title=Film Popularity|>
"""

Gui(page=my_page).run()



Check out the documentation for more.


Star the Taipy repository ⭐


Thanks for reading the article. If you have more on your mind, do let me know in the comments below. 👇
