TL;DR

For me, AI is everywhere. Everyone wants to do AI.
But sometimes, it is hard to know which tools to master to implement AI features in your apps successfully.

So, I have curated a list of repositories where you can learn to master AI witchcraft.

1. Composio 👑: Build AI automation 10x faster 🚀

Tools and integrations form the core of building AI agents.

I have been building AI tools and agents, but tool accuracy was always an issue until I came across Composio.

Composio makes it easier to integrate popular applications like GitHub, Slack, Jira, and Airtable with AI agents to build complex automation.

It handles user authentication and authorization for integrations on your users' behalf. So you can build your AI applications in peace. And it’s SOC2 certified.

So, here’s how you can get started with it.

Python

pip install composio-core

Add a GitHub integration.

composio add github

Composio handles user authentication and authorization on your behalf.

Here is how you can use the GitHub integration to star a repository.

from openai import OpenAI
from composio_openai import ComposioToolSet, Action

openai_client = OpenAI(api_key="******OPENAIKEY******")

# Initialise the Composio Tool Set
composio_toolset = ComposioToolSet(api_key="******COMPOSIO_API_KEY******")

# Get GitHub tools that are pre-configured
actions = composio_toolset.get_actions(actions=[Action.GITHUB_ACTIVITY_STAR_REPO_FOR_AUTHENTICATED_USER])

my_task = "Star a repo ComposioHQ/composio on GitHub"

# Create a chat completion request to decide on the action
response = openai_client.chat.completions.create(
    model="gpt-4-turbo",
    tools=actions,  # Passing actions we fetched earlier.
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": my_task},
    ],
)

# Execute the tool call chosen by the model
result = composio_toolset.handle_tool_calls(response)
print(result)

Run this Python script to execute the given instruction using the agent.
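To make the flow above concrete, here is a minimal, library-free sketch of what executing a tool call involves: the model returns a tool name plus JSON-encoded arguments, and the application looks up the matching function and calls it. The `star_repo` function, the `TOOLS` registry, and the hard-coded `tool_call` dictionary are hypothetical stand-ins, not Composio's or OpenAI's API; with Composio, the toolset takes care of this step and of the authenticated GitHub request.

```python
import json


def star_repo(owner: str, repo: str) -> str:
    """Hypothetical tool: a real one would call the GitHub API."""
    return f"starred {owner}/{repo}"


# Registry mapping tool names (as the model sees them) to Python callables.
TOOLS = {"star_repo": star_repo}


def dispatch(tool_call: dict) -> str:
    """Look up the requested tool and call it with the decoded arguments."""
    name = tool_call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(tool_call["arguments"])
    return TOOLS[name](**arguments)


# Shape loosely modelled on a chat-completion tool call; hard-coded for illustration.
tool_call = {
    "name": "star_repo",
    "arguments": '{"owner": "ComposioHQ", "repo": "composio"}',
}
result = dispatch(tool_call)
print(result)
```

The registry lookup is also where tool accuracy matters: if the model picks the wrong name or malformed arguments, the call fails before anything reaches the external service.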

Composio works with popular frameworks such as LangChain, LlamaIndex, and CrewAI.

For more information, visit the official docs, and for even more complex examples, see the repository's example sections.

Star the Composio repository ⭐

2. Unsloth: Faster training and finetuning of AI models

Training and fine-tuning Large Language Models (LLMs) are crucial parts of AI engineering.

Proprietary models do not fit every use case, whether because of cost, personalization, or privacy. At some point, you will need to fine-tune a model on a custom dataset, and right now Unsloth is one of the best libraries for fine-tuning and training LLMs.

It supports full, LoRA, and QLoRA fine-tuning of popular LLMs, including Llama-3 and Mistral, and derivatives such as Yi and OpenHermes. It implements custom Triton kernels and a manual backprop engine to speed up model training.
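To see why LoRA is so much cheaper than full fine-tuning, a back-of-the-envelope calculation helps. LoRA freezes a weight matrix of shape d_out x d_in and trains two small matrices of rank r instead, so the trainable parameter count drops from d_out * d_in to r * (d_in + d_out). The sketch below assumes a hypothetical 4096 x 4096 projection matrix and r = 16; the dimensions are illustrative and not taken from any specific model config.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for LoRA: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


d_in = d_out = 4096  # assumed, illustrative layer size
r = 16               # LoRA rank

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, r)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

For these assumed sizes, the adapter trains under 1% of the parameters of the full matrix, which is where most of the memory savings come from.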

To start with Unsloth, install it using pip and make sure you have torch 2.4 and CUDA 12.1.

pip install --upgrade pip
pip install "unsloth[cu121-torch240] @ git+https://github.com/unslothai/unsloth.git"

Here is a simple script that fine-tunes a Mistral model on a dataset using supervised fine-tuning (SFT):

from unsloth import FastLanguageModel 
from unsloth import is_bfloat16_supported
import torch
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset
max_seq_length = 2048 # Supports RoPE Scaling internally, so choose any!
# Get LAION dataset
url = "https://huggingface.co/datasets/laion/OIG/resolve/main/unified_chip2.jsonl"
dataset = load_dataset("json", data_files = {"train" : url}, split = "train")

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/mistral-7b-v0.3-bnb-4bit",      # New Mistral v3 2x faster!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-v0.3-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = None,
    load_in_4bit = True,
)

# Do model patching and add fast LoRA weights
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    max_seq_length = max_seq_length,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

trainer = SFTTrainer(
    model = model,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    tokenizer = tokenizer,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 10,
        max_steps = 60,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        output_dir = "outputs",
        optim = "adamw_8bit",
        seed = 3407,
    ),
)
trainer.train()

For more information, refer to the official documentation.

Star the Unsloth repository ⭐

3. DSPy: Framework for programming LLMs

One factor hampering the use of LLMs in production use cases is their stochastic nature. Prompting them to output the desired response has a high failure rate for these use cases.

DSPy addresses this problem. Instead of relying on hand-written prompts, it lets you program LLMs for greater reliability.

DSPy simplifies this by doing two key things:

Separating Program Flow from Parameters: This feature keeps your program's flow (the steps you take) separate from the details of how each step is done (the LM prompts and weights). This makes it easier to manage and update your system.
Introducing New Optimizers: DSPy uses advanced algorithms that automatically fine-tune the LM prompts and weights based on your goals, such as improving accuracy or reducing errors.
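The idea behind both points can be illustrated without DSPy itself. In the sketch below, the program flow (`answer`) is fixed, the prompt template is a parameter, and a tiny "optimizer" picks the template that scores best on a few labelled examples. Everything here is hypothetical: `fake_lm` is a stub standing in for a real model call, and `optimize` is a brute-force search, not one of DSPy's optimizers.

```python
from typing import Callable


def fake_lm(prompt: str) -> str:
    """Stub model: only answers tersely when told to reply with a number."""
    question = prompt.rsplit("Q: ", 1)[-1]
    a, b = (int(x) for x in question.rstrip("?").split(" + "))
    if "number only" in prompt:
        return str(a + b)
    return f"The answer is {a + b}."


def answer(template: str, question: str, lm: Callable[[str], str]) -> str:
    """Fixed program flow: fill in the template, call the model, clean up."""
    return lm(template.format(question=question)).strip()


def optimize(templates, devset, lm):
    """Pick the template with the highest exact-match accuracy on devset."""
    def accuracy(template):
        hits = sum(answer(template, q, lm) == gold for q, gold in devset)
        return hits / len(devset)
    return max(templates, key=accuracy)


templates = [
    "Q: {question}",
    "Reply with a number only. Q: {question}",
]
devset = [("2 + 3?", "5"), ("10 + 7?", "17")]

best = optimize(templates, devset, fake_lm)
print(best)
```

Because the flow in `answer` never changes, swapping the model or the metric only requires re-running `optimize`, which is the separation the two points above describe.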

Check out this Getting Started Notebook for more on how to work with DSPy.

Star the DSPy repository ⭐

4. Taipy: Build AI web apps faster with Python

Taipy is an open-source Python library for building AI web apps for production environments. It complements tools like Streamlit and Gradio by letting Python developers take demo apps to production.

Taipy is designed for data scientists and machine learning engineers to build data & AI web applications.

Enables building production-ready web applications
No need to learn new languages. Only Python is needed.
Concentrate on Data and AI algorithms without development and deployment complexities.

Quickly get started with it using pip.

pip install taipy

This simple application shows how to build a basic film recommendation system with Taipy.

import taipy as tp
import pandas as pd
from taipy import Config, Scope, Gui

# Defining the helper functions

# Callback definition - submits scenario with genre selection
def on_genre_selected(state):
    scenario.selected_genre_node.write(state.selected_genre)
    tp.submit(scenario)
    state.df = scenario.filtered_data.read()

## Set initial value to Action
def on_init(state):
    on_genre_selected(state)

# Filtering function - task
def filter_genre(initial_dataset: pd.DataFrame, selected_genre):
    filtered_dataset = initial_dataset[initial_dataset["genres"].str.contains(selected_genre)]
    filtered_data = filtered_dataset.nlargest(7, "Popularity %")
    return filtered_data

# The main script
if __name__ == "__main__":
    # Taipy Scenario & Data Management

    # Load the configuration made with Taipy Studio
    Config.load("config.toml")
    scenario_cfg = Config.scenarios["scenario"]

    # Start Taipy Core service
    tp.Core().run()

    # Create a scenario
    scenario = tp.create_scenario(scenario_cfg)

    # Taipy User Interface
    # Let's add a GUI to our Scenario Management for a complete application

    # Get list of genres
    genres = [
        "Action", "Adventure", "Animation", "Children", "Comedy", "Fantasy", "IMAX",
        "Romance", "Sci-FI", "Western", "Crime", "Mystery", "Drama", "Horror", "Thriller", "Film-Noir", "War", "Musical", "Documentary"
    ]

    # Initialization of variables
    df = pd.DataFrame(columns=["Title", "Popularity %"])
    selected_genre = "Action"

    # User interface definition
    my_page = """
# Film recommendation

## Choose your favorite genre
<|{selected_genre}|selector|lov={genres}|on_change=on_genre_selected|dropdown|>

## Here are the top seven picks by popularity
<|{df}|chart|x=Title|y=Popularity %|type=bar|title=Film Popularity|>
    """

    Gui(page=my_page).run()

Check out the documentation for more.

Star the Taipy repository ⭐

5. Phidata: Build LLM agents with memory.

Often, building agents that work may not be as easy as it sounds. Managing memory, caching, and tool execution can become challenging.

Phidata is an open-source framework that offers a convenient and reliable way to build agents with long-term memory, contextual knowledge, and the ability to take action using function calls.

Get started with Phidata by installing via pip

pip install -U phidata

Let’s create a simple assistant that can query financial data.

from phi.assistant import Assistant
from phi.llm.openai import OpenAIChat
from phi.tools.yfinance import YFinanceTools

assistant = Assistant(
    llm=OpenAIChat(model="gpt-4o"),
    tools=[YFinanceTools(stock_price=True, analyst_recommendations=True, company_info=True, company_news=True)],
    show_tool_calls=True,
    markdown=True,
)
assistant.print_response("What is the stock price of NVDA")
assistant.print_response("Write a comparison between NVDA and AMD, use all tools available.")

Here is an assistant that can search the web.

from phi.assistant import Assistant
from phi.tools.duckduckgo import DuckDuckGo

assistant = Assistant(tools=[DuckDuckGo()], show_tool_calls=True)
assistant.print_response("What's happening in France?", markdown=True)

Refer to the official documentation for examples and information.

Star the Phidata repository ⭐

6. Phoenix: LLM observability made efficient

An AI application is not complete without an observability layer. An LLM application usually has many moving parts, such as prompts, model temperature, and top-p, and even a slight change to any of them can significantly affect outcomes.

This can make applications unstable and unreliable, which is where LLM observability comes into the picture. Arize AI’s Phoenix makes it convenient to track the entire trace of an LLM execution.

It is an open-source AI observability platform designed for experimentation, evaluation, and troubleshooting. It provides:

Tracing - Trace your LLM application's runtime using OpenTelemetry-based instrumentation.
Evaluation - Leverage LLMs to benchmark your application's performance using response and retrieval evals.
Datasets - Create versioned datasets of examples for experimentation, evaluation, and fine-tuning.
Experiments - Track and evaluate prompts, LLMs, and retrieval changes.
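To make the tracing idea concrete, here is a minimal sketch of what a trace is: a tree of timed spans, one per step of the application. It uses only the standard library; the `span` helper, the `TRACE` list, and the step names are hypothetical and much simpler than the OpenTelemetry-based instrumentation Phoenix relies on.

```python
import time
from contextlib import contextmanager

TRACE = []   # finished spans, in completion order
_stack = []  # names of currently open spans


@contextmanager
def span(name: str):
    """Record the name, parent, and duration of a block of work."""
    parent = _stack[-1] if _stack else None
    _stack.append(name)
    start = time.perf_counter()
    try:
        yield
    finally:
        _stack.pop()
        TRACE.append({
            "name": name,
            "parent": parent,
            "seconds": time.perf_counter() - start,
        })


# A pretend RAG request: retrieval and generation nested under one root span.
with span("query"):
    with span("retrieve"):
        time.sleep(0.01)
    with span("generate"):
        time.sleep(0.01)

for s in TRACE:
    print(f"{s['name']:<10} parent={s['parent']}  {s['seconds']:.3f}s")
```

Child spans finish before their parent, so the root `query` span is recorded last; an observability UI uses the parent links to rebuild the tree.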

Phoenix is vendor- and language-agnostic, supporting frameworks like LlamaIndex, LangChain, and DSPy, and LLM providers like OpenAI and Bedrock.

It can run in various environments, including Jupyter notebooks, local machines, containers, or the cloud.

It is easy to get started with Phoenix.

pip install arize-phoenix

To get started, launch the Phoenix app.

import phoenix as px
session = px.launch_app()

This will start the Phoenix server.

You can now set up tracing for your AI application and debug it as the traces stream in.

To use LlamaIndex's one-click integration, you must install the small integration first:

pip install 'llama-index>=0.10.44' openinference-instrumentation-llama-index gcsfs

import phoenix as px
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor
import os
from gcsfs import GCSFileSystem
from llama_index.core import (
    Settings,
    VectorStoreIndex,
    StorageContext,
    set_global_handler,
    load_index_from_storage
)
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
import llama_index

# To view traces in Phoenix, you will first have to start a Phoenix server. You can do this by running the following:
session = px.launch_app()

# Initialize LlamaIndex auto-instrumentation
LlamaIndexInstrumentor().instrument()

os.environ["OPENAI_API_KEY"] = "<ENTER_YOUR_OPENAI_API_KEY_HERE>"

# LlamaIndex application initialization may vary
# depending on your application
Settings.llm = OpenAI(model="gpt-4-turbo-preview")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

# Load your data and create an index. Here we've provided an example of our documentation
file_system = GCSFileSystem(project="public-assets-275721")
index_path = "arize-phoenix-assets/datasets/unstructured/llm/llama-index/arize-docs/index/"
storage_context = StorageContext.from_defaults(
    fs=file_system,
    persist_dir=index_path,
)

index = load_index_from_storage(storage_context)

query_engine = index.as_query_engine()

# Query your LlamaIndex application
query_engine.query("What is the meaning of life?")
query_engine.query("How can I deploy Arize?")

# View the traces in the Phoenix UI
px.active_session().url

Once you have run enough queries (or chats) through your application, refresh the Phoenix UI in your browser to view the traces.

Refer to their documentation for more tracing, dataset versioning, and evaluation examples.

Star the Phoenix repository ⭐

7. Airbyte: Reliable and Extensible data pipeline

Data is essential for building AI applications, especially in production, where you must manage large volumes of data from various sources. Airbyte excels at this.

Airbyte offers an extensive catalogue of over 300 connectors for APIs, databases, data warehouses, and data lakes.

Airbyte also features a Python extension called PyAirbyte. This extension supports popular frameworks like LangChain and LlamaIndex, making it easy to move data from multiple sources to your GenAI applications.

Check out this notebook for details on the implementation of PyAirbyte with LangChain.

For more information, check out the documentation.

Star the Airbyte repository ⭐

8. AgentOps: Agent monitoring and Observability

Just like traditional software systems, AI agents require continuous monitoring and observation. This is important to ensure the agent’s behaviour does not deviate from expectations.

AgentOps offers a comprehensive solution for monitoring and observing AI agents.

It offers tools for replay analytics, LLM cost management, agent benchmarking, and compliance and security, and it integrates natively with frameworks like CrewAI, AutoGen, and LangChain.

Get started with AgentOps by installing it through pip.

pip install agentops

Initialize the AgentOps client and automatically get analytics on every LLM call.

import agentops

# Beginning of program's code (e.g. main.py, __init__.py)
agentops.init("<INSERT YOUR API KEY HERE>")

...

# (optional: record specific functions)
@agentops.record_action('sample function being recorded')
def sample_function(*args, **kwargs):
    ...

# End of program
agentops.end_session('Success')
# Woohoo You're done 🎉

Refer to their documentation for more.

Star the AgentOps repository ⭐

9. RAGAS: Framework for RAG evaluation

Building RAG pipelines is challenging, and determining their effectiveness in real-world scenarios is another challenge altogether. Despite advancements in frameworks for RAG applications, ensuring their reliability for real users remains difficult, especially when the cost of incorrect retrievals is high.

RAGAS is a framework designed to solve this problem. It helps you evaluate your Retrieval Augmented Generation (RAG) pipelines.

It helps you generate synthetic test sets, test your RAG pipelines against them, and monitor your RAG app in production.
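As a rough illustration of what evaluating the retrieval half of a RAG pipeline means, the sketch below scores a retriever's output against hand-labelled relevant documents using plain precision and recall. This is a simplified stand-in, not RAGAS's API or its metric definitions, and the document IDs are made up for the example.

```python
def precision_recall(retrieved: list[str], relevant: set[str]) -> tuple[float, float]:
    """Precision: share of retrieved docs that are relevant.
    Recall: share of relevant docs that were retrieved."""
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical retriever output and ground-truth labels for one question.
retrieved = ["doc1", "doc4", "doc7", "doc9"]
relevant = {"doc1", "doc2", "doc7"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running a check like this over a whole test set, rather than a single question, is what turns spot checks into a repeatable evaluation.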

Check out the documentation to understand how to use RAGAS to improve your new and existing RAG pipelines.

Star the RAGAS repository ⭐

Thank you for reading this article. Comment below if you have built or used any other open-source AI repository.

9 essential open-source libraries to master as an AI developer 🧙‍♂️ 🪄