In this issue:
Milvus Magic: Image Search & Smart Shopping!
Multimodal RAG resources
New! GenAI Resource Hub
Community Spotlights
Learn Milvus Live!
🛍️Building a Multimodal Product Recommender Demo Using Milvus and Streamlit
Ever wished you could find products just by showing a picture and describing what you want? Well, now you can! 🛍️✨Upload an image, enter text instructions, and find the closest-matching Amazon products from the Milvus vector database. Follow the step-by-step tutorial here.
🧙Key technologies to make the magic happen:
🔎MagicLens - multimodal embedding model that uses a dual-encoder architecture, built on CLIP (OpenAI, 2021) or CoCa (Google Research, 2022), to embed text and images.
💻GPT-4o - OpenAI’s generative multimodal large language model, which handles text, images, and other data types in a single model rather than text alone.
🐦Milvus - open-source, distributed vector database for storing, indexing, and searching vectors for Generative AI workloads.
🌐Streamlit - open-source Python library that simplifies creating and running web applications.
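To make the retrieval step concrete, here is a minimal sketch of what "find the closest-matching products" means once everything is a vector. It is a toy stand-in, not the demo's code and not the Milvus API: the 3-d vectors are made up by hand (real embeddings come from a model such as MagicLens and have hundreds of dimensions), and `top_k` is a hypothetical helper that scans every item, whereas a vector database typically uses an index to avoid a full scan.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query, catalog, k=2):
    """Hypothetical helper: return the k catalog names most similar to the query."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, normalize(vec))), name)
        for name, vec in catalog.items()
    )
    return [name for _, name in heapq.nlargest(k, scored)]


# Made-up "product embeddings" (assumption: illustrative values only).
catalog = {
    "red running shoes": [0.9, 0.1, 0.0],
    "blue running shoes": [0.7, 0.6, 0.1],
    "red handbag": [0.2, 0.1, 0.9],
    "blue backpack": [0.1, 0.7, 0.7],
}

# Stands in for the single query vector a multimodal model would produce
# from an uploaded shoe photo plus the instruction "in blue".
query = [0.8, 0.5, 0.0]

results = top_k(query, catalog, k=2)
print(results)
```

In the demo described above, the vectors would be stored in and searched by Milvus, and Streamlit would handle the image upload and result display.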
🧠 Multimodal RAG Resources
Don’t know where to get started with Multimodal RAG? Check out these resources from basic to advanced!
LEARN - What is Multimodal RAG?
Multimodal RAG extends the RAG framework to multimodal data: text, images, audio, video, and more. Real-world applications, challenges, advantages, and more are highlighted below.
STEP ONE - Multimodal Embeddings
What’s the first step to building a multimodal retrieval-augmented generation (RAG) app? Getting multimodal vector embeddings. Use CLIP to create embeddings of the input data (sometimes termed “multimodal embeddings”), Milvus to store them, and FiftyOne to explore them.
🐴🚗 “Pony” is clearly a horse, “Ferrari” is clearly a car, but “Mustang” could be either.
Let’s compare images for words whose meaning depends on context.
Exploring Multimodal Embeddings with FiftyOne and Milvus
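The "Mustang" ambiguity can be illustrated with a few lines of Python. This is a sketch under loud assumptions: the 2-d vectors below are invented by hand so that one axis roughly means "animal" and the other "vehicle". They are not CLIP outputs, and real embedding dimensions do not have such tidy meanings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented vectors (assumption): axis 0 ~ "animal", axis 1 ~ "vehicle".
words = {
    "pony": [1.0, 0.05],
    "ferrari": [0.05, 1.0],
    "mustang": [0.7, 0.7],
}
concepts = {"horse": [1.0, 0.0], "car": [0.0, 1.0]}

# Similarity of every word to every concept.
scores = {
    word: {name: cosine(vec, concept) for name, concept in concepts.items()}
    for word, vec in words.items()
}

for word, by_concept in scores.items():
    print(word, {name: round(value, 2) for name, value in by_concept.items()})
```

With these toy values, "pony" lands close to "horse", "ferrari" close to "car", and "mustang" sits equally near both, which is the kind of pattern worth looking for when exploring real embeddings.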
FOLLOW ALONG - Multimodal RAG Pipelines
This resource focuses on how to use data to build a better multimodal RAG pipeline. It emphasizes using free and open-source tools, specifically leveraging FiftyOne for data management and visualization, Milvus as a vector store, and LlamaIndex for orchestrating large language models (LLMs).
Build Better Multimodal RAG Pipelines with FiftyOne, LlamaIndex, and Milvus
FOLLOW ALONG - Multimodal RAG locally with CLIP and Llama3
The core idea behind the CLIP (Contrastive Language-Image Pretraining) model is to learn the connection between images and text. Learn how to generate embeddings with the CLIP ViT-B/32 model and use Llama3 as the LLM to build multimodal RAG.
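For readers curious what "contrastive" means in practice, here is a minimal pure-Python sketch of a symmetric contrastive objective in the style CLIP uses: matched image/text pairs should score higher than mismatched ones, in both directions. The vectors are invented, the fixed `temperature` value is an assumption, and this is an illustration of the idea rather than CLIP's training code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def cross_entropy(logits, target_index):
    """Negative log-softmax of the target entry, computed stably."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[target_index]


def contrastive_loss(image_vecs, text_vecs, temperature=0.07):
    """Average of image-to-text and text-to-image losses; pair i matches pair i."""
    n = len(image_vecs)
    sim = [[cosine(img, txt) / temperature for txt in text_vecs] for img in image_vecs]
    image_to_text = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    text_to_image = sum(
        cross_entropy([sim[r][j] for r in range(n)], j) for j in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2


# Invented embeddings (assumption): image i and text i describe the same thing.
images = [[1.0, 0.1], [0.1, 1.0]]
texts_matched = [[0.9, 0.2], [0.2, 0.9]]
texts_swapped = [texts_matched[1], texts_matched[0]]

loss_matched = contrastive_loss(images, texts_matched)
loss_swapped = contrastive_loss(images, texts_swapped)
print(loss_matched < loss_swapped)
```

The loss is low when each image is most similar to its own caption and high when the captions are swapped. Training pushes the encoders toward the low-loss arrangement, which is what makes the resulting embeddings useful for retrieval.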
GenAI Resource Hub
The new GenAI Resource Hub has tutorials, code examples, and best practices for developing and deploying GenAI applications.
🎓 Learn
Basic concepts developers need to understand to create RAG/GenAI applications
🛠️ Build
Practical resources to help you build sample RAG/GenAI applications
🌎 Explore
After building your GenAI/RAG demo, learn what it takes to deploy it into production effectively
👥 Community Spotlights
We see you 👀building cool apps with Milvus! Need inspiration? Check out these fun projects 🔥:
🩺MediSaga by KF Surya: a retrieval-augmented generation (RAG) chatbot application designed to answer medical questions!
https://www.linkedin.com/feed/update/urn:li:activity:7221854199649607680/
📄PDF Content Analyzer by sarthakgarg07: a web application that extracts, processes, and analyzes content from PDF files.
📚Textbook Question Answering System by Aryaman Tiwari: a system that uses natural language processing and machine learning techniques to answer questions based on textbook content.
https://www.linkedin.com/feed/update/urn:li:activity:7221782938726588417/
🎓 Learn Milvus Live!
Join us for upcoming virtual and in-person events to learn Milvus live.
Aug 8: Building an Agentic RAG locally with Milvus, Ollama, and Llama Agents (virtual)
With the recent release of Llama Agents, we can now build agents that are async-first and run as their own service. During this webinar, Stephen will show you how to build an Agentic RAG system using Llama Agents and Milvus.
Aug 13: South Bay Unstructured Data Meetup (in-person)
We’ll be back at SAP in Palo Alto for our meetup! Talks from TwelveLabs, Zilliz, and more coming soon.
Aug 13: New York Unstructured Data Meetup (in-person)
We’ve got a stacked speaker lineup for the New York meetup! Join us for the following AI talks:
▶️ Quick intro to unstructured data, edge AI, and Milvus
▶️ Modern Analytics & Reporting with Milvus Vector DB and GenAI
▶️ cuVS+Milvus
▶️ Combining Hugging Face Transformer Models and Visual Data with FiftyOne
👾 Discord
Join our Discord channel to engage with our engineers and community members.
Enjoying Milvus?
⭐ Give us a Star on GitHub!