RAG Explained: Indexing and Vector Database

Ďēv Šhãh 🥑 - Nov 2 - - Dev Community

Disclaimer

This blog reflects my learnings from "Augment your LLM Using Retrieval Augmented Generation" by NVIDIA. The content is based on topics covered in the course and my understanding of them through some practical examples. If you are not sure what RAG is, I suggest checking out my blog on the topic. To follow this post, I also highly recommend reading the previous blogs in this series first.

Indexing

Need for Efficient Search

Let’s start with an example of vectors representing words:

  • King: [0.9, 0.2, 0.8]
  • Queen: [0.9, 0.1, 0.7]
  • Man: [0.7, 0.3, 0.9]
  • Woman: [0.7, 0.2, 0.8]

If we want to find the most similar word to ‘Queen’ (the query), one brute-force method is to compare the query vector with every vector in the database. However, in large datasets with millions of vectors, this approach becomes computationally expensive.
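To make the brute-force idea concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the similarity measure (the course material may use a different one), and the function names are my own, chosen only for illustration.

```python
import math

# Toy 3-dimensional embeddings from the example above.
VECTORS = {
    "King": [0.9, 0.2, 0.8],
    "Queen": [0.9, 0.1, 0.7],
    "Man": [0.7, 0.3, 0.9],
    "Woman": [0.7, 0.2, 0.8],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_search(query_word, vectors):
    """Compare the query against every other vector and return the best match."""
    query = vectors[query_word]
    scores = {
        word: cosine_similarity(query, vec)
        for word, vec in vectors.items()
        if word != query_word  # skip the query itself
    }
    best = max(scores, key=scores.get)
    return best, scores


best, scores = brute_force_search("Queen", VECTORS)
print(best)  # King
```

Every query touches every stored vector, so the cost grows linearly with the size of the database. That is fine for four words, but not for millions.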

Solution to Fast Search

This is where indexing comes into play. In simple terms, indexing involves organizing vectors into groups based on similarity. Let’s continue with our example:

  • Cluster 1: King and Queen
  • Cluster 2: Man and Woman

If we want to search for the vector most similar to 'Queen,' we can first identify which cluster the query belongs to and then search within that cluster alone. This dramatically reduces the number of comparisons needed, speeding up the search process. Indexes are created when embedding vector arrays need to be stored in a vector database.
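The two-step search described above can be sketched as follows. This is only an illustration under a few assumptions: the clusters are hand-assigned exactly as in the example (a real index would typically learn them, for instance with k-means), each cluster is represented by the average of its vectors, and cosine similarity is used for comparison. All names in the code are hypothetical.

```python
import math

VECTORS = {
    "King": [0.9, 0.2, 0.8],
    "Queen": [0.9, 0.1, 0.7],
    "Man": [0.7, 0.3, 0.9],
    "Woman": [0.7, 0.2, 0.8],
}

# Hand-assigned clusters taken from the example, not learned from data.
CLUSTERS = {
    "Cluster 1": ["King", "Queen"],
    "Cluster 2": ["Man", "Woman"],
}


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(words):
    """Average the vectors of a cluster, dimension by dimension."""
    vecs = [VECTORS[w] for w in words]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]


CENTROIDS = {name: centroid(words) for name, words in CLUSTERS.items()}


def clustered_search(query_word):
    query = VECTORS[query_word]
    # Step 1: pick the cluster whose centroid is most similar to the query.
    nearest = max(CENTROIDS, key=lambda name: cosine_similarity(query, CENTROIDS[name]))
    # Step 2: compare the query only with the members of that cluster.
    candidates = [w for w in CLUSTERS[nearest] if w != query_word]
    best = max(candidates, key=lambda w: cosine_similarity(query, VECTORS[w]))
    return nearest, best


print(clustered_search("Queen"))  # ('Cluster 1', 'King')
```

With only four vectors there is no real saving, but with millions of vectors and a modest number of clusters, most of the database is never touched. The trade-off is that the result is approximate: the true nearest neighbour may occasionally sit in a cluster that was skipped.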

Vector Database

A vector database is a specialized database designed to store and search high-dimensional vectors, such as embeddings. For example, the vectors representing “King”, “Queen”, etc. are stored in this database.

These databases enable fast and accurate search and retrieval based on vector similarity. For example, if you pass the keyword “Royal” as a query, the database would return vectors like “King” and “Queen” because their embeddings are similar to that of “Royal”. Some databases are specifically built for vector search, while others offer vector search as an additional functionality.
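As a rough mental model, a vector database can be pictured as a store with an "add" and a "search by similarity" operation. The sketch below is a toy in-memory version, not the API of any real product. The class name `TinyVectorStore` is hypothetical, and the embedding for “Royal” is a made-up vector chosen for illustration; in practice it would come from the same embedding model as the stored words.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Toy in-memory stand-in for a vector database (hypothetical, for illustration)."""

    def __init__(self):
        self._items = {}  # label -> embedding vector

    def add(self, label, vector):
        self._items[label] = vector

    def search(self, query_vector, top_k=2):
        """Return the labels of the top_k most similar stored vectors."""
        scored = [
            (label, cosine_similarity(query_vector, vec))
            for label, vec in self._items.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return [label for label, _ in scored[:top_k]]


store = TinyVectorStore()
store.add("King", [0.9, 0.2, 0.8])
store.add("Queen", [0.9, 0.1, 0.7])
store.add("Man", [0.7, 0.3, 0.9])
store.add("Woman", [0.7, 0.2, 0.8])

# Assumed embedding for the query "Royal"; this vector is invented for the example.
royal_vector = [0.9, 0.15, 0.75]
results = store.search(royal_vector, top_k=2)
print(results)
```

Note that “Royal” itself is never stored. The store returns “King” and “Queen” purely because their vectors lie closest to the query vector, which is the key difference from an exact-match lookup.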

Key Differences Between Vector DB and Regular DB

Regular Database | Vector Database
Stores structured data in rows and columns. | Stores high-dimensional vectors representing complex data like documents, images, etc.
Designed for exact matching when retrieving data. | Designed for similarity-based queries. For example, querying with "Royal" could return vectors representing "King" and "Queen".

The following figure shows the whole process, from a document to storing its embeddings in a vector database.

Process from a Document to storing embeddings in Vector Database

Citation
I would like to acknowledge that I took help from ChatGPT to structure my blog and simplify content.
