This platform uses advanced semantic search, built with the Ollama LLM, LangChain, and LangGraph, to analyze and retrieve relevant data from a diverse content repository. After processing the input prompt, the system performs a semantic search with pgvector to fetch pertinent content, which serves as reference material for the language model and improves the accuracy and relevance of its output.
The platform efficiently handles various input formats such as images, YouTube links, and PDF files. For video content, metadata is fetched via the YouTube API and then summarized using Ollama LLM to provide concise and relevant overviews. Similarly, the system is equipped to summarize content from images, ensuring a comprehensive and versatile user experience across media types.
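For illustration, here is a minimal sketch of that video flow. It assumes a YOUTUBE_API_KEY environment variable, the YouTube Data API v3 videos endpoint, and a local Ollama chat model reachable through ChatOllama; the model name llama3 and the helper names are assumptions, not the project's exact code:

import { ChatOllama } from "@langchain/ollama";

// Fetch title and description for a video via the YouTube Data API v3 (hypothetical helper).
async function fetchVideoMetadata(videoId: string) {
  const url =
    `https://www.googleapis.com/youtube/v3/videos?part=snippet&id=${videoId}` +
    `&key=${process.env.YOUTUBE_API_KEY}`;
  const res = await fetch(url);
  const data = await res.json();
  const snippet = data.items?.[0]?.snippet;
  return { title: snippet?.title ?? "", description: snippet?.description ?? "" };
}

// Summarize the fetched metadata with a local Ollama chat model.
async function summarizeVideo(videoId: string): Promise<string> {
  const { title, description } = await fetchVideoMetadata(videoId);
  const llm = new ChatOllama({ model: "llama3", baseUrl: "http://localhost:11434" }); // model name is an assumption
  const response = await llm.invoke(
    `Summarize this YouTube video in a few sentences.\nTitle: ${title}\nDescription: ${description}`
  );
  return String(response.content);
}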
In addition to semantic processing, the platform includes image generation through the Stable Diffusion algorithm, enriching the interaction by providing visual content based on textual prompts. Generated images are stored securely in Pinata (IPFS), and Postgres with pgAI integration supports a robust chat history feature, enabling seamless continuity and easy referencing of prior conversations.
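As a rough sketch of the storage step, a generated image (for example a PNG buffer) can be pinned to IPFS through Pinata's pinFileToIPFS endpoint. This assumes a PINATA_JWT environment variable and Node 18+ (global fetch, FormData, and Blob); the function name is hypothetical:

// Pin a generated image buffer to IPFS via Pinata and return its content hash.
// Assumes process.env.PINATA_JWT holds a Pinata API JWT.
async function pinImageToIpfs(image: Buffer, fileName: string): Promise<string> {
  const form = new FormData();
  form.append("file", new Blob([image], { type: "image/png" }), fileName);

  const res = await fetch("https://api.pinata.cloud/pinning/pinFileToIPFS", {
    method: "POST",
    headers: { Authorization: `Bearer ${process.env.PINATA_JWT}` },
    body: form,
  });
  const data = await res.json();
  return data.IpfsHash; // CID of the pinned image
}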
Generate Embeddings:
import { OllamaEmbeddings } from "@langchain/ollama";

// Generate an embedding for a single query string using a local Ollama instance.
async function getOllamaEmbedding(text: string): Promise<number[]> {
  const embeddings = new OllamaEmbeddings({
    model: "mxbai-embed-large",
    baseUrl: "http://localhost:11434",
  });
  const vectors = await embeddings.embedQuery(text);
  console.log("Embedding from Ollama:", vectors);
  return vectors;
}
Vector Search:
import pg from "pg";

const { Pool } = pg;

// Shared connection pool for the pgvector-enabled PostgreSQL database.
export const pool = new Pool({
  user: process.env.PG_USER,
  host: process.env.PG_HOST,
  database: process.env.PG_DATABASE,
  password: process.env.PG_PASSWORD,
  port: Number(process.env.PG_PORT),
});

// Return the five stored chunks closest to the query embedding (L2 distance via pgvector's <-> operator).
export async function vectorSearch(message: string) {
  const embedding = await getOllamaEmbedding(message);
  const embeddingString = `[${embedding.join(",")}]`;
  const result = await pool.query(
    `
    SELECT id, email, content,
           (embedding <-> $1::vector) AS distance
    FROM fileTable
    ORDER BY distance
    LIMIT 5;
    `,
    [embeddingString]
  );
  return result.rows;
}
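The rows returned by vectorSearch then serve as reference context for the language model. A minimal sketch of that step, assuming ChatOllama from @langchain/ollama and an arbitrary model name (llama3); the helper name is hypothetical:

import { ChatOllama } from "@langchain/ollama";

// Answer a question using the nearest stored chunks as reference context.
export async function answerWithContext(message: string): Promise<string> {
  const matches = await vectorSearch(message);
  const context = matches.map((row) => row.content).join("\n---\n");

  const llm = new ChatOllama({ model: "llama3", baseUrl: "http://localhost:11434" }); // model name is an assumption
  const response = await llm.invoke(
    `Use the following reference content to answer the question.\n\n${context}\n\nQuestion: ${message}`
  );
  return String(response.content);
}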
Store file content in the vector store:
import { Document } from "@langchain/core/documents";

// Embed each chunk with Ollama and store (email, embedding, content) rows in fileTable.
export async function setEmbedding(combinedContent: Document[], email: string) {
  try {
    const embeddings = new OllamaEmbeddings({
      model: "mxbai-embed-large",
      baseUrl: "http://localhost:11434",
    });
    const vectors = await embeddings.embedDocuments(
      combinedContent.map((chunk) => chunk.pageContent.replace(/\n/g, ""))
    );

    const query = `
      INSERT INTO fileTable (email, embedding, content)
      VALUES ($1, $2::vector, $3)
    `;
    for (let i = 0; i < combinedContent.length; i++) {
      await pool.query(query, [
        email,
        `[${vectors[i].join(",")}]`,    // embedding for the current chunk, serialized for pgvector
        combinedContent[i].pageContent, // original chunk content
      ]);
    }
  } catch (err) {
    console.error("Failed to store content:", err);
  }
}
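Before setEmbedding runs, a file has to be turned into Document chunks. A hedged sketch of that step for PDFs, assuming LangChain's PDFLoader and RecursiveCharacterTextSplitter with arbitrary chunk sizes (the helper name is hypothetical):

import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { Document } from "@langchain/core/documents";

// Load a PDF and split it into overlapping chunks suitable for embedding.
async function pdfToChunks(filePath: string): Promise<Document[]> {
  const loader = new PDFLoader(filePath);
  const pages = await loader.load();

  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,   // arbitrary chunk size
    chunkOverlap: 100, // arbitrary overlap
  });
  return splitter.splitDocuments(pages);
}

// Usage: const chunks = await pdfToChunks("./report.pdf"); await setEmbedding(chunks, email);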
pgvector: pgvector stores the embeddings for content across various formats, including text and PDF files. It supports similarity-based searches directly within PostgreSQL, enabling efficient and accurate content retrieval.
Feature: Used in the vector search function to retrieve the closest matching content based on embeddings.
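For context, the fileTable used above needs the pgvector extension and a vector column. A sketch of a one-time setup, where the table and column names follow the queries above, the 1024 dimension matches mxbai-embed-large, and the HNSW index choice is an assumption:

// One-time schema setup for pgvector (run once, e.g. from a migration script).
async function initVectorStore() {
  await pool.query(`CREATE EXTENSION IF NOT EXISTS vector;`);
  await pool.query(`
    CREATE TABLE IF NOT EXISTS fileTable (
      id        SERIAL PRIMARY KEY,
      email     TEXT,
      content   TEXT,
      embedding vector(1024)  -- mxbai-embed-large produces 1024-dimensional embeddings
    );
  `);
  // Approximate-nearest-neighbour index for the <-> (L2 distance) operator used in vectorSearch.
  await pool.query(`
    CREATE INDEX IF NOT EXISTS filetable_embedding_idx
    ON fileTable USING hnsw (embedding vector_l2_ops);
  `);
}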
pgAI: pgAI extends PostgreSQL with AI-driven capabilities, letting SQL queries compute and work with similarity scores directly. It integrates within SQL queries to score and rank content by relevance, improving the quality of search results.
Feature: Applied in querying and sorting results based on similarity distances to optimize semantic search accuracy.
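As an illustration of doing this inside SQL, pgai ships Ollama helper functions such as ai.ollama_embed. The sketch below assumes that function and its (model, input, host) arguments, which may differ across pgai versions:

// Hedged sketch: let pgai compute the query embedding inside Postgres, then rank by distance.
// Assumes the pgai extension is installed and exposes ai.ollama_embed (signature may vary by version).
export async function vectorSearchWithPgai(message: string) {
  const result = await pool.query(
    `
    SELECT id, email, content,
           (embedding <-> ai.ollama_embed('mxbai-embed-large', $1, host => 'http://localhost:11434')) AS distance
    FROM fileTable
    ORDER BY distance
    LIMIT 5;
    `,
    [message]
  );
  return result.rows;
}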
Ollama: Ollama generates the text embeddings that capture content meaning and powers summarization of media files, covering both embedding generation for content similarity and text summarization for complex media formats.
Feature: Used in both the embedding model and language model functions to process and summarize text and video content effectively.
Final Thoughts
In summary, this platform seamlessly integrates advanced semantic search and multimedia processing capabilities, powered by tools like the Ollama LLM, LangChain, and LangGraph. With robust support from pgvector and pgAI for vector-based search and retrieval, it handles diverse media inputs, enhances content relevancy, and offers visually rich interactions through Stable Diffusion. The platform's comprehensive architecture ensures continuity, user engagement, and a smooth, versatile experience across formats.
Prize Categories:
Open-source Models from Ollama, Vectorizer Vibe, All the Extensions
Thank you for taking the time to explore my project!