Shreyojit/Multimodal-Rag-Weaviate-Ollama

This multimodal Retrieval-Augmented Generation (RAG) system combines several models with a vector database to answer user queries using retrieved context. LLaMA 3.3 70B, an instruction-tuned model, generates the responses. Weaviate, a vector database, stores and retrieves the multimodal content. Ollama embeddings are used to process several content types (text, images, audio, and tables), so the system can handle diverse data formats. Nomic-Embed-Text embeddings turn textual data into dense vector representations for similarity search. LLaVA-v1.6 is used to synthesize multimodal data, enriching responses with context drawn from different sources.
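The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repository's code: `toy_embed` is a hypothetical word-count stand-in for Nomic-Embed-Text, `InMemoryStore` is a hypothetical stand-in for a Weaviate collection, and `build_prompt` only assembles the text that would be handed to the generation model. All names here are invented for the example.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One indexed item; `modality` records the kind of source it came from."""
    text: str
    modality: str  # e.g. "text", "image", "audio", "table"


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: sparse word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for a vector database collection."""
    items: list = field(default_factory=list)

    def add(self, chunk: Chunk) -> None:
        # Store the vector next to the chunk it was computed from.
        self.items.append((toy_embed(chunk.text), chunk))

    def search(self, query: str, k: int = 3) -> list:
        # Rank every stored chunk by similarity to the query vector.
        q = toy_embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(question: str, hits: list) -> str:
    """Assemble retrieved chunks, tagged by modality, into a generation prompt."""
    context = "\n".join(f"[{c.modality}] {c.text}" for c in hits)
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    store = InMemoryStore()
    store.add(Chunk("Quarterly revenue table: Q1 10, Q2 12", "table"))
    store.add(Chunk("Caption: a photo of a red bicycle", "image"))
    store.add(Chunk("Transcript: the speaker discusses revenue growth", "audio"))
    print(build_prompt("What colour is the bicycle?", store.search("red bicycle photo", k=1)))
```

In a real deployment the two stand-ins would presumably be replaced by calls to the embedding model and to Weaviate, and the assembled prompt would be sent to the generation model. Non-text items are represented here by short text descriptions (a caption, a transcript) purely to keep the sketch self-contained.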

System Design of Multimodal RAG

[Figure: system design diagram]

Screenshots

[Figures: Screenshots 1 to 5, captured 2024-12-27]

About

The multimodal RAG system combines **LLaMA 3.3 70B** for response generation with **Weaviate** for vector-based content retrieval. It uses **Ollama** embeddings to process text, images, audio, and tables, and **Nomic-Embed-Text** for text search. **LLaVA-v1.6** synthesizes multimodal context for richer responses.
