This repository contains a Proof of Concept (PoC) for a Retrieval-Augmented Generation (RAG) application designed to answer user questions about their codebase by combining vector-based document retrieval with large language models (LLMs). It provides the backend and UI for users to ask questions and receive responses in real time.
The system consists of two main parts:
- Embedding vector creation and indexing
- Question-Answering (QA) system
- Vector Database: A Qdrant vector database that indexes the embeddings generated for each file in the code base.
- Embeddings Model: The pre-trained `BAAI/bge-large-en-v1.5` model, run locally via the Hugging Face `SentenceTransformers` library, generates embeddings for each file in the code base.
- Langchain: Langchain is used exclusively for embedding vector creation, indexing, and vector retrieval. It is not used in the QA system.
- LLM Model: The `OpenAI/gpt-4o` model generates responses to user questions.
- FastAPI: FastAPI provides a REST API for the QA system. It uses chunk streaming to deliver responses to the user in real time.
- React: React is used to build the web interface that allows users to interact with the system.
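At the core of the retrieval component is nearest-neighbor search over embedding vectors. The sketch below illustrates the idea with plain cosine similarity over a tiny in-memory index; the file names and 2-dimensional vectors are made up for illustration, and in the real system Qdrant performs this search over `bge-large-en-v1.5` embeddings.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, index, k=2):
    # index: list of (doc_id, vector) pairs; return the k closest doc ids.
    scored = sorted(index, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy index: each entry stands in for one embedded file.
index = [("readme.md", [1.0, 0.0]), ("qa.py", [0.8, 0.6]), ("ui.tsx", [0.0, 1.0])]
print(top_k([1.0, 0.1], index, k=2))  # → ['readme.md', 'qa.py']
```

A vector database like Qdrant does the same ranking, but with approximate-nearest-neighbor structures so it scales far beyond a linear scan.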
The embedding vector creation and indexing component is responsible for creating embeddings for each file in the code base and indexing them in a vector store. The vector store is used by the QA system to retrieve relevant documents based on the user's question.
```mermaid
---
title: Embedding vector creation and indexing
---
flowchart LR
    B[Load Files from Filesystem] --> C[Parse Files with LanguageParser]
    C --> D[Generate Embeddings using Embedding Model]
    D --> E[Add Embeddings to VectorStore]
```
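The flow above can be sketched in a few lines. This is an illustrative stand-in, not the repository's actual code: `chunk_text` is a naive fixed-size splitter where the real pipeline uses Langchain's `LanguageParser`, and `embed` is a toy character-histogram vector where the real pipeline calls `bge-large-en-v1.5`.

```python
from pathlib import Path

def chunk_text(text, size=400, overlap=50):
    # Naive fixed-size chunking with overlap; the real pipeline splits
    # files along syntactic boundaries via LanguageParser instead.
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def embed(text, dim=8):
    # Toy embedding (character histogram) standing in for the real model.
    vec = [0.0] * dim
    for ch in text:
        vec[ord(ch) % dim] += 1.0
    return vec

def index_repo(root, store):
    # store: list of (source_path, chunk, vector) records, mimicking
    # the payload + vector that Qdrant keeps per indexed chunk.
    for path in Path(root).rglob("*.py"):
        text = path.read_text(errors="ignore")
        for chunk in chunk_text(text):
            store.append((str(path), chunk, embed(chunk)))
```

The overlap between consecutive chunks is a common trick so that a statement split at a chunk boundary still appears whole in at least one chunk.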
The QA system is responsible for answering user questions based on their code base. It retrieves relevant documents from the vector store, generates a response using the LLM model, and streams the response back to the user in real time.
```mermaid
---
title: Question-Answering (QA) system
---
sequenceDiagram
    User ->> QA Class: Ask a question
    QA Class ->> Vector Store: Retrieve relevant documents
    activate Vector Store
    Vector Store -->> QA Class: Return relevant documents
    deactivate Vector Store
    QA Class ->> History: Add user message to the history
    QA Class ->> History: Add relevant documents to the history
    QA Class ->> LLM Model: Generate response using LLM Model
    activate LLM Model
    loop Response Stream
        LLM Model -->> QA Class: Stream response chunks
        QA Class -->> User: Provide response
    end
    deactivate LLM Model
    QA Class ->> History: Add response to the history
```
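The sequence above — record the question and retrieved documents in the history, stream the answer chunk by chunk, then record the full answer — can be mimicked with a plain Python generator. This is a hypothetical sketch: the hard-coded `answer` string stands in for the `gpt-4o` call, and the `history` shape is assumed, not taken from the repo.

```python
def stream_answer(question, docs, history):
    # Record the user question and the retrieved documents first,
    # mirroring the two "Add ... to the history" steps in the diagram.
    history.append({"role": "user", "content": question})
    history.append({"role": "system", "content": "\n".join(docs)})
    answer = f"Based on {len(docs)} documents: ..."  # stand-in for the LLM call
    buffer = []
    for token in answer.split():
        buffer.append(token)
        yield token + " "  # each yield is one chunk streamed to the client
    # Only after the stream completes is the full answer stored.
    history.append({"role": "assistant", "content": " ".join(buffer)})

history = []
chunks = list(stream_answer("What is a chain?", ["doc1", "doc2"], history))
print("".join(chunks))  # the full answer, reassembled from streamed chunks
```

FastAPI can serve such a generator directly through its streaming response support, which is how the backend delivers chunks to the React UI as they arrive.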
The repo comes with https://github.com/langchain-ai/langchain already parsed and indexed in the vector store. You can run the system using the following steps:
Create a `.env.docker` file in the `./backend` directory with the following environment variables:

```
OPENAI_API_KEY=xxxxxxxx
QDRANT_URL=http://talk2code-qdrant-1:6333 # URL of the Qdrant vector store for the docker container
```

Run the following command to start the system:

```shell
docker compose up --build -d
```

> **Note**
> When you first run the backend container, it downloads the embedding model. This process may take some time depending on your internet connection.
Open your browser and go to http://localhost:8080 to access the web interface.
Ask a question related to langchain and see the system in action!