Agentic RAG for PDF Processing

This project is an Agentic Retrieval-Augmented Generation (RAG) system that processes PDFs. It uses Gemini as the model, LangChain for agent-based orchestration, ChromaDB for vector storage, and Streamlit for the interactive web application. The project was built with GitHub Copilot assistance.

Features

Extracts relevant information from PDFs.
Uses LangChain to create an intelligent agent.
Stores and retrieves document embeddings with ChromaDB.
Employs Gemini for natural language understanding and response generation.
Provides an interactive web interface using Streamlit.
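
The store-and-retrieve idea behind these features can be sketched without any of the libraries above. The example below is a dependency-free illustration, not this project's implementation: `embed`, `TinyVectorStore`, and `build_prompt` are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, and the in-memory list stands in for ChromaDB. In the real pipeline, the prompt would be sent to Gemini.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory store; assumed stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (chunk_text, embedding) pairs

    def add(self, chunk):
        self._items.append((chunk, embed(chunk)))

    def query(self, question, k=2):
        """Return the k stored chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Combine retrieved chunks and the question into one prompt string."""
    context = "\n---\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


store = TinyVectorStore()
store.add("Invoices are due within 30 days.")
store.add("The warranty covers parts and labor for two years.")
store.add("Support is available on weekdays.")

top = store.query("How long is the warranty?", k=1)
prompt = build_prompt("How long is the warranty?", top)
```

Here `top` holds the warranty chunk, since it shares the most words with the question. A real embedding model matches on meaning rather than exact word overlap, which is the main reason to prefer it over this toy version.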

Tech Stack

Model: Gemini
Framework: LangChain
Database: ChromaDB
Web Application: Streamlit
Assistant: GitHub Copilot

Usage

Add PDFs to the data/ folder.
Run the script to process and query them.
Customize the agent’s behavior in agentic-rag.py.
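
As an illustration of the first step, a script could discover the PDFs in data/ along these lines. This is a minimal sketch: `find_pdfs` is a hypothetical helper, not a function from this repository, and it assumes PDFs sit directly in the folder rather than in subfolders.

```python
from pathlib import Path


def find_pdfs(data_dir="data"):
    """Return the PDF files directly under data_dir, sorted by path."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return []  # a missing folder is treated as "no documents yet"
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    for pdf in find_pdfs():
        print(pdf.name)
```

Matching on the lowercased suffix means files named `.PDF` are picked up as well.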

References

This project was inspired and guided by the following resource:

An Improved Langchain RAG Tutorial (v2) by pixegami: This tutorial provided valuable insights into implementing a Retrieval-Augmented Generation system using LangChain and local LLMs.

