Skip to content

Rnamrata/llm-chatbot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

15 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ€– RAG Chatbot System

A production-ready Retrieval-Augmented Generation (RAG) chatbot system that allows you to chat with your documents using local LLMs. Upload files, YouTube videos, or web pages, and have intelligent conversations with the content.

Python Flask LangChain


πŸ“‹ Table of Contents


✨ Features

Core Capabilities

  • πŸ“„ Multi-Source Document Upload - PDF, TXT, Markdown, YouTube, Web pages
  • 🧠 Intelligent Conversation - Context-aware with conversation history
  • πŸ” Semantic Search - Vector-based retrieval with source attribution
  • πŸš€ Automatic Pipeline - One-call upload to storage
  • πŸ’Ύ Session Management - Multiple concurrent conversations

Technical Features

  • πŸ—οΈ Modular Architecture - Clean separation of concerns
  • πŸ”Œ RESTful API - Easy frontend integration
  • 🎯 Local LLM - Privacy-focused with Ollama
  • πŸ“Š Monitoring - Built-in stats and health checks
  • πŸ›‘οΈ Error Handling - Comprehensive error management

πŸ›οΈ Architecture

Client β†’ Flask API β†’ FileManager/ChatSession β†’ DocumentProcessor/LLMManager 
                                            β†’ VectorStore β†’ Ollama

Data Flow:

  1. Upload: Document β†’ Extract β†’ Chunk β†’ Embed β†’ Store
  2. Chat: Question β†’ Retrieve Context β†’ LLM β†’ Answer with Sources

πŸ› οΈ Tech Stack

Component Technology
Backend Flask 3.0+
LLM Ollama (llama3.2)
Embeddings nomic-embed-text
Vector DB ChromaDB
Framework LangChain
PDF Processing PyPDF2
Audio OpenAI Whisper
Video yt-dlp

πŸ“¦ Installation

Prerequisites

  • Python 3.8+
  • Ollama
  • FFmpeg (for YouTube support)

Quick Install

# 1. Install Ollama
# macOS: brew install ollama
# Linux: curl -fsSL https://ollama.ai/install.sh | sh
# Windows: Download from https://ollama.com/download

# 2. Start Ollama and pull models
ollama serve
ollama pull llama3.2
ollama pull nomic-embed-text

# 3. Install FFmpeg
# macOS: brew install ffmpeg
# Linux: sudo apt-get install ffmpeg

# 4. Clone and setup project
git clone https://github.com/Rnamrata/llm-chatbot.git
cd llm-chatbot
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# 5. Create directories
mkdir -p uploads/media chroma_db

requirements.txt

flask==3.0.0
flask-cors==4.0.0
langchain==0.1.0
langchain-community==0.0.10
chromadb==0.4.22
openai-whisper==20231117
yt-dlp==2023.12.30
PyPDF2==3.0.1
requests==2.31.0
numpy==1.24.3

πŸš€ Quick Start

# Terminal 1: Start Ollama
ollama serve

# Terminal 2: Start Flask Server
cd llm-chatbot
source .venv/bin/activate
python main.py

# Terminal 3: Test
python test_system.py

Server will be available at: http://localhost:5001


πŸ“‘ API Reference

Base URL: http://localhost:5001

Upload Endpoints

Endpoint Method Body Description
/upload/file POST file: <file> Upload PDF/TXT/MD
/upload/youtube POST {"url": "..."} Transcribe YouTube video
/upload/web POST {"url": "..."} Scrape web page

Chat Endpoints

Endpoint Method Body Description
/chat/new POST - Create new session
/chat POST {"question": "...", "session_id": "..."} Send message
/chat/history/{id} GET - Get conversation history
/chat/session/{id} GET - Get session info
/chat/clear/{id} DELETE - Clear session
/chat/sessions GET - List all sessions
/chat/cleanup POST {"inactive_hours": 24} Remove old sessions

Utility Endpoints

Endpoint Method Description
/health GET Health check
/stats GET Database statistics

Example Responses

Upload Success:

{
  "success": true,
  "chunks_created": 45,
  "filename": "document.pdf"
}

Chat Response:

{
  "success": true,
  "answer": "Machine learning is...",
  "sources": [...],
  "num_sources": 5,
  "session_id": "abc-123"
}

πŸ“ Project Structure

llm-chatbot/
β”œβ”€β”€ main.py                     # Flask app entry point
β”œβ”€β”€ requirements.txt            # Dependencies
β”œβ”€β”€ README.md                   # Documentation
β”‚
β”œβ”€β”€ src/modules/
β”‚   β”œβ”€β”€ file_manager.py         # Upload & processing
β”‚   β”œβ”€β”€ document_processor.py   # Text extraction & chunking
β”‚   β”œβ”€β”€ vector_store_and_embedding.py  # Vector operations
β”‚   β”œβ”€β”€ llm_manager.py          # LLM management
β”‚   └── chat_session.py         # Session management
β”‚
β”œβ”€β”€ test/
β”‚   └── test_system.py          # Test suite
β”‚
β”œβ”€β”€ uploads/                    # Uploaded files
β”‚   └── media/                  # YouTube audio
β”‚
└── chroma_db/                  # Vector database

πŸ’‘ Usage Examples

Example 1: Upload and Chat

import requests

BASE_URL = 'http://localhost:5001'

# Upload PDF
with open('document.pdf', 'rb') as f:
    requests.post(f'{BASE_URL}/upload/file', files={'file': f})

# Create session
response = requests.post(f'{BASE_URL}/chat/new')
session_id = response.json()['session_id']

# Ask question
response = requests.post(f'{BASE_URL}/chat', json={
    'question': 'What is this document about?',
    'session_id': session_id
})
print(response.json()['answer'])

Example 2: YouTube Chat

# Upload YouTube video
requests.post(f'{BASE_URL}/upload/youtube', 
    json={'url': 'https://www.youtube.com/watch?v=...'})

# Chat with transcription
response = requests.post(f'{BASE_URL}/chat',
    json={'question': 'Summarize the video'})

Example 3: cURL Commands

# Upload web page
curl -X POST http://localhost:5001/upload/web \
  -H "Content-Type: application/json" \
  -d '{"url": "https://en.wikipedia.org/wiki/Python_(programming_language)"}'

# Chat
curl -X POST http://localhost:5001/chat \
  -H "Content-Type: application/json" \
  -d '{"question": "What is Python?"}'

# Get stats
curl http://localhost:5001/stats

πŸ§ͺ Testing

Automated Tests

python test_system.py

Manual Testing

Use Postman, cURL, or Python requests to test endpoints. See Usage Examples.


πŸ› Troubleshooting

Issue Solution
"Ollama call failed 404" Run ollama pull llama3.2 and ollama pull nomic-embed-text
"Connection refused" Ensure both Ollama (ollama serve) and Flask (python main.py) are running
"415 Unsupported Media Type" Add header: -H "Content-Type: application/json"
"FFmpeg not found" Install FFmpeg: brew install ffmpeg (macOS) or apt-get install ffmpeg (Linux)
Slow responses Use smaller model (phi3), reduce chunk size, or decrease k parameter
Out of memory Clean up sessions: POST to /chat/cleanup

⚑ Performance Tips

  1. Choose Right Model

    • Fast: phi3
    • Balanced: llama3.2:1b
    • Best: llama3.2 (default)
  2. Optimize Chunk Size

   # In document_processor.py
   chunk_size=500,  # Smaller = faster
   chunk_overlap=50
  1. Reduce Retrieved Chunks
   {"question": "...", "k": 3}  # Default is 5
  1. Regular Cleanup
   # Clean up old sessions daily
   curl -X POST http://localhost:5001/chat/cleanup \
     -H "Content-Type: application/json" \
     -d '{"inactive_hours": 24}'

🚧 Future Enhancements

Planned Features:

  • User authentication & authorization
  • Database persistence (PostgreSQL)
  • Frontend UI (React/Vue)
  • More file formats (DOCX, PPTX, CSV)
  • Streaming responses
  • Document update/deletion
  • Export conversations
  • Docker containerization
  • Analytics dashboard

🀝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/AmazingFeature)
  3. Commit changes (git commit -m 'Add feature')
  4. Push to branch (git push origin feature/AmazingFeature)
  5. Open Pull Request

Code Style:

  • Follow PEP 8
  • Add docstrings
  • Include type hints
  • Write tests

πŸ‘¨β€πŸ’» Author

Namrata Roy


πŸ™ Acknowledgments


Built using Python, LangChain, and Ollama

About

A privacy-focused RAG chatbot for conversing with documents (PDF, YouTube, Web) using local LLMs (Ollama, LangChain, and ChromaDB).

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages