Mychal003/rag-documentation-assistant

RAG Documentation Assistant

Intelligent document assistant powered by Retrieval-Augmented Generation (RAG). Upload PDF documents and query them using natural language — the system retrieves relevant passages via vector similarity search and generates precise answers using a large language model.

Features

  • PDF Document Processing — upload PDF files, extract text, and build a searchable vector index
  • Natural Language Querying — ask questions in plain language and receive contextual answers
  • Automatic Query Classification — questions are classified as factual, procedural, or troubleshooting, each with an optimized prompt template
  • Conversation History — multi-turn conversations with context carried across follow-up questions
  • Source Attribution — every answer includes the specific document chunks used as evidence
  • JWT Authentication — secure user registration, login, and session management
  • Rate Limiting — built-in brute-force protection on authentication endpoints
  • Structured JSON Logging — request/response logging with timestamps, duration, and status codes via RequestLogger middleware
  • Interactive API Documentation — Swagger UI available at /api/docs
  • Docker Production-Ready — multi-stage Docker build (~620 MB image), docker-compose with PostgreSQL, Nginx reverse proxy, health checks, and automatic database migrations on startup
  • Non-root Container — Docker runs as unprivileged appuser for security
  • Dual Run Mode — run locally with python app.py (development) or via Docker Compose (production) with no code changes
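The structured JSON logging feature can be illustrated with a stdlib-only sketch. The project's actual `RequestLogger` middleware lives in `backend/src/logging_config.py` and may use different field names; `JsonFormatter` below is a hypothetical stand-in showing the general technique:

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime(record.created)),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Request metadata can be attached via logging's `extra=` kwarg;
        # these field names are illustrative, not the app's actual schema.
        for key in ("path", "status", "duration_ms"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("request")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("request completed", extra={"path": "/health", "status": 200, "duration_ms": 3})
```

Because each record is one JSON object, the output can be ingested directly by log aggregators without custom parsing.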

Architecture

Architecture Diagram

RAG Pipeline Flow

  1. Document Upload → PDF text extraction (PyMuPDF) → chunking (RecursiveCharacterTextSplitter, 800 tokens, 100 overlap) → embedding (OpenAI text-embedding-3-large) → FAISS vector index
  2. Query → query classification (GPT-4o zero-shot) → vector similarity search (top-10 chunks) → category-specific prompt construction (with conversation history) → answer generation (GPT-4o) → source attribution (top-3 chunks with L2 distance)
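The chunking and retrieval steps above can be sketched without the real dependencies. The project uses LangChain's RecursiveCharacterTextSplitter (token-based), OpenAI embeddings, and a FAISS index; the sketch below substitutes character-based chunks, toy 2-D vectors, and brute-force L2 search to show the mechanics (all names and data here are illustrative):

```python
import math


def chunk_text(text, size=800, overlap=100):
    """Sliding-window chunking: each chunk shares `overlap` characters
    with its predecessor (the real splitter works on token counts)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(index, query_vec, k=10):
    """Brute-force nearest neighbours by L2 distance; FAISS's
    IndexFlatL2 performs this same exact search, only much faster."""
    scored = sorted(
        ((l2_distance(vec, query_vec), chunk) for chunk, vec in index),
        key=lambda pair: pair[0],
    )
    return scored[:k]


# Toy 2-D embeddings standing in for text-embedding-3-large vectors.
index = [
    ("chunk about pumps", [1.0, 0.0]),
    ("chunk about valves", [0.0, 1.0]),
]
top = search(index, [0.9, 0.1], k=1)  # lowest L2 distance first
```

The returned `(distance, chunk)` pairs mirror how the pipeline attributes sources: the top-3 chunks with their L2 distances accompany each generated answer.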

Tech Stack

| Layer | Technology |
| --- | --- |
| Backend | Python 3.12, Flask 2.3, Gunicorn |
| LLM | OpenAI GPT-4o (generation + classification) |
| Embeddings | OpenAI text-embedding-3-large |
| Vector Store | FAISS (CPU) 1.12 |
| Database | PostgreSQL 15 (production/Docker), SQLite (development) |
| Migrations | Alembic 1.13 |
| Authentication | JWT (PyJWT) |
| PDF Processing | PyMuPDF |
| Frontend | Vanilla JavaScript, HTML5, CSS3 |
| Reverse Proxy | Nginx Alpine |
| Containerization | Docker (multi-stage build), Docker Compose |
| Logging | Structured JSON (RequestLogger middleware) |
| API Docs | OpenAPI 3.0, Swagger UI |

Getting Started

Prerequisites

  • Python 3.10+
  • PostgreSQL 14+ (or SQLite for development)
  • OpenAI API key (platform.openai.com)
  • Docker and Docker Compose (optional, for containerized deployment)

Local Development

  1. Clone the repository

    git clone https://github.com/Mychal003/rag-documentation-assistant.git
    cd rag-documentation-assistant
  2. Create and activate a virtual environment

    python -m venv venv
    source venv/bin/activate     # Linux/macOS
    venv\Scripts\activate        # Windows
  3. Install dependencies

    pip install -r requirements.txt
    pip install -r requirements-dev.txt   # for testing
  4. Configure environment variables

    cp .env.example backend/.env

    Edit backend/.env and set at minimum:

    OPENAI_API_KEY=sk-your-key-here
    SECRET_KEY=<generate with: python -c "import secrets; print(secrets.token_hex(32))">
    JWT_SECRET_KEY=<generate with: python -c "import secrets; print(secrets.token_hex(32))">
    DATABASE_URL=postgresql://postgres:password@localhost:5432/ragdb
  5. Initialize the database

    cd backend
    alembic upgrade head
  6. Start the development server

    python app.py

    The application is available at http://localhost:5000.
    API documentation is available at http://localhost:5000/api/docs.

Docker Deployment

The application ships with a complete Docker setup: multi-stage Dockerfile (~620 MB final image), Docker Compose with three services (backend, PostgreSQL, Nginx), automatic Alembic migrations on startup, and health checks.

  1. Configure production environment

    # Create root .env for Docker Compose variable interpolation
    cp .env.example .env

    Edit .env and set:

    POSTGRES_USER=raguser
    POSTGRES_PASSWORD=<generate-strong-password>
    POSTGRES_DB=ragdb

    Then configure the backend secrets:

    cp .env.example backend/.env.production

    Edit backend/.env.production and set:

    FLASK_ENV=production
    SECRET_KEY=<python -c "import secrets; print(secrets.token_hex(32))">
    JWT_SECRET_KEY=<python -c "import secrets; print(secrets.token_hex(32))">
    OPENAI_API_KEY=sk-your-key-here
    POSTGRES_USER=raguser
    POSTGRES_PASSWORD=<same-password-as-above>
    POSTGRES_DB=ragdb
    DATABASE_URL=postgresql://raguser:<password>@db:5432/ragdb?client_encoding=utf8
    CORS_ORIGINS=http://localhost,http://localhost:80
  2. Build and start all services

    docker-compose up -d --build

    This starts three containers:

    • rag_backend — Flask + Gunicorn (4 workers, 2 threads)
    • rag_db — PostgreSQL 15 Alpine
    • rag_nginx — Nginx reverse proxy

    On first startup, docker-entrypoint.sh automatically runs alembic upgrade head to apply database migrations before starting Gunicorn.

  3. Verify the deployment

    # Check container status (all should be "Up" and "healthy")
    docker-compose ps
    
    # Test health endpoint
    curl http://localhost/health
    
    # Open in browser
    # http://localhost         — Frontend
    # http://localhost/api/docs — Swagger UI
  4. Useful commands

    docker-compose logs -f backend     # Stream backend logs
    docker-compose down                 # Stop all services
    docker-compose down -v              # Stop and remove volumes (resets data)
    docker-compose build --no-cache backend  # Rebuild without cache

The application is exposed on port 80 (HTTP) through Nginx.

API Documentation

Interactive Swagger UI is available at /api/docs when the application is running.

The OpenAPI 3.0 specification file is located at backend/static/swagger.json.

Endpoints Overview

| Method | Endpoint | Description | Auth |
| --- | --- | --- | --- |
| POST | /api/auth/register | Register a new user | No |
| POST | /api/auth/login | Authenticate and obtain JWT token | No |
| GET | /api/auth/me | Get current user profile | Yes |
| GET | /api/conversations | List user conversations | Yes |
| POST | /api/conversations | Create a new conversation | Yes |
| GET | /api/conversations/:id | Get conversation with messages | Yes |
| DELETE | /api/conversations/:id | Delete conversation and associated data | Yes |
| POST | /api/conversations/:id/upload | Upload PDF document | Yes |
| POST | /api/conversations/:id/query | Ask a question about the document | Yes |
| GET | /health | Health check | No |

Authentication

All protected endpoints require a JWT token in the Authorization header:

Authorization: Bearer <token>

Tokens are returned by the register and login endpoints and expire after 24 hours.
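Under the hood, verifying such a token amounts to an HMAC-SHA256 signature check plus a comparison of the `exp` claim against the current time. The app uses PyJWT for this; the stdlib-only sketch below shows what that verification does (`make_jwt`/`verify_jwt` are illustrative helpers, not the app's API):

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part):
    # JWT segments use unpadded base64url; restore padding first.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def make_jwt(payload, secret):
    """Build an HS256 token: base64url(header).base64url(payload).signature."""
    def enc(obj):
        raw = json.dumps(obj, separators=(",", ":")).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    signing_input = f"{enc({'alg': 'HS256', 'typ': 'JWT'})}.{enc(payload)}"
    sig = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{base64.urlsafe_b64encode(sig).rstrip(b'=').decode()}"


def verify_jwt(token, secret):
    """Check the signature and the `exp` claim; return the payload dict."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    expected = hmac.new(
        secret.encode(), f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload.get("exp", 0) < time.time():
        raise ValueError("token expired")
    return payload
```

A 24-hour expiry corresponds to setting `exp` to the issue time plus 86400 seconds when the token is created.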

Example: Complete Workflow

# 1. Register
curl -X POST http://localhost:5000/api/auth/register \
  -H "Content-Type: application/json" \
  -d '{"username": "demo", "email": "[email protected]", "password": "SecurePass1"}'

# 2. Login (save the token)
TOKEN=$(curl -s -X POST http://localhost:5000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "demo", "password": "SecurePass1"}' | jq -r '.token')

# 3. Create a conversation
CONV_ID=$(curl -s -X POST http://localhost:5000/api/conversations \
  -H "Authorization: Bearer $TOKEN" | jq -r '.conversation.id')

# 4. Upload a PDF document
curl -X POST http://localhost:5000/api/conversations/$CONV_ID/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "[email protected]"

# 5. Ask a question
curl -X POST http://localhost:5000/api/conversations/$CONV_ID/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the maximum operating temperature?"}'

Project Structure

├── backend/
│   ├── app.py                  # Flask application and route definitions
│   ├── auth.py                 # JWT authentication module
│   ├── config.py               # Application configuration
│   ├── models.py               # SQLAlchemy database models
│   ├── alembic.ini             # Alembic configuration
│   ├── migrations/             # Database migration scripts
│   ├── static/
│   │   └── swagger.json        # OpenAPI 3.0 specification
│   └── src/
│       ├── rag_pipeline.py     # RAG pipeline (chunking, retrieval, generation)
│       ├── query_classifier.py # LLM-based query classification
│       ├── conversation_manager.py  # Conversation CRUD operations
│       ├── llm_singleton.py    # Thread-safe LLM/Embeddings singleton
│       ├── pdf_processor.py    # PDF text extraction
│       └── logging_config.py   # Structured logging configuration
├── frontend/
│   ├── index.html              # Single-page application
│   └── static/
│       ├── css/style.css
│       └── js/
│           ├── app.js          # Main application logic
│           └── auth.js         # Authentication UI logic
├── tests/
│   ├── conftest.py             # Test fixtures and configuration
│   ├── test_auth.py            # Authentication endpoint tests
│   ├── test_conversations.py   # Conversation endpoint tests
│   └── test_rag_components.py  # RAG component unit tests
├── nginx/
│   └── nginx.conf              # Nginx reverse proxy configuration
├── docs/
│   └── EVALUATION.md           # RAG evaluation system documentation
├── docker-compose.yml          # Multi-service Docker orchestration
├── Dockerfile                  # Multi-stage Docker build (Python 3.12-slim)
├── docker-entrypoint.sh        # Container startup script (migrations + Gunicorn)
├── .dockerignore               # Files excluded from Docker build context
├── requirements.txt            # Production dependencies
├── requirements-dev.txt        # Development/testing dependencies
├── .env.example                # Environment variable template
└── .env                        # Docker Compose variable interpolation (not committed)

Configuration

All configuration is managed through environment variables defined in backend/.env. See .env.example for the full list of available options.

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPENAI_API_KEY | Yes | (none) | OpenAI API key |
| SECRET_KEY | Production | dev-secret-* | Flask secret key for session signing |
| JWT_SECRET_KEY | Production | jwt-dev-* | Secret key for JWT token signing |
| DATABASE_URL | No | sqlite:///database.db | Database connection string |
| FLASK_ENV | No | development | Environment mode (development / production) |
| CORS_ORIGINS | No | localhost:5000 | Comma-separated list of allowed CORS origins |
| LOG_LEVEL | No | INFO | Logging verbosity |
| PORT | No | 5000 | Application port |
| POSTGRES_USER | Docker | raguser | PostgreSQL username (Docker Compose) |
| POSTGRES_PASSWORD | Docker | (none) | PostgreSQL password (Docker Compose) |
| POSTGRES_DB | Docker | ragdb | PostgreSQL database name (Docker Compose) |
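A minimal sketch of how these variables might be read with the documented defaults. The project's actual `config.py` may be structured differently; `load_config` here is illustrative:

```python
import os


def load_config(env=os.environ):
    """Read settings with the defaults documented above, failing fast
    when the one always-required secret is missing."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return {
        "OPENAI_API_KEY": api_key,
        "DATABASE_URL": env.get("DATABASE_URL", "sqlite:///database.db"),
        "FLASK_ENV": env.get("FLASK_ENV", "development"),
        "LOG_LEVEL": env.get("LOG_LEVEL", "INFO"),
        "PORT": int(env.get("PORT", "5000")),
        # CORS_ORIGINS is comma-separated in the environment.
        "CORS_ORIGINS": env.get("CORS_ORIGINS", "localhost:5000").split(","),
    }
```

Passing the environment as a mapping (rather than reading `os.environ` inline) keeps the loader easy to unit-test.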

Testing

The project includes 38 tests covering authentication, conversations, and RAG components.

# Run all tests
cd backend
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ --cov=. --cov-report=term-missing

# Run specific test module
python -m pytest tests/test_auth.py -v

Database Migrations

Database schema is managed with Alembic.

cd backend

# Apply all migrations
alembic upgrade head

# Create a new migration after model changes
alembic revision --autogenerate -m "description of changes"

# View current migration state
alembic current

# Rollback one migration
alembic downgrade -1

License

This project was developed as an engineering thesis. All rights reserved.
