Intelligent document assistant powered by Retrieval-Augmented Generation (RAG). Upload PDF documents and query them using natural language — the system retrieves relevant passages via vector similarity search and generates precise answers using a large language model.
- Features
- Architecture
- Tech Stack
- Getting Started
- API Documentation
- Project Structure
- Configuration
- Testing
- Database Migrations
- License
## Features

- PDF Document Processing — upload PDF files, extract text, and build a searchable vector index
- Natural Language Querying — ask questions in plain language and receive contextual answers
- Automatic Query Classification — questions are classified as factual, procedural, or troubleshooting, each with an optimized prompt template
- Conversation History — multi-turn conversations with context carried across follow-up questions
- Source Attribution — every answer includes the specific document chunks used as evidence
- JWT Authentication — secure user registration, login, and session management
- Rate Limiting — built-in brute-force protection on authentication endpoints
- Structured JSON Logging — request/response logging with timestamps, duration, and status codes via `RequestLogger` middleware
- Interactive API Documentation — Swagger UI available at `/api/docs`
- Docker Production-Ready — multi-stage Docker build (~620 MB image), `docker-compose` with PostgreSQL, Nginx reverse proxy, health checks, and automatic database migrations on startup
- Non-root Container — Docker runs as unprivileged `appuser` for security
- Dual Run Mode — run locally with `python app.py` (development) or via Docker Compose (production) with no code changes
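The structured JSON logging described above can be sketched as a small WSGI middleware. The `RequestLogger` name comes from this project, but the implementation below is an illustrative stdlib-only assumption, not the project's actual code:

```python
import json
import logging
import time

logger = logging.getLogger("request")
logging.basicConfig(level=logging.INFO, format="%(message)s")


class RequestLogger:
    """Emit one JSON log line per request: method, path, status, duration."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        status_holder = {}

        def capturing_start_response(status, headers, exc_info=None):
            # Capture the status line so it can be logged after the response.
            status_holder["status"] = status
            return start_response(status, headers, exc_info)

        response = self.app(environ, capturing_start_response)
        logger.info(json.dumps({
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "method": environ.get("REQUEST_METHOD"),
            "path": environ.get("PATH_INFO"),
            "status": status_holder.get("status"),
            "duration_ms": round((time.perf_counter() - start) * 1000, 2),
        }))
        return response


def health_app(environ, start_response):
    # Minimal stand-in app so the middleware can be exercised directly.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


app = RequestLogger(health_app)
```

Because it follows the plain WSGI protocol, the same wrapper pattern composes with a Flask application object as well.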
## Architecture

- Document Upload → PDF text extraction (PyMuPDF) → chunking (RecursiveCharacterTextSplitter, 800 tokens, 100 overlap) → embedding (OpenAI `text-embedding-3-large`) → FAISS vector index
- Query → query classification (GPT-4o zero-shot) → vector similarity search (top-10 chunks) → category-specific prompt construction (with conversation history) → answer generation (GPT-4o) → source attribution (top-3 chunks with L2 distance)
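The retrieval step of the query pipeline can be sketched in a few lines. In the real pipeline each chunk is embedded with OpenAI `text-embedding-3-large` and indexed in FAISS; the random vectors below are stand-ins used purely to illustrate the L2 nearest-neighbour search and the (index, distance) pairs that feed source attribution:

```python
import math
import random

random.seed(0)
DIM = 32  # real embeddings are 3072-dimensional

chunks = [
    "The device operates between -10 and 45 degrees Celsius.",
    "Press and hold the reset button for five seconds.",
    "Error E42 indicates a blocked air filter.",
]
# Stand-in embeddings; in production these come from the OpenAI API.
index = [[random.gauss(0, 1) for _ in range(DIM)] for _ in chunks]


def l2(a, b):
    """Euclidean distance, the metric FAISS IndexFlatL2 computes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query_vec, k=3):
    """Return the k nearest chunks as (chunk index, L2 distance) pairs."""
    order = sorted(range(len(index)), key=lambda i: l2(query_vec, index[i]))
    return [(i, l2(query_vec, index[i])) for i in order[:k]]


query_vec = [random.gauss(0, 1) for _ in range(DIM)]
top = search(query_vec, k=3)  # these pairs drive source attribution in answers
```

The real pipeline retrieves the top-10 chunks for prompt construction and reports the top-3 with their distances as sources.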
## Tech Stack

| Layer | Technology |
|---|---|
| Backend | Python 3.12, Flask 2.3, Gunicorn |
| LLM | OpenAI GPT-4o (generation + classification) |
| Embeddings | OpenAI text-embedding-3-large |
| Vector Store | FAISS (CPU) 1.12 |
| Database | PostgreSQL 15 (production/Docker), SQLite (development) |
| Migrations | Alembic 1.13 |
| Authentication | JWT (PyJWT) |
| PDF Processing | PyMuPDF |
| Frontend | Vanilla JavaScript, HTML5, CSS3 |
| Reverse Proxy | Nginx Alpine |
| Containerization | Docker (multi-stage build), Docker Compose |
| Logging | Structured JSON (RequestLogger middleware) |
| API Docs | OpenAPI 3.0, Swagger UI |
## Getting Started

### Prerequisites

- Python 3.10+
- PostgreSQL 14+ (or SQLite for development)
- OpenAI API key (platform.openai.com)
- Docker and Docker Compose (optional, for containerized deployment)
### Installation

1. Clone the repository

   ```bash
   git clone https://github.com/Mychal003/rag-documentation-assistant.git
   cd rag-documentation-assistant
   ```

2. Create and activate a virtual environment

   ```bash
   python -m venv venv
   source venv/bin/activate   # Linux/macOS
   venv\Scripts\activate      # Windows
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   pip install -r requirements-dev.txt  # for testing
   ```

4. Configure environment variables

   ```bash
   cp .env.example backend/.env
   ```

   Edit `backend/.env` and set at minimum:

   ```bash
   OPENAI_API_KEY=sk-your-key-here
   SECRET_KEY=<generate with: python -c "import secrets; print(secrets.token_hex(32))">
   JWT_SECRET_KEY=<generate with: python -c "import secrets; print(secrets.token_hex(32))">
   DATABASE_URL=postgresql://postgres:password@localhost:5432/ragdb
   ```

5. Initialize the database

   ```bash
   cd backend
   alembic upgrade head
   ```

6. Start the development server

   ```bash
   python app.py
   ```

The application is available at http://localhost:5000.
API documentation is available at http://localhost:5000/api/docs.
### Docker Deployment

The application ships with a complete Docker setup: a multi-stage Dockerfile (~620 MB final image), Docker Compose with three services (backend, PostgreSQL, Nginx), automatic Alembic migrations on startup, and health checks.
1. Configure the production environment

   ```bash
   # Create root .env for Docker Compose variable interpolation
   cp .env.example .env
   ```

   Edit `.env` and set:

   ```bash
   POSTGRES_USER=raguser
   POSTGRES_PASSWORD=<generate-strong-password>
   POSTGRES_DB=ragdb
   ```

   Then configure the backend secrets:

   ```bash
   cp .env.example backend/.env.production
   ```

   Edit `backend/.env.production` and set:

   ```bash
   FLASK_ENV=production
   SECRET_KEY=<generate with: python -c "import secrets; print(secrets.token_hex(32))">
   JWT_SECRET_KEY=<generate with: python -c "import secrets; print(secrets.token_hex(32))">
   OPENAI_API_KEY=sk-your-key-here
   POSTGRES_USER=raguser
   POSTGRES_PASSWORD=<same-password-as-above>
   POSTGRES_DB=ragdb
   DATABASE_URL=postgresql://raguser:<password>@db:5432/ragdb?client_encoding=utf8
   CORS_ORIGINS=http://localhost,http://localhost:80
   ```

2. Build and start all services

   ```bash
   docker-compose up -d --build
   ```

   This starts three containers:

   - `rag_backend` — Flask + Gunicorn (4 workers, 2 threads)
   - `rag_db` — PostgreSQL 15 Alpine
   - `rag_nginx` — Nginx reverse proxy

   On first startup, `docker-entrypoint.sh` automatically runs `alembic upgrade head` to apply database migrations before starting Gunicorn.

3. Verify the deployment

   ```bash
   # Check container status (all should be "Up" and "healthy")
   docker-compose ps

   # Test health endpoint
   curl http://localhost/health

   # Open in browser
   # http://localhost — Frontend
   # http://localhost/api/docs — Swagger UI
   ```

4. Useful commands

   ```bash
   docker-compose logs -f backend           # Stream backend logs
   docker-compose down                      # Stop all services
   docker-compose down -v                   # Stop and remove volumes (resets data)
   docker-compose build --no-cache backend  # Rebuild without cache
   ```
The application is exposed on port 80 (HTTP) through Nginx.
## API Documentation

Interactive Swagger UI is available at `/api/docs` when the application is running. The OpenAPI 3.0 specification file is located at `backend/static/swagger.json`.
| Method | Endpoint | Description | Auth |
|---|---|---|---|
| POST | `/api/auth/register` | Register a new user | No |
| POST | `/api/auth/login` | Authenticate and obtain JWT token | No |
| GET | `/api/auth/me` | Get current user profile | Yes |
| GET | `/api/conversations` | List user conversations | Yes |
| POST | `/api/conversations` | Create a new conversation | Yes |
| GET | `/api/conversations/:id` | Get conversation with messages | Yes |
| DELETE | `/api/conversations/:id` | Delete conversation and associated data | Yes |
| POST | `/api/conversations/:id/upload` | Upload PDF document | Yes |
| POST | `/api/conversations/:id/query` | Ask a question about the document | Yes |
| GET | `/health` | Health check | No |
All protected endpoints require a JWT token in the `Authorization` header:

```
Authorization: Bearer <token>
```

Tokens are returned by the register and login endpoints and expire after 24 hours.
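The project signs its tokens with PyJWT; the hand-rolled stdlib sketch below only illustrates the HS256 `header.payload.signature` structure behind a Bearer token, including the 24-hour `exp` claim. The secret and claims shown are placeholders, not the application's actual values:

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign_token(payload: dict, secret: str) -> str:
    """Build an HS256 JWT by hand: header.payload.signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    signing_input = f"{header}.{body}".encode()
    sig = b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"


# A token carrying a 24-hour expiry claim, mirroring what login returns.
token = sign_token({"sub": "demo", "exp": int(time.time()) + 24 * 3600}, "jwt-secret")
```

On each protected request the server recomputes the signature with `JWT_SECRET_KEY` and rejects the token if the signature differs or `exp` has passed.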
```bash
# 1. Register
curl -X POST http://localhost:5000/api/auth/register \
  -H "Content-Type: application/json" \
  -d '{"username": "demo", "email": "[email protected]", "password": "SecurePass1"}'

# 2. Login (save the token)
TOKEN=$(curl -s -X POST http://localhost:5000/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "demo", "password": "SecurePass1"}' | jq -r '.token')

# 3. Create a conversation
CONV_ID=$(curl -s -X POST http://localhost:5000/api/conversations \
  -H "Authorization: Bearer $TOKEN" | jq -r '.conversation.id')

# 4. Upload a PDF document
curl -X POST http://localhost:5000/api/conversations/$CONV_ID/upload \
  -H "Authorization: Bearer $TOKEN" \
  -F "[email protected]"

# 5. Ask a question
curl -X POST http://localhost:5000/api/conversations/$CONV_ID/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the maximum operating temperature?"}'
```

## Project Structure

```
├── backend/
│   ├── app.py                      # Flask application and route definitions
│   ├── auth.py                     # JWT authentication module
│   ├── config.py                   # Application configuration
│   ├── models.py                   # SQLAlchemy database models
│   ├── alembic.ini                 # Alembic configuration
│   ├── migrations/                 # Database migration scripts
│   ├── static/
│   │   └── swagger.json            # OpenAPI 3.0 specification
│   └── src/
│       ├── rag_pipeline.py         # RAG pipeline (chunking, retrieval, generation)
│       ├── query_classifier.py     # LLM-based query classification
│       ├── conversation_manager.py # Conversation CRUD operations
│       ├── llm_singleton.py        # Thread-safe LLM/Embeddings singleton
│       ├── pdf_processor.py        # PDF text extraction
│       └── logging_config.py       # Structured logging configuration
├── frontend/
│   ├── index.html                  # Single-page application
│   └── static/
│       ├── css/style.css
│       └── js/
│           ├── app.js              # Main application logic
│           └── auth.js             # Authentication UI logic
├── tests/
│   ├── conftest.py                 # Test fixtures and configuration
│   ├── test_auth.py                # Authentication endpoint tests
│   ├── test_conversations.py       # Conversation endpoint tests
│   └── test_rag_components.py      # RAG component unit tests
├── nginx/
│   └── nginx.conf                  # Nginx reverse proxy configuration
├── docs/
│   └── EVALUATION.md               # RAG evaluation system documentation
├── docker-compose.yml              # Multi-service Docker orchestration
├── Dockerfile                      # Multi-stage Docker build (Python 3.12-slim)
├── docker-entrypoint.sh            # Container startup script (migrations + Gunicorn)
├── .dockerignore                   # Files excluded from Docker build context
├── requirements.txt                # Production dependencies
├── requirements-dev.txt            # Development/testing dependencies
├── .env.example                    # Environment variable template
└── .env                            # Docker Compose variable interpolation (not committed)
```
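`llm_singleton.py` holds a thread-safe LLM/Embeddings singleton so that concurrent Gunicorn threads share one client. A common way to build such a singleton is double-checked locking; this is a generic sketch, not the project's actual code:

```python
import threading


class LLMSingleton:
    """Lazily create one shared instance; double-checked locking keeps it thread-safe."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        if cls._instance is None:          # fast path: no lock once initialized
            with cls._lock:                # slow path: serialize first creation
                if cls._instance is None:  # re-check inside the lock
                    cls._instance = super().__new__(cls)
                    # The real module would construct the OpenAI client and
                    # embeddings objects exactly once here.
        return cls._instance


a, b = LLMSingleton(), LLMSingleton()  # every caller receives the same object
```

The second `None` check inside the lock is what prevents two threads that both passed the first check from each creating an instance.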
## Configuration

All configuration is managed through environment variables defined in `backend/.env`. See `.env.example` for the full list of available options.
| Variable | Required | Default | Description |
|---|---|---|---|
| `OPENAI_API_KEY` | Yes | — | OpenAI API key |
| `SECRET_KEY` | Production | `dev-secret-*` | Flask secret key for session signing |
| `JWT_SECRET_KEY` | Production | `jwt-dev-*` | Secret key for JWT token signing |
| `DATABASE_URL` | No | `sqlite:///database.db` | Database connection string |
| `FLASK_ENV` | No | `development` | Environment mode (`development` / `production`) |
| `CORS_ORIGINS` | No | `localhost:5000` | Comma-separated list of allowed CORS origins |
| `LOG_LEVEL` | No | `INFO` | Logging verbosity |
| `PORT` | No | `5000` | Application port |
| `POSTGRES_USER` | Docker | `raguser` | PostgreSQL username (Docker Compose) |
| `POSTGRES_PASSWORD` | Docker | — | PostgreSQL password (Docker Compose) |
| `POSTGRES_DB` | Docker | `ragdb` | PostgreSQL database name (Docker Compose) |
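A configuration object mirroring the defaults in the table can be built with plain `os.environ` lookups. This is a hypothetical slice for illustration; the actual structure of `config.py` may differ:

```python
import os


class Config:
    """Illustrative configuration object mirroring the variable table above."""

    OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")  # required; no default
    DATABASE_URL = os.environ.get("DATABASE_URL", "sqlite:///database.db")
    FLASK_ENV = os.environ.get("FLASK_ENV", "development")
    LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")
    PORT = int(os.environ.get("PORT", "5000"))

    @property
    def is_production(self) -> bool:
        return self.FLASK_ENV == "production"


cfg = Config()
```

Keeping every lookup in one object means the rest of the code never touches `os.environ` directly, which simplifies both testing and the dev/production split.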
## Testing

The project includes 38 tests covering authentication, conversations, and RAG components.
```bash
# Run all tests
cd backend
python -m pytest tests/ -v

# Run with coverage
python -m pytest tests/ --cov=. --cov-report=term-missing

# Run specific test module
python -m pytest tests/test_auth.py -v
```

## Database Migrations

The database schema is managed with Alembic.
```bash
cd backend

# Apply all migrations
alembic upgrade head

# Create a new migration after model changes
alembic revision --autogenerate -m "description of changes"

# View current migration state
alembic current

# Rollback one migration
alembic downgrade -1
```

## License

This project was developed as an engineering thesis. All rights reserved.
