FastAPI server for document processing and RAG-based question answering.
- Python environment:
  `python -m venv venv && source venv/bin/activate` (on Windows: `venv\Scripts\activate`)
- Install dependencies:
  `pip install -r requirements.txt`
- Configure environment: create a `.env` file:
  MISTRAL_API_KEY=your_mistral_api_key_here
  GROQ_API_KEY=your_groq_api_key_here
- Run the server:
  `python main.py`
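Since the server reads its keys from `.env`, here is a minimal, stdlib-only sketch of how such a file can be loaded (the real code may use a package like `python-dotenv` instead; the `load_env` helper name is illustrative):

```python
import os

def load_env(path: str = ".env") -> None:
    """Parse simple KEY=value lines from a .env file into os.environ."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            # setdefault: values already exported in the shell win
            os.environ.setdefault(key.strip(), value.strip())
```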
- Mistral API: you need a valid `MISTRAL_API_KEY` for embeddings.
- Groq API: you need a valid `GROQ_API_KEY` for generation.
- Models:
  - Embeddings: `mistral-embed` (via the Mistral API)
  - Generation: `llama-3.3-70b-versatile` (via the Groq API)
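The embeddings call reduces to a small JSON request. A hedged sketch of assembling it (the endpoint URL and payload shape follow Mistral's public embeddings API; the `build_embed_request` helper is our own name, not part of this repo):

```python
import os

# Public Mistral embeddings endpoint (assumption: the server talks to it directly)
MISTRAL_EMBED_URL = "https://api.mistral.ai/v1/embeddings"

def build_embed_request(texts: list[str]) -> tuple[dict, dict]:
    """Return (headers, payload) for a mistral-embed request."""
    headers = {
        "Authorization": f"Bearer {os.environ.get('MISTRAL_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    payload = {"model": "mistral-embed", "input": texts}
    return headers, payload
```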
- `POST /session`: create a new session; returns `session_id`. Call this on app load.
- `POST /upload`: upload and index a PDF (query param: `session_id`). No disk write; state is kept in memory per session.
- `POST /query`: ask questions about the document for the given session (query params: `question`, `session_id`).
- `GET /status`: optional `session_id` for per-session index state.
- `GET /health`: health check.
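The endpoint list above can be exercised from a small client. A stdlib-only sketch of the `/query` call (the base URL and the assumption that the response is JSON are ours; both parameters travel as query params, as documented):

```python
import json
import urllib.parse
import urllib.request

BASE = "http://localhost:8000"  # assumed local dev address

def query_url(question: str, session_id: str, base: str = BASE) -> str:
    """Build the /query URL; question and session_id go in the query string."""
    qs = urllib.parse.urlencode({"question": question, "session_id": session_id})
    return f"{base}/query?{qs}"

def ask(question: str, session_id: str) -> dict:
    """POST the question for this session and decode the JSON reply."""
    req = urllib.request.Request(query_url(question, session_id), method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```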
backend/
├── main.py # fastapi entry point & routes
├── session_store.py # in-memory session state (chunks, FAISS index)
├── extraction.py # pdf text extraction (stream-based)
├── chunking.py # text segmentation strategies
├── embeddings.py # embeddings & in-memory FAISS index build
├── generation.py # rag logic & llm interaction
├── requirements.txt # python dependencies
└── .env # environment variables (local only)
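The per-session state kept by `session_store.py` can be pictured as a dict keyed by session id. A minimal stdlib-only sketch (field and function names are illustrative; the real module holds an actual FAISS index rather than a placeholder):

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class SessionState:
    chunks: list = field(default_factory=list)  # text chunks of the uploaded PDF
    index: object = None                        # FAISS index placeholder

_sessions: dict = {}  # session_id -> SessionState, in memory only

def create_session() -> str:
    """Mint a new session id and register empty state (backs POST /session)."""
    session_id = uuid.uuid4().hex
    _sessions[session_id] = SessionState()
    return session_id

def get_session(session_id: str):
    """Look up state for a session; None if the id is unknown."""
    return _sessions.get(session_id)
```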