Retrieval-Augmented Generation · Vector Search · Autonomous AI Agents
A production-ready, enterprise-grade app for uploading documents and asking natural-language questions, powered by RAG, ChromaDB, and an agent workflow (Planner → Retriever → Reasoner → Validator).
| Area | Description |
|---|---|
| Document ingestion | Upload PDF, TXT, CSV, Excel (.xlsx); automatic parsing, chunking, embedding, and indexing |
| Semantic search | Vector similarity search over chunk embeddings (ChromaDB) |
| RAG pipeline | Retrieve top-k chunks → build prompt → LLM answer with citations |
| LangChain agents | Multi-agent workflow: Planner (planning & safety) → Retriever → Reasoner (LangChain core + LLM) → Validator; execution summary and citations |
| Planning | Planner decides if retrieval is needed and produces a normalized planned_query; plan drives retrieval and reasoning |
| Safety | File validation, size limits, prompt-injection mitigation, “insufficient evidence” fallback |
| Stack | FastAPI backend · React + Vite + Tailwind · LangChain core (agents, messages, BaseChatModel) |
Users interact with the React frontend; the FastAPI backend talks to OpenAI (LLM + embeddings), ChromaDB (vector store), and SQLite (metadata).
```mermaid
flowchart TB
    subgraph Users["👤 Users"]
        U[User / Browser]
    end
    subgraph Capstone["Enterprise GenAI Document Q&A (Capstone)"]
        subgraph Frontend["Frontend"]
            SPA[React SPA · Vite · TypeScript · Tailwind]
        end
        subgraph Backend["Backend"]
            API[FastAPI API /api/v1]
        end
    end
    subgraph External["External Systems"]
        OpenAI[OpenAI · LLM + Embeddings]
        Chroma[ChromaDB · Vector Store]
        SQLite[(SQLite · Metadata)]
    end
    U <-->|HTTPS / REST| SPA
    SPA <-->|/api/v1/*| API
    API <-->|Embeddings, Chat| OpenAI
    API <-->|Vector CRUD, Search| Chroma
    API <-->|Document metadata| SQLite
```
Questions go through the simple RAG pipeline or the full agent workflow (Planner → Retriever → Reasoner → Validator) based on the “Use agent workflow” toggle.
```mermaid
flowchart LR
    subgraph Frontend
        Q[User question]
        Toggle{Use agent workflow?}
    end
    subgraph RAGPath["RAG path"]
        RAG[POST /query]
        RAGService[RAGService]
        Search1[vector search]
        LLM1[LLM]
        Cite1[Citations]
    end
    subgraph AgentPath["Agent path"]
        Agent[POST /agents/query]
        Orch[Orchestrator]
        P[Planner]
        R[Retriever]
        Reas[Reasoner]
        V[Validator]
    end
    Q --> Toggle
    Toggle -->|No| RAG
    Toggle -->|Yes| Agent
    RAG --> RAGService --> Search1 --> LLM1 --> Cite1
    Agent --> Orch --> P --> R --> Reas --> V
```
The agent workflow: plan → retrieve chunks → reason with LLM → validate citations and support status.
```mermaid
flowchart TB
    Question[User question] --> P[PlannerAgent]
    P --> P1[Safety check · needs_retrieval? · planned_query]
    P1 --> R[RetrieverAgent]
    R --> R1[vector_store.search → Top-k chunks]
    R1 --> Reas[ReasonerAgent]
    Reas --> Re1[Build context → LangChain core + LLM → Answer]
    Re1 --> V[ValidatorAgent]
    V --> V1[Check citations · Support status · ValidationResult]
    V1 --> Response[AgentQueryResponse + execution_summary]
```
The agent path uses a structured multi-agent pipeline built with LangChain patterns:
- Agent structure — Four agents run in sequence:
  - Planner — Interprets the question, runs a safety check, and produces a plan (`needs_retrieval`, `planned_query`). Planning determines whether to run retrieval and what query to send to the vector store.
  - Retriever — Runs vector search with the planned query and returns top-k chunks with scores.
  - Reasoner — Uses LangChain core (`langchain_core.messages` + shared `BaseChatModel`) to build a system/user prompt and synthesize an answer from the retrieved context.
  - Validator — Checks that the answer is grounded in the chunks and has citations; sets `support_status` and an execution summary.
- Planning — The `Plan` from the Planner includes `planned_query` (the normalized question) and `needs_retrieval`. The orchestrator only calls the Retriever and Reasoner when `needs_retrieval` is true; otherwise it returns early with a clear message. The frontend receives an execution summary (planned query, chunks retrieved, validation result) for transparency.
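This gating behavior can be sketched in Python. The `Plan` dataclass, the agent objects, and their method names below are illustrative assumptions for the sketch, not the project's actual classes:

```python
from dataclasses import dataclass

@dataclass
class Plan:
    planned_query: str      # normalized question sent to the vector store
    needs_retrieval: bool   # whether the Retriever/Reasoner should run
    is_safe: bool = True    # result of the Planner's safety check

def run_agent_workflow(question, planner, retriever, reasoner, validator) -> dict:
    """Sequential Planner -> Retriever -> Reasoner -> Validator pipeline."""
    plan = planner.plan(question)

    # Early return: retrieval not needed, or the question failed the safety check.
    if not plan.is_safe or not plan.needs_retrieval:
        return {
            "answer": "No retrieval was performed for this question.",
            "execution_summary": {
                "planned_query": plan.planned_query,
                "chunks_retrieved": 0,
            },
        }

    chunks = retriever.search(plan.planned_query)    # top-k chunks with scores
    answer = reasoner.answer(question, chunks)       # LLM over the built context
    validation = validator.validate(answer, chunks)  # citations / support status

    return {
        "answer": answer,
        "execution_summary": {
            "planned_query": plan.planned_query,
            "chunks_retrieved": len(chunks),
            "support_status": validation["support_status"],
        },
    }
```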
More diagrams (layered view, ingestion sequence, backend dependencies, data stores): docs/architecture/architecture-diagram.md
This section is based on the How to Use guide for the Enterprise GenAI Document Q&A system. Upload the suggested document types and try the example questions below.
- What are the strategic pillars of Orion’s AI strategy?
- How much investment is planned for AI initiatives?
- What outcomes are expected from the AI transformation?
- What are the principles of AI governance?
- What risk category would a customer-facing AI system fall under?
- What controls are required for high-risk AI systems?
Upload your cost report (e.g. CSV), then ask:
- Which service has the highest cost?
- How much did AI training cost in March?
- What is the total cost across all months?
- Which issues remain unresolved?
- What is the average resolution time?
- Which category appears most frequently?
- Which industries use Fraud Detection AI?
- What is the revenue from Banking customers?
- Which product was launched most recently?
- What components exist in the system architecture?
- What is the role of the feature store?
- What security controls are required?
Use these to see the agent workflow combine multiple documents:
| # | Question | Documents the agent should combine |
|---|---|---|
| Q1 | How does Orion ensure responsible use of AI according to their strategy and governance documents? | Strategy report + Governance policy |
| Q2 | Which AI product generates the highest revenue and which industry uses it? | Excel product sheet + Customer contracts |
| Q3 | What is the most expensive cloud service and does it align with Orion's AI investment strategy? | cloud_costs.csv + Strategy report |
| Q4 | Which unresolved support issues could impact AI products? | Support tickets + Product info |
Enable “Use agent workflow” on the Ask page so the Planner, Retriever, Reasoner, and Validator work together to answer these questions with citations and an execution summary.
Images below are from the How to use guide (screenshots and step-by-step visuals).
```
CapstoneProject/
├── backend/
│   ├── app/
│   │   ├── api/routes/      # health, documents, query
│   │   ├── core/            # config, logging
│   │   ├── db/              # SQLite metadata store
│   │   ├── models/          # DocumentChunk, ChunkMetadata
│   │   ├── rag/             # RAG pipeline
│   │   ├── agents/          # Planner, Retriever, Reasoner, Validator, Orchestrator
│   │   ├── schemas/         # Pydantic request/response
│   │   ├── services/        # parsing, chunking, embedding, vector_store, ingestion
│   │   ├── utils/           # security, file validation
│   │   └── main.py
│   ├── tests/
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── components/      # Layout, UploadZone, AnswerPanel
│   │   ├── pages/           # UploadPage, DocumentsPage, AskPage
│   │   ├── lib/api.ts
│   │   └── types/
│   ├── package.json
│   ├── vite.config.ts
│   └── tailwind.config.js
├── docs/
│   ├── architecture/        # Complex architecture diagrams (Mermaid)
│   │   └── architecture-diagram.md
│   ├── readme-images/       # Screenshots from How to Use guide
│   ├── architecture.md
│   ├── api.md
│   └── deployment.md
├── sample_data/
├── docker-compose.yml
└── README.md
```
- Config: `app/core/config.py` — env-based settings (paths, limits, LLM/embedding keys).
- Parsing: PDF (pdfplumber), TXT, CSV (pandas), Excel (openpyxl) with page/sheet/row metadata.
- Chunking: Recursive split with configurable size/overlap.
- Embeddings: OpenAI embeddings or local sentence-transformers.
- Vector store: ChromaDB with persistent storage.
- RAG: Retrieve top-k → sanitize context → LLM → citations.
- Agents: Planner → Retriever → Reasoner → Validator; Orchestrator returns execution summary.
- API: `POST /documents/upload`, `GET /documents`, `POST /query`, `POST /agents/query`, `GET /health` under `/api/v1`.
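The "retrieve top-k → sanitize context → LLM → citations" flow from the RAG bullet can be sketched as below. The function name, prompt wording, and citation format are illustrative assumptions; the vector search and LLM call are injected as plain callables standing in for ChromaDB and the chat model:

```python
from typing import Callable

def rag_answer(question: str,
               search: Callable[[str, int], list],  # e.g. wraps ChromaDB query
               llm: Callable[[str], str],           # e.g. wraps the chat model
               top_k: int = 5) -> dict:
    """Minimal RAG: retrieve top-k chunks, build a cited prompt, ask the LLM."""
    chunks = search(question, top_k)

    # Number each chunk so the model can cite [1], [2], ... in its answer.
    context = "\n\n".join(f"[{i + 1}] {c['text']}" for i, c in enumerate(chunks))
    prompt = (
        "Answer using ONLY the context below and cite sources as [n]. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    answer = llm(prompt)

    citations = [{"n": i + 1, "source": c.get("source", "unknown")}
                 for i, c in enumerate(chunks)]
    return {"answer": answer, "citations": citations}
```

The "insufficient evidence" instruction in the prompt mirrors the fallback behavior described under Safety above.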
- Stack: React 18, TypeScript, Vite, Tailwind (DM Sans, JetBrains Mono). HTTPS in development via `@vitejs/plugin-basic-ssl@1`.
- Pages: Upload (drag-and-drop), Documents (list + status), Ask (question + RAG vs agent toggle).
- Answer panel: Answer text, sources/citations, execution summary (for agent), retrieved chunks accordion.
- Python 3.9+
- Node 18+
- (Optional) Docker
Add the app host to /etc/hosts (required for the frontend) before starting the backend:
```bash
sudo vi /etc/hosts
```

Add this line at the end of the file:

```
127.0.0.1 capstone.genai-rag.edureka.co
```

Save the file (in vi: press Esc, type `:wq`, press Enter).
```bash
cd backend
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
pip install -r requirements.txt
cp ../.env.example .env   # set OPENAI_API_KEY as per the instructions
mkdir -p data/uploads data/chroma
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

If the pip upgrade fails, invoke the venv's interpreter directly, e.g. `/Users/xxxx/Workspace/Innovation/CapstoneProject/.venv/bin/python3 -m pip install --upgrade pip`.

- API: http://capstone.genai-rag.edureka.co:8080 or http://localhost:8000
- Swagger: http://capstone.genai-rag.edureka.co:8080/docs or http://localhost:8000/docs
```bash
cd frontend
npm install
```

Start the application (port 443 requires elevated privileges):

```bash
sudo npm run dev
```

- App: https://capstone.genai-rag.edureka.co (proxies `/api` to the backend)

"Your connection is not private" (NET::ERR_CERT_AUTHORITY_INVALID): the dev server uses a self-signed certificate. For local development, click "Advanced" → "Proceed to capstone.genai-rag.edureka.co (unsafe)". Only do this on your own machine.

If you get `vite: command not found` when using sudo, preserve your PATH:

```bash
sudo env "PATH=$PATH" npm run dev
```

If you get an npm permission error, fix ownership of your npm cache:

```bash
sudo chown -R $(id -u):$(id -g) "$HOME/.npm"
```

Or, if the error shows a path like /Users/xxx/.npm, run (replace xxx with your username):

```bash
sudo chown -R 501:20 "/Users/xxx/.npm"
```

Next: upload sample data and test
- Open https://capstone.genai-rag.edureka.co and go to the Upload page.
- Upload the sample files from the project's `data_to_upload/` folder (e.g. `enterprise_ai_strategy_2026.txt`, `ai_governance_policy.txt`, `cloud_costs.csv`, `support_tickets.csv`, `system_architecture.txt`, and any Excel files). Wait for ingestion to complete (status `completed` on the Documents page).
- Go to the Ask page and try the example questions from "Supported document types and example questions" and "Example complex questions for RAG + agents" above (e.g. "What are the strategic pillars of Orion's AI strategy?", "How does Orion ensure responsible use of AI according to their strategy and governance documents?"). Toggle "Use agent workflow" on for the multi-document agent questions.
- Upload: Go to Upload, drop a PDF/TXT/CSV/XLSX file (see `data_to_upload/` for sample files).
- Ask: Go to Ask, type a question, optionally enable "Use agent workflow", and click Ask.
- Documents: View the list of uploaded files and their status.
| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI (or compatible) API key | — |
| `OPENAI_API_BASE` | Custom API base URL | — |
| `LLM_MODEL` | Chat model name | `gpt-4o-mini` |
| `EMBEDDING_MODEL` | Embedding model | `text-embedding-3-small` |
| `USE_LOCAL_EMBEDDINGS` | Use sentence-transformers instead of API | `false` |
| `UPLOAD_DIR` | Directory for uploads | `data/uploads` |
| `CHROMA_PERSIST_DIR` | ChromaDB persistence path | `data/chroma` |
| `MAX_FILE_SIZE_MB` | Max upload size (MB) | `25` |
| `CHUNK_SIZE` / `CHUNK_OVERLAP` | Chunking parameters | `1000` / `200` |
| `TOP_K_RETRIEVE` | Number of chunks to retrieve | `5` |
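As a rough sketch of how env-based settings like these might be loaded (the project does this in `app/core/config.py`; this standalone version uses only the standard library, with the table's defaults):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    llm_model: str
    embedding_model: str
    use_local_embeddings: bool
    max_file_size_mb: int
    chunk_size: int
    chunk_overlap: int
    top_k_retrieve: int

def load_settings() -> Settings:
    """Read configuration from environment variables, falling back to defaults."""
    return Settings(
        openai_api_key=os.getenv("OPENAI_API_KEY", ""),
        llm_model=os.getenv("LLM_MODEL", "gpt-4o-mini"),
        embedding_model=os.getenv("EMBEDDING_MODEL", "text-embedding-3-small"),
        use_local_embeddings=os.getenv("USE_LOCAL_EMBEDDINGS", "false").lower() == "true",
        max_file_size_mb=int(os.getenv("MAX_FILE_SIZE_MB", "25")),
        chunk_size=int(os.getenv("CHUNK_SIZE", "1000")),
        chunk_overlap=int(os.getenv("CHUNK_OVERLAP", "200")),
        top_k_retrieve=int(os.getenv("TOP_K_RETRIEVE", "5")),
    )
```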
See .env.example for the full list.
```bash
cp .env.example .env
docker compose up --build
```

- Backend: http://localhost:8000
- Frontend: https://capstone.genai-rag.edureka.co (or http://localhost:3000 when not using local HTTPS)
- Data: `rag_data` volume
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/health` | Health check |
| POST | `/api/v1/documents/upload` | Upload document (multipart) |
| GET | `/api/v1/documents` | List documents |
| GET | `/api/v1/documents/{id}` | Document metadata |
| POST | `/api/v1/query` | RAG query |
| POST | `/api/v1/agents/query` | Agent workflow query |
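A minimal Python client against the query endpoints above, mirroring the UI's "Use agent workflow" toggle. The request body field `question` and the response shape are assumptions here; docs/api.md has the authoritative schema:

```python
import json
import urllib.request

BASE = "http://localhost:8000/api/v1"  # assumed local backend address

def query_endpoint(use_agents: bool) -> str:
    """Pick the RAG or agent-workflow endpoint, like the frontend toggle does."""
    return f"{BASE}/agents/query" if use_agents else f"{BASE}/query"

def ask(question: str, use_agents: bool = False) -> dict:
    """POST a question to the backend and return the parsed JSON response."""
    payload = json.dumps({"question": question}).encode()
    req = urllib.request.Request(
        query_endpoint(use_agents),
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```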
Details and examples: docs/api.md.
```bash
cd backend
pip install -r requirements.txt
pytest -v
```

- Architecture — Components, flows, and narrative
- Architecture diagrams — Complex Mermaid diagrams (system context, layers, ingestion, RAG vs agent, data stores)
- API — Endpoints and request/response examples
- Deployment — Local and Docker deployment
- No authentication; single-user assumption
- ChromaDB only (other vector backends via abstraction)
- No streaming responses; no conversation history
- Document deletion / re-indexing not implemented in UI
- Auth stub, document delete and re-index
- Hybrid retrieval (keyword + semantic), re-ranking
- Feedback (thumbs up/down), streaming, admin logs
- Deploy to Render / Railway / Vercel / AWS
MIT (or your choice).







