SafeQueryAI Documentation

Privacy-first document Q&A with local RAG

Project repository: github.com/JYOshiro/SafeQueryAI

Architecture

SafeQueryAI answers questions about uploaded documents using a local Retrieval-Augmented Generation (RAG) pipeline; all processing is session-scoped and runs on the local machine.

Architecture Goals

System Overview

Browser (React + TypeScript)
    |
    | HTTP
    v
ASP.NET Core API
    |
    +--> SessionService (session lifecycle)
    +--> FileStorageService (temporary storage)
    +--> TextExtractionService (PDF/CSV extraction)
    +--> DocumentIndexingService (chunk + embed)
    +--> VectorStoreService (in-memory retrieval)
    +--> QuestionAnsweringService (RAG orchestration)
    |
    v
Ollama local LLM runtime (loopback URL only)
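From the browser side, the HTTP hop in the diagram above can be sketched as a single question request. The endpoint path (`/api/questions`) and payload shape here are assumptions for illustration, not the actual API contract:

```typescript
// Sketch of the browser -> ASP.NET Core API hop. The endpoint path
// and payload shape are hypothetical, not the real contract.
interface AskRequest {
  sessionId: string;
  question: string;
}

// Build the JSON body sent with a question (pure, easily testable).
function buildAskBody(sessionId: string, question: string): string {
  const req: AskRequest = { sessionId, question: question.trim() };
  return JSON.stringify(req);
}

// POST the question and return the answer text from the response.
async function askQuestion(
  baseUrl: string,
  sessionId: string,
  question: string
): Promise<string> {
  const res = await fetch(`${baseUrl}/api/questions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildAskBody(sessionId, question),
  });
  if (!res.ok) throw new Error(`API error: ${res.status}`);
  const data = await res.json();
  return data.answer as string;
}
```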

RAG Pipeline

  1. User uploads a PDF or CSV file to the active session.
  2. Backend stores the file in a session folder under temporary storage.
  3. Text extraction reads PDF/CSV content.
  4. Document text is chunked and embedded through Ollama (nomic-embed-text).
  5. Embeddings are stored in an in-memory vector store.
  6. On question submission, the question is embedded and top-matching chunks are retrieved.
  7. Backend generates an answer through Ollama (llama3.2) using retrieved context.
  8. If embedding/generation is unavailable, the system falls back to keyword matching.
  9. Session expiry or manual clear removes temporary files and index data.
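Steps 4 through 6 (chunk, embed, retrieve) can be sketched as follows. The chunk size, overlap, and top-k values are illustrative defaults, not the values SafeQueryAI actually uses, and the embedding itself is assumed to come from Ollama's nomic-embed-text model elsewhere:

```typescript
// A chunk of document text paired with its embedding vector
// (vectors assumed to come from Ollama's nomic-embed-text).
interface IndexedChunk {
  text: string;
  embedding: number[];
}

// Split text into fixed-size chunks with overlap so that sentences
// spanning a boundary appear in at least one chunk. Assumes size > overlap.
function chunkText(text: string, size = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the top-k chunks most similar to the question embedding;
// these become the context passed to the generation model.
function retrieve(
  question: number[],
  index: IndexedChunk[],
  k = 3
): IndexedChunk[] {
  return [...index]
    .sort(
      (x, y) =>
        cosineSimilarity(question, y.embedding) -
        cosineSimilarity(question, x.embedding)
    )
    .slice(0, k);
}
```

The keyword-matching fallback in step 8 would replace `retrieve` with a plain text search over chunk contents when no embeddings are available.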

Components

Backend Structure

Component    Responsibility
Controllers  HTTP endpoints for files, questions, sessions, health
Services     Business logic for indexing, storage, RAG, expiry
Contracts    Request/response DTOs
Models       Domain entities (SessionInfo, DocumentChunk, etc.)
Interfaces   Service abstractions for dependency injection
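The Interfaces layer keeps each service swappable behind an abstraction. The pattern can be illustrated as below, though in TypeScript rather than the backend's actual C#, and with hypothetical member names:

```typescript
// Illustration of the interface/implementation split used for DI.
// Names are hypothetical; the real backend is C# (ASP.NET Core).
interface IVectorStoreService {
  add(id: string, embedding: number[]): void;
  count(): number;
}

// The in-memory implementation the docs describe; a persistent store
// could be substituted without changing any consumer.
class InMemoryVectorStore implements IVectorStoreService {
  private store = new Map<string, number[]>();
  add(id: string, embedding: number[]): void {
    this.store.set(id, embedding);
  }
  count(): number {
    return this.store.size;
  }
}

// Consumers depend only on the abstraction.
function indexChunk(
  store: IVectorStoreService,
  id: string,
  embedding: number[]
): void {
  store.add(id, embedding);
}
```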

Key Services

Frontend Structure

Component         Purpose
App.tsx           Main application component
QuestionForm      User input for questions
FileUploadPanel   Document upload interface
AnswerPanel       Streaming answer display
SessionInfo       Session details and management
UploadedFileList  List of files in session
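AnswerPanel consumes the SSE stream endpoint. A sketch of folding raw `data:` lines into the displayed answer text is shown below; the `data:` framing is standard SSE, but the payload format is an assumption, not taken from the actual endpoint:

```typescript
// Fold a raw chunk of SSE lines into the accumulated answer text.
// Per the SSE format, a single space after "data:" is stripped;
// the token-per-event payload shape is assumed for illustration.
function appendSseLines(current: string, rawChunk: string): string {
  let answer = current;
  for (const line of rawChunk.split("\n")) {
    if (!line.startsWith("data:")) continue; // ignore comments/other fields
    let payload = line.slice("data:".length);
    if (payload.startsWith(" ")) payload = payload.slice(1);
    answer += payload;
  }
  return answer;
}
```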

Operational Characteristics

Characteristic        Current Implementation
Storage model         Temporary storage + in-memory session state
Session timeout       60 minutes of inactivity
Supported file types  PDF, CSV
Upload size policy    20 MB configured limit, 25 MB request ceiling
LLM runtime           Local Ollama only
API style             REST + SSE streaming endpoint
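The file-type and size limits above lend themselves to a client-side pre-check before upload. The limits are taken from the table; the check itself is a sketch, not the actual FileUploadPanel logic:

```typescript
// Client-side pre-check mirroring the documented limits:
// PDF/CSV only, 20 MB configured limit. This is a sketch, not
// the real FileUploadPanel validation.
const MAX_UPLOAD_BYTES = 20 * 1024 * 1024; // 20 MB configured limit
const ALLOWED_EXTENSIONS = [".pdf", ".csv"];

// Returns an error message, or null when the file is acceptable.
function validateUpload(fileName: string, sizeBytes: number): string | null {
  const ext = fileName.slice(fileName.lastIndexOf(".")).toLowerCase();
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    return "Only PDF and CSV files are supported.";
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    return "File exceeds the 20 MB upload limit.";
  }
  return null;
}
```

The server still enforces its own limits (including the 25 MB request ceiling); the client check only gives faster feedback.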

Failover & Resilience

Constraints

Assumptions