Mandukya AI — Smart Video Lesson Companion
A locally-run knowledge extraction pipeline for educational videos.
| Layer | Technology | Purpose |
|---|---|---|
| Pipeline (Backend) | Python 3.10+ | Core processing: download, transcribe, index, search |
| Audio Transcription | Whisper.cpp | Local audio-to-text transcription |
| Video Processing | FFmpeg / ffmpeg-python | Frame extraction, slide detection |
| Knowledge Graph | Cognee + Kùzu + LanceDB | RDBMS, Vector, and Graph databases |
| LLM / Reasoning | Ollama | Local LLM for insights, diagrams, slide generation |
| Frontend | React 18 + React Three Fiber + Three.js | Split-Helix 3D UI |
| UI Components | Lucide React | Icon library |
| Presentation | Reveal.js | Generated slideshow output |
| Package Management | pip (Python), npm (Frontend) | Dependency management |
| Task Runner | Make | Unified CLI (`make setup`, `make download`, etc.) |
- No cloud dependencies for core functionality. All processing runs locally.
- Ollama, Whisper.cpp, and Cognee all operate on the user's machine.
- Never introduce cloud API calls without explicit ADR approval.
The system follows a four-stage pipeline:
Perception → Memory → Reasoning → Presentation
Each stage is independently testable and communicates via file-based artifacts (transcripts, frames, slides) and database records (Cognee knowledge graph).
| Stage | Input | Output |
|---|---|---|
| Perception | YouTube URL / Video file | Transcripts (.txt), Frames (.png) |
| Memory | Transcripts | Knowledge Graph (RDBMS + Vector + Graph DB) |
| Reasoning | Knowledge Graph + Context | Insights, Mermaid diagrams, Slide content |
| Presentation | Slide content | Reveal.js HTML slides |
- Pipeline stages communicate through files on disk, not in-memory state.
- `transcripts/` — Audio transcription outputs
- `downloads/` — Downloaded video/audio files
- `slides/` — Generated Reveal.js presentations
- `audio/` — Extracted audio tracks
- `diagrams/` — Generated Mermaid/visual diagrams
- Backend: Python pipeline scripts (`process_pipeline.py`, `indexer.py`, `search.py`, etc.)
- Frontend: React SPA in the `frontend/` directory
- Communication: File system + database queries (no REST API layer currently)
- The frontend reads from the same data stores the pipeline writes to.
```
video_analysis/
├── pipeline/                    # Pipeline modules
├── transcripts/                 # Generated transcripts
├── downloads/                   # Downloaded media
├── slides/                      # Generated Reveal.js slides
├── audio/                       # Extracted audio files
├── diagrams/                    # Generated diagrams
├── frontend/                    # React SPA (Split-Helix UI)
│   ├── src/
│   ├── public/
│   └── package.json
├── docs/                        # Documentation
│   ├── architecture/            # Architecture diagrams
│   └── adr/                     # Architecture Decision Records
├── tests/                       # Test suite
├── process_pipeline.py          # Main pipeline orchestrator
├── indexer.py                   # Knowledge graph indexing
├── search.py                    # Semantic search
├── downloader.py                # YouTube download
├── slide.py                     # Slide generation
├── generate_slide.py            # Slide generation helper
├── cognee_setup.py              # Cognee configuration
├── cognee_indexer.py            # Cognee indexing logic
├── setup_wizard.py              # Interactive setup
├── requirements.txt             # Python dependencies
├── Makefile                     # Task runner
└── ARCHITECTURAL_GUARDRAILS.md  # This file
```
- No blocking UI calls — Pipeline scripts must be non-interactive (except `setup_wizard.py`).
- Idempotent operations — Re-running a stage should not duplicate data.
- Graceful degradation — If Ollama is unavailable, the pipeline should fail with a clear error, not silently skip the stage.
- Logging — Use `pipeline.log` for structured logging. No `print()` statements in production code.
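Taken together, the fail-fast and logging guardrails might look like the following sketch. The `check_ollama` helper is illustrative (not part of the actual codebase); it assumes Ollama's conventional local HTTP endpoint at `http://localhost:11434`.

```python
import logging
import urllib.error
import urllib.request

# Structured logging to pipeline.log, per the guardrails -- no print().
logging.basicConfig(
    filename="pipeline.log",
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
    level=logging.INFO,
)
log = logging.getLogger("pipeline")

def check_ollama(base_url: str = "http://localhost:11434") -> None:
    """Fail fast with a clear error if Ollama is unreachable; never skip silently."""
    try:
        urllib.request.urlopen(f"{base_url}/api/tags", timeout=2)
    except (urllib.error.URLError, OSError) as exc:
        log.error("Ollama unreachable at %s: %s", base_url, exc)
        raise RuntimeError(
            f"Ollama is not running at {base_url}; start it before "
            "running the Reasoning stage."
        ) from exc
```

Calling this at the top of the Reasoning stage turns a silent skip into an immediate, actionable failure.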
- No backend server required — Frontend reads directly from file system / database.
- React 18 only — Do not upgrade React major version without ADR.
- Three.js for 3D — All 3D visualization uses React Three Fiber + Three.js.
- No external API calls — Frontend must not call external services.
- Cognee is the source of truth — Knowledge graph data lives in Cognee-managed databases.
- Files are artifacts — Transcripts, slides, and diagrams are outputs, not inputs (except for pipeline consumption).
- No hardcoded paths — Use `.env` configuration for all file paths.
- Local-first dependencies — Prefer local tools (Whisper.cpp, Ollama) over cloud APIs.
- Pin major versions — `requirements.txt` and `package.json` should pin major versions.
- New dependency requires ADR — Any new external service or major library addition needs an Architecture Decision Record.
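Major-version pinning might look like the following illustrative `requirements.txt` fragment. The version bounds shown are placeholders, not the project's actual pins:

```
# Pin to a major version: accept patches, reject breaking upgrades.
cognee>=0.1,<0.2
ffmpeg-python>=0.2,<0.3
kuzu>=0.4,<0.5
lancedb>=0.6,<0.7
```

The `>=x,<x+1` pattern admits bug fixes within a major line while blocking breaking upgrades until they are reviewed.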
```bash
# Setup (one-time)
make setup

# Download a lesson
make download URL="https://www.youtube.com/playlist?list=..."

# Run the pipeline
make run

# Build knowledge graph
make index

# Search
make search QUERY="your query"

# Frontend
cd frontend && npm start
```

All architectural decisions are tracked in `docs/adr/`.
See `docs/adr/TEMPLATE.md` for the format.
System architecture diagrams are in `docs/architecture/diagrams/` in Mermaid format.
Start with `system-context.mmd` for the C4 System Context view.