This folder is a proof-of-concept scaffold for a fully automated microlearning content pipeline:

Reddit ingest → topic ranking → human approval → content packet → (video) → publish

The goal is to standardize the handoff contracts between components so Petrarch (scraper) and Quimbot (ranker/generator) can interoperate cleanly.

Folder layout:

- `schemas/`: canonical JSON Schemas for pipeline artifacts
- `examples/`: example JSON/JSONL records
- `notes/`: implementation notes / scoring heuristics
- `docs/`: comprehensive pipeline design documentation
  - `AUTOMATION_BEST_PRACTICES.md`: complete architecture, quality gates, success metrics
  - `REDDIT_TO_VEO_PIPELINE.md`: trend discovery → visual storytelling methodology
  - `CANDIDATE_TOPICS.md`: 10 validated topics with Veo prompts and scripts

Pipeline artifacts:

- Topic record (`topic_record`)
  - One record per candidate topic (typically 1 Reddit post).
  - Produced by the ingest + enrich steps.
- Shortlist record (`shortlist_record`)
  - A ranked list of topics for a given day/run.
- Approval record (`approval_record`)
  - Captures the human gate decision.
- Content packet (`content_packet`)
  - The script + metadata used by video generation and publishing.

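
To make the handoff between the four artifacts concrete, here is a minimal Python sketch. The authoritative definitions are the JSON Schemas in `schemas/`; every field name below (`topic_id`, `score`, `run_date`, `approved`, `reviewer`, and so on) and the `build_shortlist` helper are hypothetical placeholders chosen for illustration, not the actual contract.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class TopicRecord:
    # Hypothetical fields: one record per candidate topic (typically 1 Reddit post).
    topic_id: str
    title: str
    source_url: str
    score: float = 0.0


@dataclass
class ShortlistRecord:
    # Hypothetical fields: ranked topic ids for a given day/run, best first.
    run_date: str
    topic_ids: List[str] = field(default_factory=list)


@dataclass
class ApprovalRecord:
    # Hypothetical fields: the human gate decision for one topic.
    topic_id: str
    approved: bool
    reviewer: str = ""


@dataclass
class ContentPacket:
    # Hypothetical fields: script + metadata for video generation and publishing.
    topic_id: str
    script: str
    metadata: dict = field(default_factory=dict)


def build_shortlist(run_date: str, topics: List[TopicRecord], limit: int = 3) -> ShortlistRecord:
    """Rank topics by score (highest first) and keep the top `limit` ids."""
    ranked = sorted(topics, key=lambda t: t.score, reverse=True)
    return ShortlistRecord(run_date=run_date, topic_ids=[t.topic_id for t in ranked[:limit]])
```

Calling `asdict(...)` on any of these records yields a plain dictionary that can be serialized to JSON and, in a real setup, validated against the matching schema before it is handed to the next stage.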
File format conventions:

- Prefer JSONL for large collections (e.g., ingest output).
- Prefer JSON for single documents (shortlist, content packet).
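
The difference between the two formats can be sketched with the Python standard library alone. The helper names (`to_jsonl`, `from_jsonl`, `write_jsonl`, `write_json`) are hypothetical and not part of this scaffold; the sketch assumes UTF-8 files and one JSON object per line for JSONL.

```python
import json
from pathlib import Path
from typing import Iterable, List


def to_jsonl(records: Iterable[dict]) -> str:
    """Serialize records as JSONL: one compact JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def from_jsonl(text: str) -> List[dict]:
    """Parse JSONL text, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def write_jsonl(path: str, records: Iterable[dict]) -> None:
    """Write a large collection (e.g., ingest output) as a JSONL file."""
    Path(path).write_text(to_jsonl(records), encoding="utf-8")


def write_json(path: str, document: dict) -> None:
    """Write a single document (e.g., a shortlist or content packet) as JSON."""
    Path(path).write_text(
        json.dumps(document, indent=2, ensure_ascii=False), encoding="utf-8"
    )
```

JSONL suits the ingest output because records can be appended and read back one line at a time, while a single indented JSON file is easier for a human reviewer to inspect.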