NotebookLM-style context-aware research assistant backend. Upload PDFs, chat with papers using RAG, generate podcast summaries, and synthesize insights across multiple papers.
- 📄 PDF Ingestion: Extract text, metadata, chunk, and embed papers
- 💬 RAG Chat: Context-aware Q&A with citations
- 🎙️ Podcast Generation: Audio summaries via ElevenLabs TTS
- 🎬 Video Scripts: Chapter-based video plans
- 🎥 Video Generation: Slideshow videos from scripts using ffmpeg
- 🔄 Multi-Paper Synthesis: Collective storylines and delta analysis
- Runtime: Node 20 + TypeScript
- HTTP: Fastify + Zod validation
- PDF: pdf-parse
- Embeddings: OpenAI / Voyage / Jina (configurable)
- LLM: Anthropic Claude 3.5 Sonnet / Groq Llama-3.1 70B (configurable)
- TTS: ElevenLabs
- Vector Store: In-memory + JSON persistence
```bash
npm install
```

Copy `.env.example` to `.env` and add your API keys:

```bash
cp .env.example .env
```

Required keys depend on your provider choices:

- Embeddings: set one of `OPENAI_API_KEY`, `VOYAGE_API_KEY`, or `JINA_API_KEY`
- LLM: set `ANTHROPIC_API_KEY` or `GROQ_API_KEY`
- TTS (optional): set `ELEVENLABS_API_KEY`

Start the dev server:

```bash
npm run dev
```

The server runs on http://localhost:3001.

Build and run for production:

```bash
npm run build
npm start
```

Check that the server is up:

```bash
curl http://localhost:3001/health
```

Upload and process a PDF:
```bash
curl -X POST http://localhost:3001/ingest \
  -F "projectId=default" \
  -F "file=@/path/to/paper.pdf"
```

Response:

```json
{
  "paperId": "paper_abc123",
  "chunks": 42
}
```

Ask questions about a paper:
```bash
curl -X POST http://localhost:3001/chat \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperId": "paper_abc123",
    "messages": [
      {
        "role": "user",
        "content": "What is the main contribution of this paper?"
      }
    ],
    "topK": 8
  }'
```

Response:
```json
{
  "answer": "The main contribution is... [CIT:paper_abc123#5]",
  "citations": [
    { "paperId": "paper_abc123", "chunkIndex": 5 }
  ]
}
```

Create an audio summary:
```bash
curl -X POST http://localhost:3001/podcast \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperId": "paper_abc123",
    "duration": 180,
    "style": "explainer"
  }'
```

Response:
```json
{
  "url": "./data/audio/paper_abc123.mp3",
  "bytesLength": 524288
}
```

Create a video plan with chapters:
```bash
curl -X POST http://localhost:3001/video-script \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperId": "paper_abc123"
  }'
```

Response:
```json
{
  "title": "Understanding Neural Networks",
  "hook": "What if machines could learn like humans?",
  "chapters": [
    {
      "t": 0,
      "heading": "Introduction",
      "bulletPoints": ["Neural networks mimic brain structure"]
    }
  ],
  "outro": "The future of AI is here"
}
```

Render an MP4 slideshow video from a video script:
```bash
curl -X POST http://localhost:3001/generate-video \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperId": "paper_abc123",
    "videoScript": {
      "title": "Understanding Neural Networks",
      "hook": "What if machines could learn like humans?",
      "chapters": [
        {
          "t": 0,
          "heading": "Introduction",
          "bulletPoints": ["Neural networks mimic brain", "Inspired by human cognition"]
        }
      ],
      "outro": "Thanks for watching"
    },
    "duration": 10
  }'
```

Response:
```json
{
  "videoPath": "./data/video/paper_abc123.mp4",
  "duration": 21,
  "slides": 5
}
```

Generate collective insights across papers:
```bash
curl -X POST http://localhost:3001/synthesize \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperIds": ["paper_abc123", "paper_def456", "paper_ghi789"]
  }'
```

Response:
```json
{
  "folderId": "default",
  "storyline": "These papers collectively explore...",
  "deltas": "Paper 1 focuses on X, while Paper 2 improves Y...",
  "tableMarkdown": "| Paper | Method | Results | Delta |\n|-------|--------|---------|-------|..."
}
```

Here's a complete test sequence:
```bash
# 1. Ingest a paper
PAPER_ID=$(curl -X POST http://localhost:3001/ingest \
  -F "projectId=test" \
  -F "file=@paper.pdf" | jq -r '.paperId')
echo "Paper ID: $PAPER_ID"

# 2. Ask a question
curl -X POST http://localhost:3001/chat \
  -H "Content-Type: application/json" \
  -d "{
    \"projectId\": \"test\",
    \"paperId\": \"$PAPER_ID\",
    \"messages\": [{\"role\": \"user\", \"content\": \"What problem does this paper solve?\"}]
  }" | jq

# 3. Generate a podcast
curl -X POST http://localhost:3001/podcast \
  -H "Content-Type: application/json" \
  -d "{
    \"projectId\": \"test\",
    \"paperId\": \"$PAPER_ID\",
    \"duration\": 120
  }" | jq

# 4. Synthesize (using the same paper twice for demo purposes)
curl -X POST http://localhost:3001/synthesize \
  -H "Content-Type: application/json" \
  -d "{
    \"projectId\": \"test\",
    \"paperIds\": [\"$PAPER_ID\", \"$PAPER_ID\"]
  }" | jq
```

Project structure:

```
paperbrain/
├── src/
│   ├── server.ts          # Fastify bootstrap
│   ├── env.ts             # Environment validation
│   ├── types.ts           # TypeScript types
│   ├── pdf.ts             # PDF extraction
│   ├── chunk.ts           # Text chunking
│   ├── rag.ts             # RAG retrieval + MMR
│   ├── prompts.ts         # LLM prompt templates
│   ├── embed/             # Embedding providers
│   │   ├── index.ts
│   │   ├── openai.ts
│   │   ├── voyage.ts
│   │   └── jina.ts
│   ├── llm/               # LLM providers
│   │   ├── index.ts
│   │   ├── anthropic.ts
│   │   └── groq.ts
│   ├── store/             # Vector store
│   │   └── memory.ts
│   ├── routes/            # API routes
│   │   ├── ingest.ts
│   │   ├── chat.ts
│   │   ├── podcast.ts
│   │   ├── video-script.ts
│   │   └── synth.ts
│   └── utils/             # Utilities
│       ├── cosine.ts
│       ├── id.ts
│       └── logger.ts
├── data/                  # JSON store + audio files
├── package.json
├── tsconfig.json
├── .env.example
└── README.md
```
Papers and embeddings are stored as JSON files in `./data/`:

- `./data/{projectId}.json` - papers and chunks with embeddings
- `./data/audio/{paperId}.mp3` - generated podcast audio
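A minimal TypeScript sketch of typed access to this store, assuming the field names shown in the format example below (`loadProject` and the interface names are illustrative, not the actual `src/store/memory.ts` API):

```typescript
// Sketch: typed loader for the per-project JSON store.
// Field names mirror the documented format; nothing here is the real API.
import { existsSync, readFileSync } from "node:fs";

interface Paper {
  id: string;
  title: string;
  authors: string[];
  year: number;
}

interface Chunk {
  id: string;
  paperId: string;
  text: string;
  index: number;
  tokens: number;
  embedding: number[];
}

interface ProjectStore {
  papers: Paper[];
  chunks: Chunk[];
}

function loadProject(projectId: string, dataDir = "./data"): ProjectStore {
  const path = `${dataDir}/${projectId}.json`;
  // Missing file means the project has no ingested papers yet.
  if (!existsSync(path)) return { papers: [], chunks: [] };
  return JSON.parse(readFileSync(path, "utf8")) as ProjectStore;
}
```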
Format:

```json
{
  "papers": [
    {
      "id": "paper_abc123",
      "title": "...",
      "authors": ["..."],
      "year": 2024
    }
  ],
  "chunks": [
    {
      "id": "paper_abc123_chunk_0",
      "paperId": "paper_abc123",
      "text": "...",
      "index": 0,
      "tokens": 1200,
      "embedding": [0.1, 0.2, ...]
    }
  ]
}
```

Set `EMBEDDINGS_PROVIDER` in `.env`:
- `openai` - OpenAI `text-embedding-3-small` (1536 dims)
- `voyage` - Voyage `voyage-3-lite` (1024 dims)
- `jina` - Jina `jina-embeddings-v3` (1024 dims)
Set `LLM_PROVIDER` in `.env`:

- `anthropic` - Claude 3.5 Sonnet (recommended for synthesis)
- `groq` - Llama-3.1 70B (fast, good for Q&A)
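A sketch of how provider selection from these env vars might look (the function names and model identifier strings are illustrative, not the exact values in `src/embed/index.ts` or `src/llm/index.ts`):

```typescript
// Sketch: map EMBEDDINGS_PROVIDER / LLM_PROVIDER env values to provider
// metadata. Model id strings are illustrative, not exact API identifiers.
function pickEmbedder(provider: string = process.env.EMBEDDINGS_PROVIDER ?? "openai") {
  switch (provider) {
    case "openai": return { model: "text-embedding-3-small", dims: 1536 };
    case "voyage": return { model: "voyage-3-lite", dims: 1024 };
    case "jina":   return { model: "jina-embeddings-v3", dims: 1024 };
    default: throw new Error(`Unknown EMBEDDINGS_PROVIDER: ${provider}`);
  }
}

function pickLLM(provider: string = process.env.LLM_PROVIDER ?? "anthropic") {
  switch (provider) {
    case "anthropic": return { model: "claude-3-5-sonnet", note: "recommended for synthesis" };
    case "groq":      return { model: "llama-3.1-70b", note: "fast, good for Q&A" };
    default: throw new Error(`Unknown LLM_PROVIDER: ${provider}`);
  }
}
```

Failing fast on an unknown provider keeps misconfiguration visible at startup rather than surfacing as a confusing runtime error mid-request.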
Chunking:

- Target: 1200 tokens per chunk
- Overlap: 200 tokens
- Preserves paragraph boundaries
- Falls back to sentence splitting for oversized paragraphs
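The strategy above can be sketched roughly like this (whitespace word count stands in for real tokenization, and the sentence-splitting fallback is omitted; `chunkText` is illustrative, not the actual `src/chunk.ts` API):

```typescript
// Sketch: pack paragraphs up to a token target, carrying a trailing-word
// overlap into the next chunk. Token counting is a rough word-count
// approximation, not the real tokenizer.
function approxTokens(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

function chunkText(text: string, target = 1200, overlap = 200): string[] {
  const paras = text.split(/\n{2,}/).map(p => p.trim()).filter(Boolean);
  const chunks: string[] = [];
  let current: string[] = [];
  let size = 0;

  for (const para of paras) {
    const t = approxTokens(para);
    if (size + t > target && current.length > 0) {
      chunks.push(current.join("\n\n"));
      // Overlap: seed the next chunk with the tail of the previous one.
      const tail = current.join(" ").split(/\s+/).slice(-overlap).join(" ");
      current = [tail];
      size = approxTokens(tail);
    }
    current.push(para);
    size += t;
  }
  if (current.length) chunks.push(current.join("\n\n"));
  return chunks;
}
```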
RAG retrieval pipeline:

- Embed the query using the configured provider
- Cosine search for the top-K chunks (default K=8)
- MMR diversification down to final-K (default 5)
- Build context with inline citations (`[CIT:paperId#index]`)
- LLM generation with a system prompt enforcing citations
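The cosine-then-MMR steps can be sketched as below (a minimal illustration, not the actual `src/rag.ts` or `src/utils/cosine.ts` code; the `lambda` trade-off value is an assumption):

```typescript
// Sketch: cosine similarity plus Maximal Marginal Relevance (MMR).
// MMR rescores candidates as lambda * relevance - (1 - lambda) * redundancy,
// so near-duplicate chunks are demoted in favor of diverse ones.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

function mmr(query: number[], docs: number[][], topK = 8, finalK = 5, lambda = 0.7): number[] {
  // Stage 1: cosine search, keep the topK most similar candidates.
  const scored = docs
    .map((d, i) => ({ i, sim: cosine(query, d) }))
    .sort((x, y) => y.sim - x.sim)
    .slice(0, topK);
  // Stage 2: greedily pick finalK, penalizing similarity to already-picked docs.
  const selected: number[] = [];
  while (selected.length < finalK && scored.length > 0) {
    let bestIdx = 0, bestScore = -Infinity;
    for (let j = 0; j < scored.length; j++) {
      const c = scored[j];
      const redundancy = selected.length
        ? Math.max(...selected.map(s => cosine(docs[c.i], docs[s])))
        : 0;
      const score = lambda * c.sim - (1 - lambda) * redundancy;
      if (score > bestScore) { bestScore = score; bestIdx = j; }
    }
    selected.push(scored.splice(bestIdx, 1)[0].i);
  }
  return selected;
}
```

With a duplicate chunk in the candidate pool, plain cosine top-K would return it twice; MMR skips the copy and picks the next most useful chunk instead.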
Roadmap:

- Swap to Supabase + pgvector for production scale
- Add streaming responses for chat
- Support more PDF formats (OCR for scanned papers)
- Add paper metadata extraction from APIs (Semantic Scholar, arXiv)
- Implement caching for embeddings
- Add rate limiting and authentication
License: MIT
Built for hackathons. Keep it simple, keep it fast. 🚀