paperbrain

NotebookLM-style context-aware research assistant backend. Upload PDFs, chat with papers using RAG, generate podcast summaries, and synthesize insights across multiple papers.

Features

  • 📄 PDF Ingestion: Extract text, metadata, chunk, and embed papers
  • 💬 RAG Chat: Context-aware Q&A with citations
  • 🎙️ Podcast Generation: Audio summaries via ElevenLabs TTS
  • 🎬 Video Scripts: Chapter-based video plans
  • 🎥 Video Generation: Slideshow videos from scripts using ffmpeg
  • 🔄 Multi-Paper Synthesis: Collective storylines and delta analysis

Tech Stack

  • Runtime: Node 20 + TypeScript
  • HTTP: Fastify + Zod validation
  • PDF: pdf-parse
  • Embeddings: OpenAI / Voyage / Jina (configurable)
  • LLM: Anthropic Claude 3.5 Sonnet / Groq Llama 3.1 70B
  • TTS: ElevenLabs
  • Vector Store: In-memory + JSON persistence

Setup

1. Install Dependencies

npm install

2. Configure Environment

Copy .env.example to .env and add your API keys:

cp .env.example .env

Required keys depend on your provider choices:

  • Embeddings: Set one of OPENAI_API_KEY, VOYAGE_API_KEY, or JINA_API_KEY
  • LLM: Set ANTHROPIC_API_KEY or GROQ_API_KEY
  • TTS (optional): Set ELEVENLABS_API_KEY
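For example, a minimal .env using OpenAI embeddings and Claude might look like this (values are placeholders):

```env
EMBEDDINGS_PROVIDER=openai
OPENAI_API_KEY=sk-...
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
# Optional, only needed for /podcast:
# ELEVENLABS_API_KEY=...
```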

3. Run Development Server

npm run dev

Server runs on http://localhost:3001

4. Build for Production

npm run build
npm start

API Endpoints

Health Check

curl http://localhost:3001/health

1. Ingest PDF

Upload and process a PDF:

curl -X POST http://localhost:3001/ingest \
  -F "projectId=default" \
  -F "file=@/path/to/paper.pdf"

Response:

{
  "paperId": "paper_abc123",
  "chunks": 42
}

2. Chat with Paper

Ask questions about a paper:

curl -X POST http://localhost:3001/chat \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperId": "paper_abc123",
    "messages": [
      {
        "role": "user",
        "content": "What is the main contribution of this paper?"
      }
    ],
    "topK": 8
  }'

Response:

{
  "answer": "The main contribution is... [CIT:paper_abc123#5]",
  "citations": [
    { "paperId": "paper_abc123", "chunkIndex": 5 }
  ]
}
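Citations arrive both as structured objects and as inline [CIT:paperId#chunkIndex] markers in the answer text. If a client needs to recover them from the text alone, a small parser suffices; this is an illustrative sketch, not part of the API (the `parseCitations` helper is hypothetical):

```typescript
type Citation = { paperId: string; chunkIndex: number };

// Extract every [CIT:paperId#index] marker from an answer string.
function parseCitations(answer: string): Citation[] {
  const re = /\[CIT:([^#\]]+)#(\d+)\]/g;
  const out: Citation[] = [];
  for (const m of answer.matchAll(re)) {
    out.push({ paperId: m[1], chunkIndex: Number(m[2]) });
  }
  return out;
}
```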

3. Generate Podcast

Create an audio summary:

curl -X POST http://localhost:3001/podcast \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperId": "paper_abc123",
    "duration": 180,
    "style": "explainer"
  }'

Response:

{
  "url": "./data/audio/paper_abc123.mp3",
  "bytesLength": 524288
}

4. Generate Video Script

Create a video plan with chapters:

curl -X POST http://localhost:3001/video-script \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperId": "paper_abc123"
  }'

Response:

{
  "title": "Understanding Neural Networks",
  "hook": "What if machines could learn like humans?",
  "chapters": [
    {
      "t": 0,
      "heading": "Introduction",
      "bulletPoints": ["Neural networks mimic brain structure"]
    }
  ],
  "outro": "The future of AI is here"
}

5. Generate Video

Render an MP4 slideshow video from the video script (requires ffmpeg):

curl -X POST http://localhost:3001/generate-video \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperId": "paper_abc123",
    "videoScript": {
      "title": "Understanding Neural Networks",
      "hook": "What if machines could learn like humans?",
      "chapters": [
        {
          "t": 0,
          "heading": "Introduction",
          "bulletPoints": ["Neural networks mimic brain", "Inspired by human cognition"]
        }
      ],
      "outro": "Thanks for watching"
    },
    "duration": 10
  }'

Response:

{
  "videoPath": "./data/video/paper_abc123.mp4",
  "duration": 21,
  "slides": 5
}

6. Synthesize Multiple Papers

Generate collective insights:

curl -X POST http://localhost:3001/synthesize \
  -H "Content-Type: application/json" \
  -d '{
    "projectId": "default",
    "paperIds": ["paper_abc123", "paper_def456", "paper_ghi789"]
  }'

Response:

{
  "folderId": "default",
  "storyline": "These papers collectively explore...",
  "deltas": "Paper 1 focuses on X, while Paper 2 improves Y...",
  "tableMarkdown": "| Paper | Method | Results | Delta |\n|-------|--------|---------|-------|..."
}

Quick Test Flow

Here's a complete test sequence:

# 1. Ingest a paper
PAPER_ID=$(curl -X POST http://localhost:3001/ingest \
  -F "projectId=test" \
  -F "file=@/path/to/paper.pdf" | jq -r '.paperId')

echo "Paper ID: $PAPER_ID"

# 2. Ask a question
curl -X POST http://localhost:3001/chat \
  -H "Content-Type: application/json" \
  -d "{
    \"projectId\": \"test\",
    \"paperId\": \"$PAPER_ID\",
    \"messages\": [{\"role\": \"user\", \"content\": \"What problem does this paper solve?\"}]
  }" | jq

# 3. Generate podcast
curl -X POST http://localhost:3001/podcast \
  -H "Content-Type: application/json" \
  -d "{
    \"projectId\": \"test\",
    \"paperId\": \"$PAPER_ID\",
    \"duration\": 120
  }" | jq

# 4. Synthesize (using same paper twice for demo)
curl -X POST http://localhost:3001/synthesize \
  -H "Content-Type: application/json" \
  -d "{
    \"projectId\": \"test\",
    \"paperIds\": [\"$PAPER_ID\", \"$PAPER_ID\"]
  }" | jq

Project Structure

paperbrain/
├── src/
│   ├── server.ts           # Fastify bootstrap
│   ├── env.ts              # Environment validation
│   ├── types.ts            # TypeScript types
│   ├── pdf.ts              # PDF extraction
│   ├── chunk.ts            # Text chunking
│   ├── rag.ts              # RAG retrieval + MMR
│   ├── prompts.ts          # LLM prompt templates
│   ├── embed/              # Embedding providers
│   │   ├── index.ts
│   │   ├── openai.ts
│   │   ├── voyage.ts
│   │   └── jina.ts
│   ├── llm/                # LLM providers
│   │   ├── index.ts
│   │   ├── anthropic.ts
│   │   └── groq.ts
│   ├── store/              # Vector store
│   │   └── memory.ts
│   ├── routes/             # API routes
│   │   ├── ingest.ts
│   │   ├── chat.ts
│   │   ├── podcast.ts
│   │   ├── video-script.ts
│   │   └── synth.ts
│   └── utils/              # Utilities
│       ├── cosine.ts
│       ├── id.ts
│       └── logger.ts
├── data/                   # JSON store + audio/video files
├── package.json
├── tsconfig.json
├── .env.example
└── README.md

Data Storage

Papers and embeddings are stored as JSON files in ./data/, alongside generated media:

  • ./data/{projectId}.json - Papers and chunks with embeddings
  • ./data/audio/{paperId}.mp3 - Generated podcast audio
  • ./data/video/{paperId}.mp4 - Generated slideshow videos

Format:

{
  "papers": [
    {
      "id": "paper_abc123",
      "title": "...",
      "authors": ["..."],
      "year": 2024
    }
  ],
  "chunks": [
    {
      "id": "paper_abc123_chunk_0",
      "paperId": "paper_abc123",
      "text": "...",
      "index": 0,
      "tokens": 1200,
      "embedding": [0.1, 0.2, ...]
    }
  ]
}

Configuration

Embedding Providers

Set EMBEDDINGS_PROVIDER in .env:

  • openai - OpenAI text-embedding-3-small (1536 dims)
  • voyage - Voyage voyage-3-lite (1024 dims)
  • jina - Jina jina-embeddings-v3 (1024 dims)

LLM Providers

Set LLM_PROVIDER in .env:

  • anthropic - Claude 3.5 Sonnet (recommended for synthesis)
  • groq - Llama-3.1 70B (fast, good for Q&A)

Chunking Strategy

  • Target: 1200 tokens per chunk
  • Overlap: 200 tokens
  • Preserves paragraph boundaries
  • Falls back to sentence splitting for large paragraphs
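A minimal sketch of that packing logic, assuming word counts as a stand-in for tokens (the real chunk.ts uses proper token counts and the sentence-splitting fallback, both omitted here):

```typescript
// Greedily pack paragraphs into ~target-token chunks, carrying the
// last `overlap` tokens of each chunk into the next for continuity.
function chunkText(text: string, target = 1200, overlap = 200): string[] {
  const approxTokens = (s: string) => s.split(/\s+/).filter(Boolean).length;
  const paragraphs = text.split(/\n{2,}/).map(p => p.trim()).filter(Boolean);
  const chunks: string[] = [];
  let current = "";

  for (const para of paragraphs) {
    if (current && approxTokens(current) + approxTokens(para) > target) {
      chunks.push(current);
      // Seed the next chunk with the trailing overlap of this one.
      current = current.split(/\s+/).slice(-overlap).join(" ");
    }
    current = current ? current + "\n\n" + para : para;
  }
  if (current) chunks.push(current);
  return chunks;
}
```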

RAG Pipeline

  1. Embed query using configured provider
  2. Cosine search for top-K chunks (default K=8)
  3. MMR diversification to final-K (default 5)
  4. Build context with inline citations [CIT:paperId#index]
  5. LLM generation with system prompt enforcing citations
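Steps 2 and 3 reduce to a cosine ranking followed by greedy MMR re-ranking. A hedged sketch of both (the lambda = 0.7 trade-off weight is an assumption; the actual src/rag.ts and src/utils/cosine.ts may differ):

```typescript
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Maximal Marginal Relevance: greedily pick chunks relevant to the
// query but dissimilar to chunks already selected.
function mmr(query: number[], candidates: number[][], finalK = 5, lambda = 0.7): number[] {
  const selected: number[] = [];
  const remaining = candidates.map((_, i) => i);
  while (selected.length < finalK && remaining.length > 0) {
    let best = -1, bestScore = -Infinity;
    for (const i of remaining) {
      const relevance = cosine(query, candidates[i]);
      const redundancy = selected.length
        ? Math.max(...selected.map(j => cosine(candidates[i], candidates[j])))
        : 0;
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) { bestScore = score; best = i; }
    }
    selected.push(best);
    remaining.splice(remaining.indexOf(best), 1);
  }
  return selected; // indices into `candidates`, most relevant first
}
```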

Future Enhancements

  • Swap to Supabase + pgvector for production scale
  • Add streaming responses for chat
  • Support more PDF formats (OCR for scanned papers)
  • Add paper metadata extraction from APIs (Semantic Scholar, arXiv)
  • Implement caching for embeddings
  • Add rate limiting and authentication

License

MIT


Built for hackathons. Keep it simple, keep it fast. 🚀