Your personal research assistant that remembers everything you've ever read.
Docify is an open-source, local-first AI application that lets you upload any resource (PDFs, URLs, documents, images, code), ask questions about them, and receive cited, grounded answers—all while keeping your data completely private.
- 🔒 Privacy-First: All processing happens locally (embeddings, LLM, storage)
- 🧠 Smart Deduplication: Content-based fingerprinting prevents duplicate processing
- 📚 Multi-Format Support: PDF, URL, Word, Excel, Markdown, images (OCR), code, and more
- 💬 Cited Answers: Every response includes citations to source documents
- 🔍 Hybrid Search: Combines semantic (vector) and keyword (BM25) search
- 🤖 Local LLM: Runs Mistral 7B via Ollama (optional cloud LLM support)
- 🌐 Workspace Model: Personal, team, or hybrid collaboration
- 🚀 One-Command Setup: Docker Compose orchestration
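The deduplication feature rests on content-based fingerprinting: identical content hashes to the same key, so a re-upload is detected before any expensive parsing or embedding runs. As a minimal sketch of the idea (Docify's actual normalization rules may differ), hashing normalized text gives a stable fingerprint:

```python
import hashlib

def content_fingerprint(text: str) -> str:
    """Hash normalized content so the same document uploaded twice
    (with trivial whitespace/case differences) maps to one fingerprint."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Two uploads that differ only in formatting collide on purpose:
a = content_fingerprint("Retrieval-Augmented  Generation\n(RAG)")
b = content_fingerprint("retrieval-augmented generation (rag)")
assert a == b  # duplicate detected; the second upload can be skipped
```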
Docify's RAG pipeline integrates 11 core services:
- Resource Ingestion - Upload, parse, deduplicate
- Chunking - Semantic boundary preservation
- Embeddings (Async) - Vector generation via Celery
- Query Expansion - Better recall with variants
- Hybrid Search - Semantic + keyword (BM25)
- Re-Ranking - 5-factor scoring + conflict detection
- Context Assembly - Token budget management
- Prompt Engineering - Anti-hallucination prompts
- LLM Service - Ollama/OpenAI/Anthropic support
- Citation Verification - Verify claims against sources
- Message Generation - Full pipeline orchestration
See ARCHITECTURE.md for complete technical details.
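The hybrid search and re-ranking stages merge two independently ranked result lists, one from vector similarity and one from BM25. Docify's actual 5-factor scoring lives in the re-ranking service; purely as an illustration of how two rankings can be fused, here is reciprocal rank fusion (RRF), a common technique (the function, IDs, and constant below are illustrative, not Docify's code):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk IDs.

    A chunk scores 1/(k + rank + 1) per list it appears in; k dampens
    the influence of any single list's top positions."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["c3", "c1", "c7"]   # vector-similarity order
keyword  = ["c1", "c9", "c3"]   # BM25 order
print(reciprocal_rank_fusion([semantic, keyword]))  # → ['c1', 'c3', 'c9', 'c7']
```

Note how `c1` wins overall despite topping neither list: appearing high in both rankings beats appearing first in only one.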
- Docker & Docker Compose
- 8GB RAM minimum (16GB recommended)
- 20GB disk space (for models and data)
macOS / Linux:
```bash
# Clone the repository
git clone https://github.com/keshavashiya/docify.git
cd docify

# Run the setup script (handles everything!)
./scripts/setup.sh
```

Windows (PowerShell):
```powershell
# Clone the repository
git clone https://github.com/keshavashiya/docify.git
cd docify

# Run the setup script (handles everything!)
.\scripts\setup.ps1
```

That's it! The setup script will:
- ✅ Check prerequisites (Docker, memory, disk space)
- ✅ Create environment configuration
- ✅ Start all Docker services
- ✅ Initialize the database with pgvector
- ✅ Download AI models (~4GB, may take 10-15 min)
- ✅ Verify everything is working
Options:
```bash
# macOS / Linux
./scripts/setup.sh --skip-models   # Skip model download (faster setup)
./scripts/setup.sh --reset         # Reset everything and start fresh
./scripts/setup.sh --help          # Show all options
```

```powershell
# Windows (PowerShell)
.\scripts\setup.ps1 -SkipModels    # Skip model download
.\scripts\setup.ps1 -Reset         # Reset everything
.\scripts\setup.ps1 -Help          # Show all options
```

macOS / Linux:
```bash
./scripts/start.sh            # Start Docify (quick start for daily use)
./scripts/start.sh --logs     # Start and follow logs
./scripts/start.sh --stop     # Stop all services
./scripts/start.sh --status   # Show service status
```

Windows (PowerShell):
```powershell
.\scripts\start.ps1           # Start Docify
.\scripts\start.ps1 -Logs     # Start and follow logs
.\scripts\start.ps1 -Stop     # Stop all services
.\scripts\start.ps1 -Status   # Show service status
```

- Frontend: http://localhost:3000
- API Docs & Testing: http://localhost:8000/docs
- Health Endpoint: http://localhost:8000/api/health
```bash
# Check if all containers are running
docker-compose ps

# Test API health
curl http://localhost:8000/api/health

# Monitor system resources
docker stats docify-ollama docify-backend

# View logs
docker-compose logs -f backend
docker-compose logs -f celery-worker
```

📋 Manual Setup (Advanced Users)
If you prefer to run each step manually:
```bash
# Clone and enter directory
git clone https://github.com/keshavashiya/docify.git
cd docify

# Copy environment configuration
cp .env.example .env

# Start all services
docker-compose up -d --build

# Wait for services to be healthy (~2-3 minutes)
docker-compose ps

# Initialize database (one-time setup)
docker-compose exec postgres psql -U docify -d docify -c "CREATE EXTENSION IF NOT EXISTS vector"
docker-compose exec backend alembic upgrade head

# Download optimized models (one-time, ~4GB total)
docker-compose exec ollama ollama pull mistral:7b-instruct-q4_0
docker-compose exec ollama ollama pull all-minilm:22m

# Restart services with models loaded
docker-compose restart backend celery-worker
```

Backend development:

```bash
cd backend

# Create virtual environment
python3 -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start the development server (requires running docker-compose services)
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

Frontend development:

```bash
cd frontend

# Install dependencies
npm install

# Start the development server
npm run dev
```

Backend
- FastAPI (Python 3.10+)
- PostgreSQL 15+ with pgvector
- Celery + Redis (async tasks)
- Ollama (local LLM: mistral:7b-instruct-q4_0, all-minilm:22m)
- sentence-transformers (embeddings); optional OpenAI/Anthropic support
Frontend
- React 18+ with TypeScript
- Vite, Tailwind CSS
- React Query, Zustand
Infrastructure
- Docker & Docker Compose
- Alembic (database migrations)
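pgvector does the heavy lifting for semantic search: chunk embeddings live in a `vector` column and are ranked with the cosine-distance operator (`embedding <=> query`). Purely to illustrate what that operator computes (this is not Docify's code, and the table/column names in the SQL comment are hypothetical), the same ranking in plain Python:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """What pgvector's `<=>` operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

# SQL equivalent (hypothetical schema):
#   SELECT id FROM chunks ORDER BY embedding <=> %(query_vec)s LIMIT 5;
query = [1.0, 0.0]
chunks = {"c1": [0.9, 0.1], "c2": [0.0, 1.0], "c3": [0.7, 0.7]}
nearest = min(chunks, key=lambda cid: cosine_distance(query, chunks[cid]))
print(nearest)  # → c1
```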
Upload a resource:

```bash
curl -X POST "http://localhost:8000/api/resources/upload" \
  -F "file=@research_paper.pdf" \
  -F "workspace_id=<your-workspace-id>"
```

Search:

```bash
curl -X POST "http://localhost:8000/api/search" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is RAG?", "workspace_id": "<id>"}'
```

Send a message:

```bash
curl -X POST "http://localhost:8000/api/conversations/<id>/messages" \
  -H "Content-Type: application/json" \
  -d '{"content": "Explain the main findings", "role": "user"}'
```
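The same calls work from any HTTP client. As a small stdlib-only Python sketch of the search endpoint (the payload fields mirror the curl example; the helper name and structure are illustrative, not part of Docify):

```python
import json
import urllib.request

BASE = "http://localhost:8000"

def search_request(query: str, workspace_id: str) -> urllib.request.Request:
    """Build the POST /api/search request shown in the curl example."""
    body = json.dumps({"query": query, "workspace_id": workspace_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/api/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    req = search_request("What is RAG?", "<id>")
    with urllib.request.urlopen(req) as resp:  # requires Docify to be running
        print(json.load(resp))
```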
```bash
# Start all services
docker-compose up -d

# View logs (all services)
docker-compose logs -f

# View logs for a specific service
docker-compose logs -f backend
docker-compose logs -f celery-worker

# Stop all services
docker-compose down

# Stop and remove data (WARNING: deletes all data)
docker-compose down -v

# Restart a specific service
docker-compose restart backend
```

If you get "port already in use" errors:
```bash
# PostgreSQL: Docify uses 5433 (standard is 5432)
# Redis: Docify uses 6380 (standard is 6379)
# Backend: Docify uses 8000
# Frontend: Docify uses 3000
# Ollama: Docify uses 11434

# Check what's using a port (macOS/Linux)
lsof -i :8000

# Kill the process (if needed)
kill -9 <PID>
```

Use the built-in API documentation:
- Open http://localhost:8000/docs in your browser
- Try requests directly in Swagger UI
- All endpoints are documented with request/response schemas
Alternatively, use curl:
```bash
# Health check
curl http://localhost:8000/api/health

# List workspaces
curl http://localhost:8000/api/workspaces

# Create workspace
curl -X POST http://localhost:8000/api/workspaces \
  -H "Content-Type: application/json" \
  -d '{"name":"My Workspace","workspace_type":"personal"}'
```

MIT License - see LICENSE file for details
- Built with FastAPI
- Powered by Ollama
- Vector search by pgvector
- Embeddings by sentence-transformers
Made with ❤️ for researchers, students, and knowledge workers
