Pentamind
🌊 Five Minds, One Answer — Intelligent Multi-Model AI Orchestration
Inspiration • What It Does • How We Built It • Challenges • Accomplishments • What We Learned • What's Next
The Problem
AI models are not created equal. Each excels at different tasks:
- Claude is exceptional at coding
- DeepSeek R1 dominates mathematical reasoning
- Gemini handles massive documents with ease
- Perplexity provides real-time web research
- Mistral excels at text refinement
But users shouldn't need to know this. They shouldn't have to:
- Switch between 5 different AI apps
- Guess which model is best for their task
- Pay for multiple subscriptions
- Copy-paste between tools
The result? Suboptimal AI responses, wasted time, and frustrated users.
Inspiration
We were frustrated by the fragmented AI landscape. Every day, we found ourselves:
- Opening ChatGPT for general questions
- Switching to Claude for coding help
- Using Perplexity for research with citations
- Going to Wolfram Alpha for math problems
- Using different tools for summarization
Why can't one intelligent system route our requests to the best model automatically?
That's when Pentamind was born — a unified AI assistant that thinks with five specialized minds but speaks with one voice.
The name "Pentamind" comes from "Penta" (five) + "Mind" — representing the five specialized AI models working together as one cohesive intelligence.
What It Does
Pentamind is a desktop overlay application that intelligently routes your requests to the optimal AI model based on your task:
🎯 Five Specialized Models for Five Task Types
| Task | Model | Why This Model |
|---|---|---|
| 📝 Summarize | Gemini 2.5 Pro | 2M token context window for massive documents |
| 🔍 Research | Perplexity AI | Real-time web search with citations |
| 🧮 Solve | DeepSeek R1 | State-of-the-art mathematical reasoning |
| 💻 Code | Claude Sonnet 4 | Best-in-class code generation |
| ✍️ Rewrite | Mistral Small | Excellent text refinement and editing |
✨ Key Features
- 🖥️ Always-On Overlay — Floating window stays on top, ready when you need it
- 📄 Document Upload — Drag & drop PDFs, DOCX, TXT files
- 🎤 Voice Input — Speak your requests naturally
- 🔄 Smart Routing — Automatic model selection based on task type
- 📊 LaTeX Rendering — Beautiful math equations with KaTeX
- ⚡ Fast & Local — Native Tauri app, no browser needed
- 🌊 Gradient AI Powered — Enterprise-grade AI infrastructure via DigitalOcean
💡 Example Use Cases
"Summarize this 50-page research paper" → Gemini 2.5 Pro
"Find the latest news about AI regulations" → Perplexity
"Solve this differential equation step by step" → DeepSeek R1
"Write a Python function for binary search" → Claude Sonnet 4
"Make this email more professional" → Mistral Small
How We Built It
Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│ PENTAMIND DESKTOP APP │
│ (Tauri + React) │
├─────────────────────────────────────────────────────────────┤
│ 🖱️ Drag & Drop │ 🎤 Voice Input │ 📝 Text Input │
└────────────────────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ FASTAPI BACKEND │
│ (Python + LangGraph) │
├─────────────────────────────────────────────────────────────┤
│ 📋 Task Classification → 🎯 Model Selection → ✅ Execute │
└────────────────────────────┬────────────────────────────────┘
│
┌────────────────────┼────────────────────┐
▼ ▼ ▼
┌───────────────┐ ┌─────────────────┐ ┌────────────────┐
│ Gradient AI │ │ Google Cloud │ │ Perplexity │
│ Agentic │ │ (Gemini 2.5) │ │ Search API │
│ Cloud │ │ │ │ │
├───────────────┤ └─────────────────┘ └────────────────┘
│ • Claude │
│ • DeepSeek R1 │
│ • Mistral │
│ • Llama 3 │
└───────────────┘
Tech Stack
Frontend (Desktop App)
- Tauri v2 — Lightweight, secure native app framework
- React 19 — Modern UI with hooks
- TypeScript — Type-safe development
- Tailwind CSS v4 — Utility-first styling
- KaTeX — LaTeX math rendering
Backend (AI Orchestration)
- FastAPI — High-performance Python API
- LangGraph — Workflow orchestration for AI agents
- pypdf + python-docx — Document parsing
- OpenAI Whisper — Speech-to-text transcription
AI Models via Gradient AI Agentic Cloud
anthropic-claude-sonnet-4— Code generationdeepseek-r1-distill-llama-70b— Reasoning & mathmistral-small-3.1-24b-instruct— Text editingllama3.3-70b-instruct— Fallback & routingllama3-8b-instruct— Fast task classification
External APIs
- Gemini 2.5 Pro/Flash — Long-context summarization
- Perplexity AI — Real-time web research
Challenges We Ran Into
1. Tauri Window Dragging Issues
The transparent overlay window wouldn't drag properly after interacting with other apps. We solved this by implementing direct startDragging() API calls instead of relying on data-tauri-drag-region.
2. File Drag & Drop in WebView
Tauri's dragDropEnabled setting was intercepting browser drag events. Setting it to false allowed native HTML5 drag-and-drop to work correctly.
3. Model API Rate Limits
Different models have different rate limits and response times. We implemented intelligent fallback logic — if one model fails, the system automatically retries with an alternative.
4. LaTeX Rendering
AI models often return raw LaTeX (e.g., \frac{a}{b}). We integrated KaTeX to render beautiful math equations in the response view.
5. macOS Window Hiding
When hiding the overlay, clicking the dock icon wouldn't show it again. We had to handle the Reopen event in Rust to properly restore the window.
Accomplishments That We're Proud Of
🏆 Technical Achievements
- 5 AI models orchestrated seamlessly — Users never need to think about which model to use
- Sub-second task classification — Llama 3 8B classifies tasks in <500ms
- 2M token context support — Gemini 2.5 Pro handles massive documents
- Native desktop performance — 10MB app size, instant startup
🎨 User Experience
- One-click access — Always-on overlay, always ready
- Zero learning curve — Just type, speak, or drop a file
- Beautiful math rendering — LaTeX equations look professional
- Modern design — Clean, elegant, gradient-inspired theme
📊 By The Numbers
| Metric | Value |
|---|---|
| AI Models Integrated | 5+ |
| Supported File Types | 6 (PDF, DOCX, TXT, MD, JSON, CSV) |
| Max Context Length | 2,000,000 tokens |
| App Bundle Size | ~10 MB |
| API Response Time | <3 seconds average |
What We Learned
Technical Insights
- LangGraph is powerful — Orchestrating multi-model workflows becomes manageable with proper state machines
- Tauri v2 is production-ready — Native performance with web technologies, but requires understanding platform quirks
- Model specialization matters — Using the right model for the right task dramatically improves output quality
Product Insights
- Simplicity wins — Users don't want to think about which AI to use
- Context is king — Seamless document upload changes how people interact with AI
- Desktop > Web for assistants — Always-on overlays feel more integrated than browser tabs
Team Insights
- AI-assisted development is a multiplier — We built this entire project with AI pair programming
- Start with the UX — We designed the user flow before writing code
- Ship fast, iterate faster — The hackathon deadline forced focus on what matters
What's Next for Pentamind
🚀 Short-Term Roadmap
- [ ] Global keyboard shortcut — Summon Pentamind with
Cmd+Shift+P - [ ] Conversation history — Save and search past interactions
- [ ] Custom model preferences — Let users override default routing
- [ ] Clipboard integration — Automatically process copied text
🌟 Long-Term Vision
- [ ] Plugin system — Extend Pentamind with custom tools
- [ ] Team collaboration — Share prompts and responses
- [ ] Fine-tuned routing — Learn user preferences over time
- [ ] Mobile companion app — Sync across devices
- [ ] Enterprise deployment — Self-hosted option for organizations
💭 The Dream
We envision Pentamind as the universal AI interface — a single, intelligent assistant that:
- Knows when to search the web
- Knows when to analyze documents
- Knows when to write code
- Knows when to solve equations
- Knows when to refine text
One interface. Five minds. Infinite possibilities.
Quick Start
Prerequisites
- Node.js 18+
- Python 3.11+
- Rust (for Tauri)
- API Keys: DigitalOcean, Gemini, Perplexity
Installation
# Clone the repository
git clone https://github.com/bshiribaiev/pentamind.git
cd pentamind
# Backend setup
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Set environment variables
export MODEL_ACCESS_KEY="your-digitalocean-key"
export GEMINI_API_KEY="your-gemini-key"
export PERPLEXITY_API_KEY="your-perplexity-key"
# Start backend
./start.sh
# Frontend setup (new terminal)
cd frontend
npm install
npm run tauri dev
Team
Built with 💙 for the MLH x DigitalOcean Hackathon
Pentamind — Five Minds, One Answer
Built With
- css
- fastapi
- html
- langgraph
- python
- rust
- typescript
Log in or sign up for Devpost to join the conversation.