A real-time voice AI that lets you have a conversation with Winston Churchill about his writings. Ask him anything about painting, creativity, or the life of a statesman — he'll answer in his own words, in his own voice.
You open a web app, pick one of Churchill's essays, click a button, and start talking. Churchill responds out loud, in character, with his characteristic wit and eloquence. The conversation is fully voice-driven — no typing involved.
Under the hood, your voice goes through a three-stage AI pipeline:
Your voice → Deepgram (transcription) → Gemini (Churchill's brain) → ElevenLabs (Churchill's voice) → Your speakers
- Docker Desktop
- API keys for:
  - Deepgram: speech-to-text
  - Google Gemini: language model
  - ElevenLabs: text-to-speech, plus a cloned Churchill voice ID
- ngrok (optional, for sharing the URL externally)
git clone <repo-url>
cd churchill-ai
cp apps/backend/.env.example apps/backend/.env

Edit apps/backend/.env and fill in your API keys:
DEEPGRAM_API_KEY=...
GEMINI_API_KEY=...
ELEVENLABS_API_KEY=...
ELEVENLABS_VOICE_ID=...

docker compose -f infrastructure/docker-compose.yml up --build

Open http://localhost:3000. That's it.
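Before bringing the stack up, it can help to confirm that none of the four keys was left blank or at its `...` placeholder. The following is a minimal sketch of such a check, not part of the repository: the function name `missing_env_keys` is hypothetical, and it assumes the file uses plain `KEY=value` lines.

```python
# Hypothetical helper: report which required settings are still unset.
REQUIRED_KEYS = (
    "DEEPGRAM_API_KEY",
    "GEMINI_API_KEY",
    "ELEVENLABS_API_KEY",
    "ELEVENLABS_VOICE_ID",
)


def missing_env_keys(env_text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent, empty, or still set to '...'."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return [key for key in required if values.get(key, "") in ("", "...")]
```

Calling it with the contents of apps/backend/.env returns an empty list once every key has a real value.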
If you want to give someone else access to your local instance:
./share.sh

This opens two ngrok tunnels (frontend + backend) and prints a single shareable URL:
https://abc123.ngrok-free.app?ws=wss://xyz456.ngrok-free.app/ws/voice
Anyone with that link can use the app from their browser. The ?ws= param tells the frontend which backend to connect to.
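To illustrate how the `?ws=` parameter can be read, here is a small sketch of the parsing logic. The real frontend is TypeScript, so this is only an illustration in Python; the function name `ws_endpoint_from_share_url` and the scheme check are assumptions, not the app's actual behaviour.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def ws_endpoint_from_share_url(share_url: str) -> Optional[str]:
    """Return the backend WebSocket URL carried in the ?ws= parameter, if any."""
    query = parse_qs(urlsplit(share_url).query)
    candidates = query.get("ws", [])
    if not candidates:
        return None
    endpoint = candidates[0]
    # Assumption: only WebSocket schemes are accepted, so a malformed link
    # cannot point the client at an arbitrary HTTP endpoint.
    if urlsplit(endpoint).scheme not in ("ws", "wss"):
        return None
    return endpoint
```

For the example link above, this returns `wss://xyz456.ngrok-free.app/ws/voice`; for a URL without the parameter it returns `None`, at which point a client would presumably fall back to its default backend.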
Requirements: ngrok installed and authenticated (ngrok authtoken <your-token>)
If you need the app live for a demo without being at your keyboard:
caffeinate -is # prevents sleep even with lid closed (requires power + external display)

Or just lock your screen with Cmd + Ctrl + Q. Processes keep running while the screen is locked, but not after you log out.
churchill-ai/
├── apps/
│   ├── frontend/                         # Next.js 15 web app
│   │   └── src/
│   │       ├── app/
│   │       │   └── page.tsx              # Article selector
│   │       └── components/
│   │           └── VoiceAssistant.tsx    # WebSocket client + audio I/O
│   └── backend/                          # FastAPI server
│       └── src/
│           ├── main.py                   # WebSocket endpoint /ws/voice
│           ├── articles/                 # Article content
│           ├── config/
│           │   ├── prompts.py            # Churchill system prompt
│           │   └── settings.py           # Env vars + defaults
│           └── services/
│               └── assistant_factory.py  # Wires up the AI pipeline
├── packages/
│   └── ccai/                             # Internal voice assistant framework
│       └── ccai/core/
│           ├── speech_to_text/           # Deepgram STT
│           ├── llm/                      # Gemini LLM
│           ├── text_to_speech/           # ElevenLabs TTS
│           ├── brain/                    # Orchestrates STT → LLM → TTS
│           └── voice_assistant/          # Top-level assistant logic
├── infrastructure/
│   └── docker-compose.yml
├── share.sh                              # ngrok tunnel script
└── ARCHITECTURE.md                       # Diagrams and technical deep-dive
Each conversation turn follows this flow:
- Browser captures microphone audio at 16kHz and streams it as base64 JSON over WebSocket to the backend
- Deepgram receives the raw PCM audio and returns a transcript in real time, treating 200ms of silence as the signal that you have stopped speaking
- Gemini 2.5 Flash Lite receives the transcript plus the full conversation history and a system prompt that instructs it to respond as Churchill, grounded in the selected article
- ElevenLabs converts Gemini's response to PCM audio using a Churchill voice clone, streaming chunks back as they're generated
- Browser receives the audio chunks and plays them back seamlessly via the Web Audio API
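The first step, sending audio as base64 JSON, can be sketched as follows. This is an illustration only: the message fields (`type`, `sample_rate`, `data`) are hypothetical, and the sketch assumes 16-bit little-endian mono PCM, which the text does not state.

```python
import base64
import json
import struct

SAMPLE_RATE = 16_000  # from the flow above: audio is captured at 16kHz


def encode_audio_message(samples: list[int]) -> str:
    """Pack 16-bit little-endian PCM samples into a base64 JSON message."""
    pcm = struct.pack(f"<{len(samples)}h", *samples)
    return json.dumps({
        "type": "audio",             # hypothetical field name
        "sample_rate": SAMPLE_RATE,  # hypothetical field name
        "data": base64.b64encode(pcm).decode("ascii"),
    })


def decode_audio_message(message: str) -> bytes:
    """Recover the raw PCM bytes that would be forwarded to the STT stage."""
    payload = json.loads(message)
    return base64.b64decode(payload["data"])
```

Under these assumptions, the 200ms silence window corresponds to 3,200 samples at 16kHz, or 6,400 bytes of 16-bit PCM.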
The entire round-trip is streamed end-to-end — the response starts playing before Gemini has finished generating it.
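One way to picture this overlap is with chained async generators, where each stage consumes its input as soon as a piece arrives. The sketch below uses stand-in stages (`fake_llm`, `fake_tts`) rather than the real Gemini or ElevenLabs clients, and is not the `ccai` framework's actual implementation.

```python
import asyncio
from typing import AsyncIterator


async def fake_llm(transcript: str) -> AsyncIterator[str]:
    """Stand-in for the LLM stage: yields the reply one word at a time."""
    for word in f"You asked about {transcript}".split():
        await asyncio.sleep(0)  # yield control, as a network stream would
        yield word


async def fake_tts(tokens: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Stand-in for the TTS stage: emits one audio chunk per incoming token."""
    async for token in tokens:
        yield token.encode("utf-8")  # placeholder for synthesized PCM


async def run_turn(transcript: str) -> list[str]:
    """Run one conversation turn and record the order in which events occur."""
    events: list[str] = []

    async def traced_llm() -> AsyncIterator[str]:
        async for token in fake_llm(transcript):
            events.append(f"llm:{token}")
            yield token

    async for chunk in fake_tts(traced_llm()):
        events.append(f"play:{chunk.decode('utf-8')}")
    return events
```

Running `asyncio.run(run_turn("painting"))` produces interleaved `llm:` and `play:` events, so the first chunk is "played" well before the last token has been generated.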
- Create a new file in apps/backend/src/articles/:

      # apps/backend/src/articles/my_article.py
      MY_ARTICLE = {
          "id": "my-article-slug",
          "title": "My Article Title",
          "content": "Full text of the article...",
      }

- Register it in apps/backend/src/articles/__init__.py
- Add it to the ARTICLES array in apps/frontend/src/app/page.tsx
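The registration step depends on how `__init__.py` is organised, which is not shown here. As a rough sketch under that caveat, a registry keyed by article `id` might look like the following; `ARTICLES_BY_ID`, `register_article`, and `get_article` are hypothetical names, not the repository's actual API.

```python
# Hypothetical registry, as it might appear in articles/__init__.py.
MY_ARTICLE = {
    "id": "my-article-slug",
    "title": "My Article Title",
    "content": "Full text of the article...",
}

ARTICLES_BY_ID: dict[str, dict] = {}


def register_article(article: dict) -> None:
    """Add an article to the registry, rejecting incomplete or duplicate entries."""
    missing = {"id", "title", "content"} - article.keys()
    if missing:
        raise ValueError(f"article is missing fields: {sorted(missing)}")
    if article["id"] in ARTICLES_BY_ID:
        raise ValueError(f"duplicate article id: {article['id']}")
    ARTICLES_BY_ID[article["id"]] = article


def get_article(article_id: str) -> dict:
    """Look up the article whose text grounds the conversation."""
    return ARTICLES_BY_ID[article_id]


register_article(MY_ARTICLE)
```

Whatever the real layout is, the `id` used on the backend needs to match the one added to the frontend's ARTICLES array so that the selected article can be found.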
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React 18, TypeScript, Tailwind CSS |
| Backend | FastAPI, Python 3.11, uvicorn |
| STT | Deepgram (streaming, 16kHz PCM) |
| LLM | Google Gemini 2.5 Flash Lite |
| TTS | ElevenLabs (streaming PCM) |
| Infrastructure | Docker Compose |
| Tunneling | ngrok |
For diagrams and a deeper technical breakdown, see ARCHITECTURE.md.