Shadow Stories is a real-time shadow-puppet storyteller.
It watches a hand-shadow animal on camera, classifies the animal and its motion, generates a short story continuation with Gemini, then speaks it with ElevenLabs. Local ambient audio and animal SFX are included.
- Detects shadow silhouettes from webcam frames.
- Classifies supported animals (for example: snail, panther, moose).
- Detects simple motion states (still, walking ..., jumping).
- Feeds that context into a constrained narration prompt.
- Streams spoken narration with low-latency TTS.
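
The step that feeds the classification into a constrained narration prompt can be pictured as a small string-building function. The sketch below is an illustration only: the `Observation` dataclass, the `build_prompt` helper, and the prompt wording are all hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One classified frame (hypothetical structure, not the project's own)."""
    animal: str   # e.g. "snail"
    motion: str   # e.g. "still" or "jumping"


def build_prompt(obs: Observation, story_so_far: str, max_sentences: int = 2) -> str:
    """Assemble a narration prompt that is bounded in length and subject."""
    return (
        "You are narrating a shadow-puppet story for a live audience.\n"
        f"Current character: {obs.animal}. Current motion: {obs.motion}.\n"
        f"Story so far: {story_so_far or '(the story has not started yet)'}\n"
        f"Continue the story in at most {max_sentences} short sentences. "
        "Mention only the current character and do not add stage directions."
    )


if __name__ == "__main__":
    print(build_prompt(Observation("snail", "still"), ""))
```

Keeping the constraints (sentence limit, single character) in the prompt text is one simple way to keep each spoken turn short. The real prompt may be structured quite differently.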
- Python 3.11+
- Webcam
- ffplay on your PATH (from ffmpeg)
- Model weights file: HSPR_ConvNextLarge_Aug_CB.pt
  - Source model repo: https://github.com/Starscream-11813/HaSPeR
- API keys:
  - Gemini (GEMINI_API_KEY)
  - ElevenLabs (ELEVENLABS_API_KEY)
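
The requirements above can be checked before starting the app. The following is a minimal preflight sketch using only the standard library. The `preflight` function and its messages are hypothetical and not part of the project. It assumes the weights live at the path passed in and that both API keys are read from the process environment.

```python
import os
import shutil
import sys
from pathlib import Path

REQUIRED_KEYS = ("GEMINI_API_KEY", "ELEVENLABS_API_KEY")


def preflight(model_path, env=None, min_python=(3, 11)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ is required")
    # ffplay ships with ffmpeg and is used for audio playback.
    if shutil.which("ffplay") is None:
        problems.append("ffplay was not found on PATH (install ffmpeg)")
    if not Path(model_path).is_file():
        problems.append(f"model weights not found at {model_path}")
    for key in REQUIRED_KEYS:
        if not env.get(key):
            problems.append(f"{key} is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight(Path("models/HSPR_ConvNextLarge_Aug_CB.pt")):
        print("missing:", problem)
```

The webcam is not checked here because probing a camera requires a video library. This sketch only covers what can be verified without extra dependencies.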
- Clone and enter the repo.
- Create env file:
cp .env.example .env
- Fill required keys in .env:
GEMINI_API_KEY=...
ELEVENLABS_API_KEY=...
- Make model weights available (choose one):
  - Place file at ./models/HSPR_ConvNextLarge_Aug_CB.pt
  - or set SHADOW_MODEL_PATH=/absolute/path/to/HSPR_ConvNextLarge_Aug_CB.pt
  - Model source: https://github.com/Starscream-11813/HaSPeR
- Install dependencies.
With uv:
uv sync
With pip:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Main live loop (camera -> narration -> speech):
uv run python -m shadow_stories.live_stream_test

Add microphone input each turn:
uv run python -m shadow_stories.live_stream_test --interactive-voice

Show debug camera overlay window:
uv run python -m shadow_stories.live_stream_test --debug

Tune response cadence:
uv run python -m shadow_stories.live_stream_test --inference-interval 1.5 --min-obs 5

Stop:
- Terminal loop: Ctrl+C
- Debug camera window: q (also works via loop shutdown)

For quick prompt testing without the live camera loop:
uv run shadow-narrate --voice "the dragon roars" --shadow "wings spread wide"

Environment variables used by the app:
- GEMINI_API_KEY (required)
- GEMINI_MODEL (default: gemini-2.5-flash-lite)
- VOICE_STT_MODEL (default: gemini-2.5-flash)
- ELEVENLABS_API_KEY (required for speech)
- ELEVENLABS_VOICE_ID (default in code: 0lp4RIz96WD1RUtvEu3Q)
- SHADOW_MODEL_PATH (optional model path override)
- SHADOW_CAMERA_INDEX (default: 1)
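
As a rough illustration of how these variables and their documented defaults fit together, here is a small settings loader. The `Settings` class and `load_settings` function are hypothetical names, not the project's own. The sketch reads the process environment only, and how the app itself loads .env is not shown here. The fallback weights path is the ./models/ location described in the setup steps.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from dataclasses import dataclass
from pathlib import Path

DEFAULT_MODEL_PATH = Path("models/HSPR_ConvNextLarge_Aug_CB.pt")


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    gemini_model: str
    voice_stt_model: str
    elevenlabs_api_key: str | None   # None means speech output is unavailable
    elevenlabs_voice_id: str
    model_path: Path
    camera_index: int


def load_settings(env: Mapping[str, str] | None = None) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    if not env.get("GEMINI_API_KEY"):
        raise RuntimeError("GEMINI_API_KEY is required")
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        gemini_model=env.get("GEMINI_MODEL", "gemini-2.5-flash-lite"),
        voice_stt_model=env.get("VOICE_STT_MODEL", "gemini-2.5-flash"),
        elevenlabs_api_key=env.get("ELEVENLABS_API_KEY") or None,
        elevenlabs_voice_id=env.get("ELEVENLABS_VOICE_ID", "0lp4RIz96WD1RUtvEu3Q"),
        # An unset or empty SHADOW_MODEL_PATH falls back to the ./models/ location.
        model_path=Path(env.get("SHADOW_MODEL_PATH") or DEFAULT_MODEL_PATH),
        camera_index=int(env.get("SHADOW_CAMERA_INDEX", "1")),
    )


if __name__ == "__main__":
    print(load_settings({"GEMINI_API_KEY": "example-key"}))
```

Passing a plain dict instead of `os.environ` makes the defaults easy to test without touching the real environment.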
Runtime knobs (in shadow_stories/live_stream_test.py):
- --inference-interval
- --min-obs
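
For orientation, the flags that appear in this README could be declared with `argparse` roughly as follows. This is a sketch rather than the contents of live_stream_test.py: the default values, the help texts, and the exact meaning of each knob are assumptions. Only the flag names come from the commands shown above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; defaults here are placeholders, not the app's."""
    parser = argparse.ArgumentParser(prog="live_stream_test")
    parser.add_argument(
        "--inference-interval", type=float, default=1.0,
        help="assumed: seconds between narration turns",
    )
    parser.add_argument(
        "--min-obs", type=int, default=3,
        help="assumed: observations required before acting on a classification",
    )
    parser.add_argument("--interactive-voice", action="store_true",
                        help="add microphone input each turn")
    parser.add_argument("--debug", action="store_true",
                        help="show the debug camera overlay window")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--inference-interval", "1.5", "--min-obs", "5"])
    print(args.inference_interval, args.min_obs)
```

`argparse` maps `--inference-interval` to `args.inference_interval` and `--min-obs` to `args.min_obs`, replacing hyphens with underscores.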
- Camera not found:
  - Set SHADOW_CAMERA_INDEX in .env (try 0, then 1).
- No speech output:
  - Check ELEVENLABS_API_KEY.
  - Confirm ffplay is installed and on PATH.
- Model file error:
  - Verify SHADOW_MODEL_PATH or place weights under ./models/.
- Very noisy character switching:
  - Increase --min-obs and/or --inference-interval.
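
To see why a larger --min-obs can calm noisy character switching, consider a majority vote over the most recent observations. The sketch below is one plausible way such smoothing could work, under the assumption that --min-obs is the number of recent classifications considered. The `LabelSmoother` class is hypothetical and the app's actual logic may differ.

```python
from collections import Counter, deque


class LabelSmoother:
    """Report a label only when it holds a strict majority of the last min_obs frames."""

    def __init__(self, min_obs: int = 5):
        self.min_obs = min_obs
        self.window = deque(maxlen=min_obs)  # oldest observations drop off automatically
        self.current = None                  # nothing is reported until the window fills

    def update(self, label: str):
        self.window.append(label)
        if len(self.window) == self.min_obs:
            top, count = Counter(self.window).most_common(1)[0]
            # A tie or a weak plurality leaves the current label unchanged.
            if count > self.min_obs // 2:
                self.current = top
        return self.current


if __name__ == "__main__":
    smoother = LabelSmoother(min_obs=3)
    for frame_label in ["snail", "snail", "panther", "snail", "panther", "panther"]:
        print(frame_label, "->", smoother.update(frame_label))
```

With a window of 3, a single stray "panther" frame does not displace "snail". The label changes only once "panther" dominates the window, and a larger window needs more agreeing frames before it switches.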

Run tests:
uv run pytest -q

- Logs are written to logs/.
- Secrets and local artifacts are gitignored (.env, model files, caches, logs).
- Do not commit API keys.
