Serverless AWS karaoke platform — upload any song, get separated stems and synced lyrics.
- Upload an audio file (MP3, WAV, M4A, FLAC — max 10 min / 50MB)
- Separate into 4 stems (drums, bass, other, vocals) using Meta's Demucs
- Extract word-level synced lyrics using OpenAI's Whisper
- Serve stems + lyrics via presigned S3 URLs for playback with real-time mixing
Users toggle stems on/off: karaoke (vocals off), drum practice (drums off), bass practice (bass off), or isolate vocals.
| Layer | Technology |
|---|---|
| API | API Gateway (REST + WebSocket) |
| Compute | Lambda (Python 3.12) + ECS Fargate (Docker) |
| Orchestration | Step Functions |
| Auth | Cognito |
| Database | DynamoDB |
| Storage | S3 (presigned URLs) |
| ML | Demucs htdemucs_ft + Whisper base (v2: custom model via SageMaker) |
| IaC | SAM / CloudFormation (nested stacks) |
| Frontend | React / Next.js (planned) |
| Python | uv (3.12) |
Client --> API Gateway --> UploadRequest Lambda (presigned URL)
|
PUT to S3 Upload Bucket
|
ProcessUpload Lambda --> Step Functions
|
Demucs (Fargate) --> Whisper (Fargate)
|
Completion --> Cleanup --> Notify
|
S3 Output Bucket (stems + lyrics)
unplugd/
├── template.yaml # Root SAM template (orchestrates nested stacks)
├── templates/ # Nested CloudFormation templates
│ ├── api.yaml # REST API Gateway + routes
│ ├── auth.yaml # Cognito user pool & clients
│ ├── ecs.yaml # ECR + ECS Fargate task definitions
│ ├── monitoring.yaml # SQS DLQ + CloudWatch alarms
│ ├── storage.yaml # DynamoDB tables + S3 output bucket
│ ├── vpc.yaml # VPC, subnets, IGW, security groups
│ └── websocket.yaml # WebSocket API Gateway
├── functions/ # Lambda handlers
│ ├── shared/ # Shared utilities (Lambda layer)
│ ├── upload_request/ # POST /songs/upload-url
│ ├── process_upload/ # S3 trigger → validate → Step Functions
│ ├── ws_connect/ # WebSocket $connect (JWT auth)
│ ├── ws_disconnect/ # WebSocket $disconnect
│ ├── ws_default/ # WebSocket $default (ping/pong)
│ ├── send_progress/ # Push progress via WebSocket
│ ├── list_songs/ # GET /songs
│ ├── get_song/ # GET /songs/{songId}
│ ├── delete_song/ # DELETE /songs/{songId}
│ ├── get_presets/ # GET /presets
│ ├── completion/ # Step Functions: mark COMPLETED
│ ├── cleanup/ # Step Functions: delete upload
│ ├── notify/ # Step Functions: notify completion
│ └── failure_handler/ # Step Functions: mark FAILED
├── containers/ # Fargate Docker images
│ ├── shared/ # Shared progress reporting
│ ├── demucs/ # Source separation
│ └── whisper/ # Lyrics extraction
├── layers/common/ # Lambda layer (symlink → functions/shared)
├── statemachines/ # Step Functions ASL definitions
├── tests/ # 92 unit tests across 16 files
├── docs/
│ └── PROJECT.md # Full spec (API, schemas, formats, costs)
│ └── PROJECT_PLAN.md # Implementation roadmap with phase details and status
└── _reference/ # Legacy U-Net codebase (archived)
- Python 3.12
- uv — Python package manager
- AWS SAM CLI
- AWS account with credentials configured (
~/.aws/credentials) - Docker (for Fargate container builds)
# Install dependencies
uv sync
# Validate SAM templates
sam validate
# Build
sam build
# Deploy to dev
sam deploy --config-env dev
# Deploy to prod
sam deploy --config-env prod# Run unit tests (92 tests)
uv run pytest tests/unit/ -v
# Lint
uv run ruff check .
# Format check
uv run ruff format --check .
# Type check
uv run mypy functions/shared/
# Local Lambda testing
sam local invoke UploadRequestFunction -e events/upload_request.json
# Build Fargate containers
docker build --provenance=false --platform linux/amd64 \
-f containers/demucs/Dockerfile -t unplugd-demucs .
docker build --provenance=false --platform linux/amd64 \
-f containers/whisper/Dockerfile -t unplugd-whisper .Phases 0-5 complete — 92 unit tests passing.
| Phase | Description | Status |
|---|---|---|
| 0-4 | Scaffold, Storage/Auth, Upload API, WebSocket, Demucs Container | Done |
| 4.5 | Code audit refactoring | Done |
| 5 | Whisper Container (lyrics) | Done |
| 6 | Step Functions Orchestration | Pending |
| 7 | Song Library API | Pending |
| 8 | Web Frontend (React/Next.js) | Pending |
| 9-12 | CI/CD, SageMaker ML, Model Integration, Hardening | Pending |
- docs/PROJECT.md — Full specification (API reference, DB schemas, lyrics format, mixing presets, cost estimates)
- docs/PROJECT_PLAN.md — Implementation roadmap with phase details and status
MIT