Send live captions to YouTube live streams using Google's official closed caption ingestion URL (HTTP POST).
To use this package, you first need to:
- Set up a live stream on YouTube, set it to a 30-second delay, and configure it to receive captions via HTTP POST requests.
- Start the live stream, then send the captions with the stream key (and in the correct sequence).
This is a monorepo containing the following packages:
| Package | Platform | Installation | Description |
|---|---|---|---|
| lcyt | npm | npm install lcyt | Core Node.js library |
| lcyt-cli | npm | npm install -g lcyt-cli | CLI tool |
| lcyt-backend | - | - | Node.js Express backend relay |
| lcyt-mcp-stdio | - | node packages/lcyt-mcp-stdio/src/server.js | MCP server (stdio transport) |
| lcyt-mcp-sse | - | node packages/lcyt-mcp-sse/src/server.js | MCP server (HTTP SSE transport) |
| lcyt | PyPI | pip install lcyt | Core Python library |
| lcyt-backend | pip | pip install lcyt-backend | Python/Flask backend relay (cPanel/Passenger) |
There are four conceptual levels in this package:
- CLI tool for simple use and core library for development
- MCP servers (local stdio and HTTP SSE) for the latest AI frenzy
- Backend and web tool for production use
- The project site with docs and a managed version of backend and web tool
The lcyt npm package no longer includes the CLI; it has moved to the separate lcyt-cli package. There are currently some bugs in the full-screen (-f) mode of lcyt-cli, so please use interactive mode (-i) instead.
npx lcyt-cli # Launch full-screen mode
npx lcyt-cli --stream-key "YOUR_KEY" # Set stream key
npx lcyt-cli "Hello, world!" # Send a caption
npx lcyt-cli -i # Interactive mode
npx lcyt-cli -f # Full-screen mode
npx lcyt-cli --heartbeat # Test connection

See the lcyt-cli package for full CLI documentation.
When running lcyt-cli under MCP or other protocol parsers, human-friendly log lines must not appear on stdout (they can break parsers). Use the --log-stderr flag or the LCYT_LOG_STDERR=1 environment variable to route human logs to stderr while keeping protocol messages on stdout:
# using flag
npx lcyt-cli --heartbeat --log-stderr --stream-key YOUR_KEY
# using environment variable
LCYT_LOG_STDERR=1 npx lcyt-cli --heartbeat --stream-key YOUR_KEY

Both ESM and CJS are provided.
npm install lcyt

import { YoutubeLiveCaptionSender } from 'lcyt';
const sender = new YoutubeLiveCaptionSender({
streamKey: 'YOUR_STREAM_KEY'
});
sender.start();
await sender.send('Hello, world!');
sender.end();

See the lcyt package for full API documentation.
pip install lcyt

from lcyt import YoutubeLiveCaptionSender
sender = YoutubeLiveCaptionSender(stream_key="YOUR_STREAM_KEY")
sender.start()
sender.send("Hello, world!")
sender.end()

See the Python package documentation for full API reference.
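If a send fails partway through, `end()` would be skipped in the snippet above. A minimal sketch of a wrapper that always calls it, relying only on the `start()`, `send()` and `end()` methods shown above; the helper name `send_all` is hypothetical and not part of lcyt:

```python
from typing import Iterable


def send_all(sender, captions: Iterable[str]) -> int:
    """Send each caption in order and always call end() afterwards.

    `sender` is any object with start(), send(text) and end() methods,
    such as the YoutubeLiveCaptionSender shown above. `send_all` is a
    hypothetical helper, not part of the lcyt package.
    """
    sent = 0
    sender.start()
    try:
        for text in captions:
            sender.send(text)
            sent += 1
    finally:
        # Runs even if send() raises, so end() is never skipped.
        sender.end()
    return sent
```

With the library installed, this could be called as `send_all(YoutubeLiveCaptionSender(stream_key="YOUR_STREAM_KEY"), ["Hello, world!"])`.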
This project uses npm workspaces. To get started:
npm install # Install all dependencies
npm test # Run tests across all packages
npm run build # Build CJS output for core library

- Set a stable JWT_SECRET: for production, set JWT_SECRET in your environment so session tokens remain valid across restarts. Example in docker-compose.yml:
environment:
- JWT_SECRET=replace-with-a-secure-random-value

- DB volume ownership: the SQLite file is stored in a named Docker volume (e.g. live-captions-yt_lcyt-db). If you see SqliteError: attempt to write a readonly database, ensure the volume is owned by the runtime user (typically the node user, UID 1000). One-off fix:
# on the host (alpine image used for chown)
docker run --rm -v live-captions-yt_lcyt-db:/data alpine chown -R 1000:1000 /data

- MCP SSE network exposure: the MCP SSE service is sensitive, so bind it to loopback on the host and reverse-proxy from nginx if you need external access. The included docker-compose.yml binds port 3001 as 127.0.0.1:3001:3001 by default, so it is not externally reachable. If you expose it, use a secure reverse proxy and firewall rules.
- Reconnection behavior: when the backend restarts, sessions persisted in SQLite are rehydrated without an active sender. When a client POSTs /live to re-register, the server issues a fresh JWT for the rehydrated session so the client can obtain a usable token and open the SSE stream.
- Optional: persist tokens: if you prefer tokens to survive restarts without client re-registration, modify the server to persist issued tokens and ensure JWT_SECRET is stable. The current default behaviour is to re-issue tokens on re-register.
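For the JWT_SECRET item above, one way to produce a secure random value is Python's standard `secrets` module. This is only a sketch; the function name `generate_jwt_secret` and the 32-byte length are assumptions, not project requirements:

```python
import secrets


def generate_jwt_secret(num_bytes: int = 32) -> str:
    """Return a random hex string that can be used as a JWT_SECRET value.

    Hypothetical helper; the 32-byte default is an assumption.
    """
    # token_hex draws from the OS CSPRNG; 32 bytes gives 64 hex characters.
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print a line that can be pasted into an environment file.
    print(f"JWT_SECRET={generate_jwt_secret()}")
```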
LCYT implements Google's official YouTube Live caption format:
- Method: POST
- Content-Type: text/plain
- URL params: cid=<stream_key>&seq=N
Note: Do not try to add a charset to the Content-Type!
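The request shape described above can be sketched with the Python standard library alone. The function name `build_caption_request` and the placeholder URL are hypothetical, and encoding the body as UTF-8 is an assumption; use the ingestion URL shown in YouTube Studio. The request is only built here, not sent:

```python
import urllib.parse
import urllib.request

# Placeholder only: replace with the ingestion URL from YouTube Studio.
EXAMPLE_INGESTION_URL = "https://example.invalid/closedcaption"


def build_caption_request(
    base_url: str, stream_key: str, seq: int, body: str
) -> urllib.request.Request:
    """Build (but do not send) a caption POST request.

    Hypothetical helper, not part of lcyt.
    """
    query = urllib.parse.urlencode({"cid": stream_key, "seq": seq})
    # Append to an existing query string if the base URL already has one.
    separator = "&" if "?" in base_url else "?"
    return urllib.request.Request(
        url=f"{base_url}{separator}{query}",
        data=body.encode("utf-8"),  # assumption: body is sent as UTF-8
        # Plain "text/plain" with no charset parameter.
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
```

The caller is responsible for supplying a new `seq` value for each request so that captions arrive in the correct sequence.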
Timestamps use the format:
YYYY-MM-DDTHH:MM:SS.mmm
- No trailing Z, no UTC offset; millisecond precision
- Example: 2024-01-15T12:00:00.000
- Must be within 60 seconds of the server's current time
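A timestamp in this format can be produced with the standard `datetime` module. This is a minimal sketch, assuming timestamps are expressed in UTC (the format carries no offset, so this is an assumption); `format_caption_timestamp` is a hypothetical name, not an lcyt function:

```python
from datetime import datetime, timezone
from typing import Optional


def format_caption_timestamp(moment: Optional[datetime] = None) -> str:
    """Format a datetime as YYYY-MM-DDTHH:MM:SS.mmm with no Z or offset.

    Assumption: timestamps are in UTC. Naive datetimes are treated as
    already being in UTC; aware datetimes are converted to UTC first.
    """
    if moment is None:
        moment = datetime.now(timezone.utc)
    elif moment.tzinfo is not None:
        moment = moment.astimezone(timezone.utc)
    # Truncate microseconds to milliseconds and zero-pad to three digits.
    millis = moment.microsecond // 1000
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}"
```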
The libraries accept multiple timestamp forms — see the lcyt package docs (Node.js) and Python package docs for the full list including Date/datetime objects, epoch numbers, and relative second offsets.
A region/cue identifier may follow the timestamp on the same line. It is optional. Its format is region:reg1#cue1. The effects of regions and cues are not well documented and have not been tested. There is some indication that cues mark possible places for advertisement breaks.
YYYY-MM-DDTHH:MM:SS.mmm region:reg1#cue1
CAPTION TEXT
YYYY-MM-DDTHH:MM:SS.mmm
ANOTHER CAPTION
Important Requirements:
- Timestamps must be within 60 seconds of the server's current time
- Body must end with a trailing newline (\n)
- Region/cue identifier after timestamp is optional.
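The body layout and the trailing-newline requirement can be illustrated with a small builder. This is a sketch only; the `Caption` tuple shape and the name `build_caption_body` are hypothetical and do not reflect how the lcyt libraries represent captions:

```python
from typing import Iterable, Optional, Tuple

# (timestamp, text, optional region/cue identifier such as "region:reg1#cue1")
Caption = Tuple[str, str, Optional[str]]


def build_caption_body(captions: Iterable[Caption]) -> str:
    """Join captions into a request body.

    For at least one caption, the result ends with a trailing newline.
    Hypothetical helper, not part of lcyt.
    """
    lines = []
    for timestamp, text, region in captions:
        # The region/cue identifier is optional and shares the timestamp line.
        header = f"{timestamp} {region}" if region else timestamp
        lines.append(header)
        lines.append(text)
    # Every line, including the last one, is terminated by "\n".
    return "".join(line + "\n" for line in lines)
```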
Note on numeric epoch values: The Node.js library treats numbers >= 1000 as milliseconds (Date.now() convention); the Python library treats them as seconds (time.time() convention).
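When the same epoch value is passed between Node.js and Python code, it needs converting between the two conventions. A minimal sketch, with hypothetical helper names that are not part of either library:

```python
def epoch_ms_to_seconds(epoch_ms: float) -> float:
    """Convert a Date.now()-style millisecond epoch to time.time()-style seconds."""
    return epoch_ms / 1000.0


def epoch_seconds_to_ms(epoch_seconds: float) -> int:
    """Convert time.time()-style seconds to a Date.now()-style millisecond epoch."""
    # Round to the nearest whole millisecond.
    return int(round(epoch_seconds * 1000))
```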
To get your YouTube Live caption ingestion URL and key:
- Go to YouTube Studio
- Click Create → Schedule a broadcast
- Set up your stream settings
- Set a 30-second delay for the broadcast (important!)
- Look for Closed captions in settings, enable it
- Enable POST captions to URL for closed captions
- Copy the ingestion URL (usually stable, default in the library) and stream key
YouTube already has English transcription for live videos, so why bother? Well, there are other languages as well, and you might want to have another model do the transcription! This project was initially started to serve as an accessibility feature for the [Evangelical Lutheran Church of Finland](https://evl.fi), but it was built from the beginning as a general tool for anyone to use.
You are welcome to contribute by opening issues and contributing code. Just fork the repository and open a pull request when your feature is ready.
Juha Itäleino (@jsilvanus)
MIT