Telegram messages flow through Redis lists and Hatchet workflows: a receiver enqueues inbound jobs, a consumer starts the chat workflow, and a worker runs ai-chat-workflow (memory + LLM reply + outbound enqueue) plus memory-summarize-workflow every N completed turns when Airtable is configured.
Short-term memory lives in Redis (up to 10 user/assistant turns per Telegram user). Long-term memory is a rolling summary in Airtable (one row per TelegramID), with a Redis cache for reads on the hot path so normal replies do not hit Airtable. Optional OpenRouter powers chat and summarization when OPEN_ROUTER_API_KEY is set.
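The per-user short-term window described above can be sketched with Redis list operations. This is an illustrative sketch, not the project's actual code: the key pattern and helper names are assumptions, and `r` stands for any redis-py-compatible client.

```python
import json

MAX_TURNS = 10  # 10 user/assistant turns kept per Telegram user

def remember_turn(r, user_id, role, text):
    """Push one chat turn and trim to the newest MAX_TURNS user/assistant pairs."""
    key = f"chat:short:{user_id}"  # hypothetical key pattern
    r.lpush(key, json.dumps({"role": role, "text": text}))
    r.ltrim(key, 0, MAX_TURNS * 2 - 1)  # 10 turns = 20 list entries

def recall_window(r, user_id):
    """Return the window oldest-first, ready for prompt assembly."""
    key = f"chat:short:{user_id}"
    return [json.loads(item) for item in reversed(r.lrange(key, 0, -1))]
```

`LPUSH` + `LTRIM` keeps the list bounded on every write, so the hot path never needs a separate cleanup job.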
- Python 3.11+
- Redis (`REDIS_URL`)
- Hatchet (e.g. local stack via `docker-compose.yml`) and `HATCHET_CLIENT_TOKEN`
- For Telegram: `TELEGRAM_API_ID`, `TELEGRAM_API_HASH`, and separate session files for the receiver (`TELEGRAM_SESSION`, default `producer_session`) and sender (`TELEGRAM_SENDER_SESSION`, default `sender_session`); see `src/telegram/readme.md` and WHITELIST_USER below
- Optional: `AIRTABLE_PAT`, `AIRTABLE_BASE_ID`, and base table fields as in `src/core/airtable.py`
- Optional: `OPEN_ROUTER_API_KEY`, `OPENROUTER_CHAT_MODEL`, `OPENROUTER_SUMMARY_MODEL` (see `src/service/readme.md`)
Install the package from the repo root so `import src...` resolves (setuptools layout):

```bash
uv sync
# or: pip install -e .
```

Use separate terminals for each long-running process:

```bash
uv run python script.py --help
uv run python script.py receiver  # Telethon → inbound Redis queue
uv run python script.py sender    # outbound Redis queue → Telethon send_message
uv run python script.py consumer  # inbound queue → trigger Hatchet AIChatWorkflow
uv run python script.py worker    # executes ai-chat-workflow + memory-summarize-workflow
```

Copy `.env` from your secrets template and fill values before starting.
Edit `src/telegram/whilelist.py` and set the `WHITELIST_USER` list to the numeric Telegram user ids that may use the bot (strings or ints are fine, e.g. `"123456789"` or `123456789`).
| Configuration | Behavior |
|---|---|
| No valid ids (empty list, blank strings only, or invalid entries) | Whitelist is off: every user is accepted. A warning is logged at startup. |
| One or more valid ids | Receiver enqueues inbound messages only from listed users. Sender drops outbound jobs whose `user_id` or `chat_id` is not listed. |
Restart receiver, sender, and worker after changing the list so inbound filtering and outbound payloads stay consistent.
If you use the bundled Compose stack, the Hatchet dashboard is described in docker-compose.yml (default login is often [email protected] / Admin123!! — confirm against your running instance).
With Redis up, you can enqueue a fake inbound payload:

```bash
uv run python test.py
```

Module-level docs live under `src/*/readme.md`.
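A fake inbound payload of the kind `test.py` enqueues might look roughly like this. The queue name and field names are assumptions for illustration; check `test.py` and the consumer for the real shape.

```python
import json

INBOUND_QUEUE = "telegram:inbound"  # hypothetical queue name

def build_fake_payload(user_id, chat_id, text):
    """Serialize a minimal inbound job; real field names are defined by test.py."""
    return json.dumps({"user_id": user_id, "chat_id": chat_id, "text": text})

def enqueue_inbound(r, payload):
    """Push onto the inbound list; `r` is any redis-py-compatible client."""
    r.lpush(INBOUND_QUEUE, payload)
```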
Core Functionalities
- Asynchronous, Decoupled Architecture
  - Separates Telegram input/output from processing using Redis queues and Hatchet workflows (`receiver`, `consumer`, `worker`, `sender`), allowing the system to handle spikes without blocking.
- Two-Tier Memory Strategy
  - Short-term context: Redis maintains a per-user 10-turn FIFO rolling window.
  - Long-term memory: Airtable stores a durable row for each user.
- Batch Summarization for Scalability
  - Every N turns (per user), a background workflow summarizes the last 10 turns plus the prior summary into a concise (200–300 word) replacement summary. This prevents unbounded memory growth.
- Cache-Aside Long-Term Reads
  - The primary chat path reads only from fast Redis storage.
  - Airtable updates occur out-of-band; upon write, the corresponding Redis long-term key is refreshed.
- Provider-Agnostic LLM Integration
  - Supports OpenRouter-based chat and summarization with configurable models, robust retry and backoff strategies, and flexible parameter handling for different providers.
- Safety and Access Control
  - Uses whitelist gating for both inbound (received) and outbound (sent) messages, restricting bot access and preventing unintended responses.
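The cache-aside long-term read path above can be sketched like this. Key pattern and function names are illustrative assumptions; the real logic lives in `src/core`.

```python
LONG_TERM_KEY = "chat:summary:{uid}"  # hypothetical key pattern

def read_long_term(r, user_id):
    """Hot path: Redis only; an empty result just means no summary is cached yet."""
    return r.get(LONG_TERM_KEY.format(uid=user_id)) or ""

def write_long_term(r, airtable_upsert, user_id, summary):
    """Out-of-band write: persist the Airtable row, then refresh the Redis cache."""
    airtable_upsert(user_id, summary)  # durable store, one row per TelegramID
    r.set(LONG_TERM_KEY.format(uid=user_id), summary)
```

Because every Airtable write immediately refreshes the cache, normal replies never touch Airtable yet still see the latest summary.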
Assumptions (per user, per 10 messages):
- User message ≈ 50 input tokens
- Assistant reply ≈ 100 output tokens
- Short-term window = last 10 user messages ≈ 500 input tokens
- Long-term summary ≈ 300 input tokens
- Summarization runs once per 10 turns and produces a 300-token summary
| Action (per user, per 10 messages) | Input Token Use | Output Token Use |
|---|---|---|
| Chat replies (10×): prompt = user msg (50) + cached long-term summary (300) + short-term window (500) | 10 × (50 + 300 + 500) = 8,500 | 10 × 100 = 1,000 |
| Batch summarization (1×): prior summary (300) + last 10 turns transcript (500 user + 1,000 assistant) | 300 + (500 + 1,000) = 1,800 | 300 |
| Total | 10,300 | 1,300 |
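The table's arithmetic follows directly from the assumptions listed above and can be reproduced in a few lines:

```python
USER_TOKENS, REPLY_TOKENS = 50, 100
SHORT_TERM, LONG_TERM = 500, 300  # cached window and summary sizes from the assumptions

# Ten chat replies: each prompt = user msg + long-term summary + short-term window
chat_in = 10 * (USER_TOKENS + LONG_TERM + SHORT_TERM)  # 8,500
chat_out = 10 * REPLY_TOKENS                           # 1,000

# One summarization: prior summary + transcript of the last 10 turns
summ_in = LONG_TERM + (10 * USER_TOKENS + 10 * REPLY_TOKENS)  # 1,800
summ_out = 300

total_in, total_out = chat_in + summ_in, chat_out + summ_out
print(total_in, total_out)  # 10300 1300
```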