Yes — there’s a clear path to reduce this a lot while keeping the same UX.
Based on your code + latest Convex docs:
- `singleGenerationProcessor.processGeneration` and `batchProcessor.processBatchItem` are both Node actions (`"use node"`), and Node actions are billed at 512MB runtime memory (limits).
- Both actions spend significant wall-time waiting on external work (Pollinations + retries), and in `fetchWithRetry` you currently sleep inside the action between retries (convex/lib/retry.ts).
- `singleGeneration.dispatchGeneration` wraps `processGeneration` via `ctx.runAction(...)` (convex/singleGeneration.ts), which the Convex docs call out as overhead-heavy unless you're crossing runtimes (actions best practices).
- Batch scheduling is aggressively pipelined (100ms + jitter) in `convex/batchGeneration.ts`, which can increase 429/5xx retry pressure and burn more GBh.
1. **Move retry backoff out of running actions**
   - Instead of `sleep(...)` in `fetchWithRetry`, schedule the next retry attempt via `ctx.scheduler.runAfter(...)` and exit.
   - Keeps persistence, keeps the reactive UX, and cuts billed "idle waiting" time.
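For #1, here's a minimal sketch of the reschedule-instead-of-sleep pattern. The `backoffMs` constants and the `retryLater` helper are my assumptions, not your existing code; the scheduler context is shown as a minimal structural type rather than the real Convex `ActionCtx`:

```typescript
// Pure backoff helper -- exponential with "equal jitter", capped.
function backoffMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.round(exp / 2 + Math.random() * (exp / 2));
}

// Minimal structural type standing in for the Convex action context.
type SchedulerCtx = {
  scheduler: { runAfter: (ms: number, fn: unknown, args: unknown) => Promise<void> };
};

// Instead of sleeping, schedule the next attempt and return immediately --
// the action exits, so no idle wall-time is billed while waiting.
async function retryLater(ctx: SchedulerCtx, fnRef: unknown, args: { attempt: number }) {
  await ctx.scheduler.runAfter(backoffMs(args.attempt), fnRef, {
    ...args,
    attempt: args.attempt + 1,
  });
}
```

The key property: each attempt is a short-lived action invocation, and the wait between attempts lives in the scheduler, not in your billed runtime.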
2. **Add a hard fetch timeout + retry by reschedule**
   - Abort slow provider calls (e.g. after 45-90s depending on image vs. video), record the attempt, and reschedule.
   - Prevents long-tail stuck calls from eating GBh.
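For #2, a sketch of the timeout wrapper using `AbortController` (the `fetchWithTimeout` name and the 60s default are assumptions; you'd tune the budget per media type):

```typescript
// Bound each provider call so a stuck request can't hold the action open.
async function fetchWithTimeout(
  url: string,
  init: RequestInit = {},
  timeoutMs = 60_000, // assumption: tune per media type, e.g. ~45s images / ~90s video
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { ...init, signal: controller.signal });
  } finally {
    clearTimeout(timer); // avoid leaking the timer on success or failure
  }
}
```

On abort, `fetch` rejects with an `AbortError`; catch it, record the attempt, and hand the retry to the scheduler as in #1.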
3. **Adaptive batch throttling**
   - Dynamically increase the inter-item delay when 429/5xx responses spike; decrease it when the provider is healthy.
   - Usually lowers retries and total GBh while keeping throughput stable.
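For #3, the core of the throttle can be a small pure function (AIMD-style). The `nextDelayMs` name and all constants here are assumptions to tune against your observed 429/5xx rates:

```typescript
// Widen the inter-item delay on provider pressure, decay gently when healthy.
function nextDelayMs(
  current: number,
  lastStatus: number,
  minMs = 100,   // assumption: your current base delay
  maxMs = 5_000, // assumption: ceiling so the batch still makes progress
): number {
  const throttled = lastStatus === 429 || lastStatus >= 500;
  // Multiplicative increase on pressure, ~10% decay on success.
  const next = throttled ? current * 2 : current * 0.9;
  return Math.min(maxMs, Math.max(minMs, Math.round(next)));
}
```

You'd thread the current delay through the batch state (e.g. a field on the batch document) so each scheduled item picks up where the last one left off.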
4. **Remove the extra action wrapper overhead in the single flow**
   - Current: client -> `dispatchGeneration` action -> `runAction(processGeneration)`.
   - Better: drop one layer (or switch back to a mutation-scheduled processor if acceptable).
   - A smaller win than #1/#2, but still worthwhile.
5. **Stop burning real generation compute in dev by default**
   - Your dev usage is huge too. Add a dev-only mock/placeholder path unless real generation is explicitly enabled.
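For #5, the guard can be as simple as this. The `REAL_GENERATION_IN_DEV` env var name and the placeholder shape are assumptions; adapt both to your config:

```typescript
// Only hit the real provider in production, or when explicitly opted in.
function shouldUseRealProvider(env: Record<string, string | undefined>): boolean {
  const isDev = env.NODE_ENV !== "production";
  return !isDev || env.REAL_GENERATION_IN_DEV === "true";
}

// Deterministic stub so the full UX flow (status updates, completion) still
// runs in dev without spending provider compute or GBh on real generations.
function placeholderResult(prompt: string) {
  return {
    url: `https://placehold.co/512x512?text=${encodeURIComponent(prompt)}`,
    mock: true,
  };
}
```

In the action you'd branch once at the top: if `shouldUseRealProvider(process.env)` is false, write `placeholderResult(...)` and finish immediately.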
- A mutation that records intent and schedules the background work is still the canonical pattern (scheduled functions, actions).
- The free tier has tight scheduled-function concurrency; if you need controlled parallelism/retries, `@convex-dev/workpool` is now a strong fit (workpool README).
Suggested rollout:
- Phase 1 (fast, biggest ROI): #1 + #2 + #5
- Phase 2: #3
- Phase 3: #4, or the workpool migration if needed
If you want, I can implement Phase 1 now directly in your codebase (retry rescheduling + timeout + dev guard) and keep your current UX contract intact.