fix(agent): isolate last-turn total in token usage reporting #18052
steipete merged 1 commit into openclaw:main
Pull request overview
This PR fixes a bug where session_status incorrectly reports 100% context usage after multi-turn tool-use runs with the embedded provider flow. The root cause is that recordAssistantUsage accumulates cacheRead across all API turns, causing the calculated total to exceed contextTokens and get clamped to the maximum value (e.g., 200,000 tokens), even when actual usage is much lower.
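The inflation described above can be illustrated with a small sketch. The `TurnUsage` shape, the helper names, and all token counts here are assumptions made for illustration; they are not taken from `run.ts`. The point is only that summing `cacheRead` over every turn can push the total past the context window, so the clamped value reads as 100% even though the final call used far less.

```typescript
// Hypothetical per-call usage shape; the real types in run.ts may differ.
interface TurnUsage {
  input: number;
  output: number;
  cacheRead: number;
}

// Sum every field across all API turns, the way an accumulator would.
function accumulateUsage(turns: TurnUsage[]): TurnUsage {
  return turns.reduce(
    (acc, t) => ({
      input: acc.input + t.input,
      output: acc.output + t.output,
      cacheRead: acc.cacheRead + t.cacheRead,
    }),
    { input: 0, output: 0, cacheRead: 0 },
  );
}

function totalOf(u: TurnUsage): number {
  return u.input + u.output + u.cacheRead;
}

// Clamp a token total to the context window and express it as a percentage.
function contextPercent(total: number, contextTokens: number): number {
  return (Math.min(total, contextTokens) * 100) / contextTokens;
}

// Illustrative numbers: three tool-use turns, each re-reading a growing cached prefix.
const finalTurn: TurnUsage = { input: 1000, output: 500, cacheRead: 80000 };
const exampleTurns: TurnUsage[] = [
  { input: 1000, output: 500, cacheRead: 60000 },
  { input: 1000, output: 500, cacheRead: 70000 },
  finalTurn,
];
const contextWindow = 200000;

const accumulatedTotal = totalOf(accumulateUsage(exampleTurns)); // 214500
const lastTurnOnly = totalOf(finalTurn); // 81500

const inflatedPercent = contextPercent(accumulatedTotal, contextWindow); // 100 (clamped)
const actualPercent = contextPercent(lastTurnOnly, contextWindow); // 40.75
```

With these made-up numbers the accumulated total (214,500) exceeds the 200,000-token window and is clamped, while the last call alone occupies about 41% of it.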
Changes:
- Capture `lastTurnTotal` from the most recent model call response in `runEmbeddedPiAgent`
- Replace accumulated `usage.total` with `lastTurnTotal` before passing to downstream consumers
- Add comprehensive test verifying correct token total reporting in multi-turn scenarios
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/agents/pi-embedded-runner/usage-reporting.test.ts` | New test file verifying that `usage.total` reflects the current turn total (200) instead of the accumulated total (350) in multi-turn scenarios |
| `src/agents/pi-embedded-runner/run.ts` | Captures `lastTurnTotal` from the last assistant/attempt usage (line 497) and replaces `usage.total` with it after normalization (lines 897-899) to fix context utilization reporting |
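
The 200-versus-350 expectation in the test row above can be reproduced with a toy stand-in. This is a sketch only: `summarizeRun` and `ReportedUsage` are hypothetical names, and the split into per-turn totals of 150 and 200 is an assumption chosen so that the sums match the figures quoted in the table. It does not use the real runner or its shared mocks.

```typescript
// Hypothetical stand-in for the reporting path; not the real runEmbeddedPiAgent.
interface ReportedUsage {
  accumulatedTotal: number; // sum over every model call in the run
  total: number; // value that context-utilization consumers should read
}

function summarizeRun(perTurnTotals: number[]): ReportedUsage {
  const accumulatedTotal = perTurnTotals.reduce((sum, t) => sum + t, 0);
  // Report only the most recent call's total; fall back to 0 for an empty run.
  const total = perTurnTotals[perTurnTotals.length - 1] ?? 0;
  return { accumulatedTotal, total };
}

// Assumed scenario: a first call totalling 150 tokens, then a second totalling 200.
const twoTurnRun = summarizeRun([150, 200]);
// twoTurnRun.accumulatedTotal is 350, twoTurnRun.total is 200
```

A real test would drive the runner through mocked model calls; the arithmetic it asserts is the same.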

    @@ -893,6 +894,9 @@ export async function runEmbeddedPiAgent(
      }

      const usage = toNormalizedUsage(usageAccumulator);

Consider adding a brief comment explaining why usage.total is being replaced with lastTurnTotal. This would help future maintainers understand that the accumulated total from toNormalizedUsage includes multi-turn cache reads and needs to be replaced with the actual last-turn total to avoid incorrect 100% context utilization reporting. The existing comment at lines 494-496 explains the related concept for lastRunPromptUsage, but this specific replacement could benefit from its own explanation.

Suggested change:

    - const usage = toNormalizedUsage(usageAccumulator);
    + const usage = toNormalizedUsage(usageAccumulator);
    + // `toNormalizedUsage` accumulates totals across multiple turns and cache reads,
    + // which can overstate the last turn's context size (and show 100% utilization).
    + // Replace the accumulated `usage.total` with the actual last-turn total so that
    + // context-window stats reflect only the final call rather than all prior calls.

…w#17016)

`recordAssistantUsage` accumulated `cacheRead` across the entire multi-turn run, and `totalTokens` was clamped to `contextTokens`. This caused `session_status` to report 100% context usage regardless of actual load.

Changes:
- `run.ts`: capture `lastTurnTotal` from the most recent model call and inject it into the normalized usage before it reaches `agentMeta`.
- `usage-reporting.test.ts`: verify `usage.total` reflects current turn, not accumulated total.

Fixes openclaw#17016
Summary
Fix `session_status` incorrectly reporting 100% context usage by isolating the current turn total from accumulated multi-turn usage.

Fixes #17016

Root Cause
`recordAssistantUsage` accumulates `cacheRead` across the entire multi-turn run. `toNormalizedUsage` then produces a `usage.total` that reflects all accumulated tokens, which gets clamped to `contextTokens`. Result: every session appears at 100% context utilization.

Fix
In `runEmbeddedPiAgent` (`src/agents/pi-embedded-runner/run.ts`):
- Capture `lastTurnTotal` from the most recent model call response
- After `toNormalizedUsage`, replace `usage.total` with `lastTurnTotal` so downstream consumers (status, UI) see the actual current-turn token count

Tests
`usage-reporting.test.ts`: verifies `usage.total` reflects current turn total, not accumulated.

Sign-Off
Greptile Summary
Fixes incorrect 100% context utilization reporting in `session_status` by isolating the last-turn token total from accumulated multi-turn usage. The fix captures `lastTurnTotal` from the most recent model call and overrides `usage.total` (which was inflated by accumulated `cacheRead`/output tokens) after `toNormalizedUsage` normalization.

- `run.ts`: Captures `lastTurnTotal` from `lastAssistantUsage?.total ?? attemptUsage?.total` (line 497), then replaces `usage.total` with it after normalization (lines 897-899). This complements the existing `lastCallUsage` mechanism that downstream consumers (`persistSessionUsageUpdate`) already prefer for context-window calculations.
- `usage-reporting.test.ts`: New test verifying `usage.total` reflects the last turn's total (200) rather than the accumulated total (350) in a simulated multi-turn scenario. Reuses shared mocks from `run.overflow-compaction.mocks.shared.js`.

Confidence Score: 4/5
- The `lastCallUsage` mechanism already provides the primary defense for context-window display, making this a defense-in-depth improvement. There is no risk of breaking existing functionality, since `usage.total` is only overridden when `lastTurnTotal` is a positive number.
- The `run.ts` change is minimal and well-guarded.

Last reviewed commit: 1fa49bc
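
For readers who want to see the guarded override in isolation, here is a minimal sketch of the behaviour the review describes: take `lastAssistantUsage?.total ?? attemptUsage?.total`, and replace `usage.total` only when that value is a positive number. The `UsageLike` and `NormalizedUsage` shapes and the helper names `pickLastTurnTotal` and `withLastTurnTotal` are assumptions made for illustration. The actual code in `run.ts` may be structured differently, for example by mutating `usage` in place.

```typescript
// Hypothetical minimal shapes; the real types in run.ts may carry more fields.
interface UsageLike {
  total?: number;
}

interface NormalizedUsage {
  input: number;
  output: number;
  cacheRead: number;
  total: number;
}

// Prefer the last assistant message's total, else the attempt-level total.
function pickLastTurnTotal(
  lastAssistantUsage: UsageLike | undefined,
  attemptUsage: UsageLike | undefined,
): number | undefined {
  return lastAssistantUsage?.total ?? attemptUsage?.total;
}

// Override the accumulated total only when a positive last-turn total exists.
function withLastTurnTotal(
  usage: NormalizedUsage,
  lastTurnTotal: number | undefined,
): NormalizedUsage {
  if (typeof lastTurnTotal === "number" && lastTurnTotal > 0) {
    return { ...usage, total: lastTurnTotal };
  }
  return usage;
}

// Illustrative accumulated usage whose total (350) spans several calls.
const accumulated: NormalizedUsage = { input: 100, output: 50, cacheRead: 200, total: 350 };

const overridden = withLastTurnTotal(accumulated, pickLastTurnTotal({ total: 200 }, { total: 180 }));
const viaFallback = withLastTurnTotal(accumulated, pickLastTurnTotal(undefined, { total: 180 }));
const untouched = withLastTurnTotal(accumulated, pickLastTurnTotal(undefined, undefined));
// overridden.total is 200, viaFallback.total is 180, untouched.total stays 350
```

When neither source supplies a positive total, the sketch leaves the accumulated value untouched, which matches the review's note that the override cannot break existing behaviour.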