
fix(ollama): hide native reasoning-only output #45330

Merged
frankekn merged 3 commits into main from frank/fix-ollama-hide-native-reasoning
Mar 13, 2026

Conversation

@frankekn
Contributor

Description

Hide Ollama native thinking / reasoning output from final assistant replies by also removing the remaining streaming fallback path.

This replacement continues the work from #45317 and fixes the part that was still leaking native reasoning through the streaming aggregation path.

Original contributor: @xi7ang
Supersedes: #45317
Fixes: #45169

Root Cause

buildAssistantMessage() was changed to ignore thinking / reasoning, but createOllamaStreamFn() still accumulated those fields into fallbackContent and copied that fallback into finalResponse.message.content before building the final assistant message.
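
A minimal sketch of the leak described above, not the actual createOllamaStreamFn() code. The chunk shape and the two helper names (aggregateWithFallback, aggregateContentOnly) are hypothetical and exist only to contrast the old fallback accumulation with content-only accumulation:

```typescript
// Hypothetical, simplified chunk shape; the real stream types carry more fields.
interface OllamaChunk {
  message: { content?: string; thinking?: string; reasoning?: string };
  done: boolean;
}

// Old behaviour (sketch): reasoning text is collected as a fallback and
// promoted to visible text when no real content ever arrives.
function aggregateWithFallback(chunks: OllamaChunk[]): string {
  let accumulatedContent = "";
  let fallbackContent = "";
  let sawContent = false;
  for (const chunk of chunks) {
    const { content, thinking, reasoning } = chunk.message;
    if (content) {
      sawContent = true;
      accumulatedContent += content;
    } else if (!sawContent) {
      fallbackContent += thinking || reasoning || "";
    }
  }
  return accumulatedContent || fallbackContent;
}

// New behaviour (sketch): only message.content contributes to the final text.
function aggregateContentOnly(chunks: OllamaChunk[]): string {
  let accumulatedContent = "";
  for (const chunk of chunks) {
    accumulatedContent += chunk.message.content ?? "";
  }
  return accumulatedContent;
}

// A reasoning-only stream: no chunk carries message.content.
const reasoningOnly: OllamaChunk[] = [
  { message: { thinking: "Let me think" }, done: false },
  { message: { thinking: " about this." }, done: true },
];

const leaked = aggregateWithFallback(reasoningOnly); // reasoning becomes the reply
const hidden = aggregateContentOnly(reasoningOnly);  // reply stays empty
```

With a reasoning-only stream, the fallback variant returns the reasoning text as if it were the reply, while the content-only variant returns an empty string.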

Changes

  • keep final Ollama assistant text sourced only from message.content
  • remove the streaming fallbackContent path that promoted thinking / reasoning into final text
  • update ollama-stream tests to assert that reasoning-only output stays hidden

Testing

  • pnpm build
  • pnpm check
  • pnpm test

openclaw-barnacle bot added the agents (Agent runtime and tooling) label Mar 13, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR completes the fix for native Ollama reasoning leakage by removing the fallbackContent streaming path in createOllamaStreamFn() that still promoted thinking/reasoning chunks into the final assistant reply when message.content was empty. After the change, only message.content is ever used as the visible assistant text; reasoning fields are silently dropped at both the buildAssistantMessage layer and the streaming accumulation layer.

Key changes:

  • buildAssistantMessage(): fallback chain content || thinking || reasoning || "" replaced with content || ""
  • createOllamaStreamFn(): fallbackContent accumulator and sawContent flag removed; finalResponse.message.content is set to accumulatedContent (never fallbackContent)
  • Tests: "falls back to …" expectations updated to content: []; test descriptions accurately describe the new "drops … when no content" semantics
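
The fallback-chain change in the first bullet can be sketched as follows. This is an illustration under assumptions, not the real buildAssistantMessage(): the message and text-part shapes and the helper names (visibleTextBefore, visibleTextAfter, toContentParts) are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface OllamaMessage {
  content?: string;
  thinking?: string;
  reasoning?: string;
}
interface TextPart {
  type: "text";
  text: string;
}

// Old chain (sketch): reasoning fields stand in for missing content.
function visibleTextBefore(msg: OllamaMessage): string {
  return msg.content || msg.thinking || msg.reasoning || "";
}

// New chain (sketch): only content is ever visible.
function visibleTextAfter(msg: OllamaMessage): string {
  return msg.content || "";
}

// Empty text yields an empty parts array, matching the "content: []" test expectation.
function toContentParts(text: string): TextPart[] {
  return text ? [{ type: "text", text }] : [];
}

const reasoningOnlyMessage: OllamaMessage = { reasoning: "internal chain of thought" };
const partsBefore = toContentParts(visibleTextBefore(reasoningOnlyMessage)); // one text part
const partsAfter = toContentParts(visibleTextAfter(reasoningOnlyMessage));   // []
```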

The implementation is minimal, well-scoped, and fully covered by the updated test suite.

Confidence Score: 5/5

  • This PR is safe to merge — it removes a narrow, well-understood fallback code path and backs the change with targeted tests.
  • The change is small and focused: two code sites touched, clear root-cause description, and every affected behaviour has a corresponding updated or new test. No new dependencies, no interface changes, and the logic correctly handles all known streaming combinations (thinking-only, reasoning-only, mixed thinking+content, content following reasoning).
  • No files require special attention.

Last reviewed commit: af15e4f

frankekn merged commit 7778627 into main Mar 13, 2026
30 checks passed
frankekn deleted the frank/fix-ollama-hide-native-reasoning branch March 13, 2026 17:38
@frankekn
Contributor Author

Merged.

Landed commit: 7778627
Source head: af15e4f
Original fix commit carried forward: 9504281

This landed via replacement PR flow to preserve attribution while fixing the remaining Ollama streaming-path leak and updating the tests/changelog.

z-hao-wang pushed a commit to z-hao-wang/openclaw that referenced this pull request Mar 13, 2026
frankekn added a commit to xinhuagu/openclaw that referenced this pull request Mar 14, 2026
caicongyang pushed a commit to caicongyang/openclaw that referenced this pull request Mar 14, 2026
… messages

When using reasoning models (KiloCode, NVIDIA Nemotron, etc.), Telegram
was sending duplicate messages because thinking/reasoning content was being
used as fallback response content.

This fix imports stripThinkingTagsFromText and applies it to the fallback
logic in handleMessageEnd, ensuring reasoning tokens don't appear in the
final response. This generalizes the Ollama fix (openclaw#45330) to work for
ALL reasoning models.

Also fixes issue openclaw#45955: Clear session runtime model on config changes
so UI model switching takes effect immediately without requiring full
gateway restart.

Fixes: openclaw#45965
Fixes: openclaw#45955
ecochran76 pushed a commit to ecochran76/openclaw that referenced this pull request Mar 14, 2026
Labels

agents (Agent runtime and tooling), maintainer (Maintainer-authored PR), size: XS

Development

Successfully merging this pull request may close these issues.

[Bug]: the problem with the local LLM

1 participant