fix(agents): drop Anthropic thinking blocks on replay by frankekn · Pull Request #44843 · openclaw/openclaw

frankekn · 2026-03-13T07:51:10Z

Summary

replacement for the closed PR #44429 (fix(agents): drop Anthropic thinking blocks on replay)
extend replay-time thinking block sanitization to native anthropic and amazon-bedrock Claude providers
update provider capability expectations and add Bedrock coverage
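
For readers unfamiliar with the term, replay-time sanitization can be sketched as below. This is a minimal illustration, not the code from this PR: the `ContentBlock` and `Message` types and the `dropThinkingBlocks` helper are hypothetical, and they assume that session history stores assistant content as an array of typed blocks, which is the shape the review's suggested test asserts on (`[{ type: "text", text: "hi" }]`).

```typescript
// Hypothetical shapes for illustration only; the real session types may differ.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Return a copy of the history with thinking blocks removed from assistant turns.
// User turns are passed through unchanged and the input array is not mutated.
function dropThinkingBlocks(history: Message[]): Message[] {
  return history.map((message) =>
    message.role === "assistant"
      ? {
          ...message,
          content: message.content.filter((block) => block.type !== "thinking"),
        }
      : message,
  );
}
```

An assistant turn that contained only thinking blocks ends up with empty content in this sketch; how the real sanitizer treats that case is not shown on this page.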

Credit

original implementation and primary fix logic by @jmcte
this replacement PR keeps the original author commit and adds maintainer follow-up test/changelog prep only

Testing

pnpm build
pnpm check
pnpm test

Fixes #44387

greptile-apps · 2026-03-13T07:59:27Z

Greptile Summary

This PR extends the existing thinking-block-drop mechanism (previously only for github-copilot) to native anthropic and amazon-bedrock Claude providers, fixing a bug where persisted thinking blocks in session history caused API rejections on follow-up turns. The change is minimal and well-targeted: two lines added to PROVIDER_CAPABILITIES, a comment update in two files, and corresponding test updates.

provider-capabilities.ts: Adds dropThinkingBlockModelHints: ["claude"] to anthropic and amazon-bedrock entries, consistent with the existing github-copilot entry.
transcript-policy.ts / attempt.ts: Comment-only updates to broaden the described scope from "Copilot/Claude" to "Anthropic Claude endpoints."
provider-capabilities.test.ts: Adds snapshot assertions for anthropic and amazon-bedrock capability objects verifying the new hints are present.
pi-embedded-runner.sanitize-session-history.test.ts: Replaces the previous negative test (asserting thinking blocks are not dropped for anthropic) with a positive test confirming they are dropped, using a new sanitizeAnthropicHistory helper. The helper supports a provider override but no test exercises amazon-bedrock at the session-sanitization level — a minor coverage gap given the PR claims Bedrock coverage. The existing shouldDropThinkingBlocksForModel assertion in provider-capabilities.test.ts is also only checked for github-copilot, not the newly updated providers.
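
The capability lookup described in the file list above could look roughly like the following sketch. Only the `dropThinkingBlockModelHints: ["claude"]` values and the `{ provider, modelId }` call shape come from this review; the `ProviderCapabilities` interface, the table layout, and the matching rule (a case-insensitive substring match) are assumptions made for illustration and may not match the actual `provider-capabilities.ts`.

```typescript
// Hypothetical capability table; only the hint values for these three
// providers are taken from the review summary.
interface ProviderCapabilities {
  dropThinkingBlockModelHints: string[];
}

const PROVIDER_CAPABILITIES: Record<string, ProviderCapabilities | undefined> = {
  "github-copilot": { dropThinkingBlockModelHints: ["claude"] },
  anthropic: { dropThinkingBlockModelHints: ["claude"] },
  "amazon-bedrock": { dropThinkingBlockModelHints: ["claude"] },
};

// Assumed matching rule: a provider drops thinking blocks when any of its
// hints appears (case-insensitively) as a substring of the model ID.
function shouldDropThinkingBlocksForModel(params: {
  provider: string;
  modelId: string;
}): boolean {
  const hints =
    PROVIDER_CAPABILITIES[params.provider]?.dropThinkingBlockModelHints ?? [];
  const modelId = params.modelId.toLowerCase();
  return hints.some((hint) => modelId.includes(hint.toLowerCase()));
}
```

Under this assumed rule, a Bedrock model ID such as `anthropic.claude-3-5-sonnet-20241022-v2:0` matches because it contains `claude`, while an unknown provider or a non-Claude model ID does not.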

Confidence Score: 4/5

This PR is safe to merge — the implementation is correct and consistent with the existing pattern, with only minor test coverage gaps.
The core logic change (adding dropThinkingBlockModelHints: ["claude"] to anthropic and amazon-bedrock providers) is straightforward, minimal, and mirrors the existing github-copilot entry exactly. No behavioral regressions were found. The score is 4 rather than 5 due to the absence of an amazon-bedrock end-to-end sanitization test and the lack of direct shouldDropThinkingBlocksForModel assertions for the newly covered providers — both of which the PR description implies are present.
src/agents/pi-embedded-runner.sanitize-session-history.test.ts — missing a Bedrock-provider test case to validate the claimed coverage.

Comments Outside Diff (1)

src/agents/provider-capabilities.test.ts, line 94-103 (link)

No direct shouldDropThinkingBlocksForModel assertion for new providers

The existing test only calls shouldDropThinkingBlocksForModel for "github-copilot". Now that "anthropic" and "amazon-bedrock" both set dropThinkingBlockModelHints: ["claude"], direct behavioral assertions would make the intent explicit and guard against future regressions:

```typescript
expect(
  shouldDropThinkingBlocksForModel({
    provider: "anthropic",
    modelId: "claude-opus-4-6",
  }),
).toBe(true);
expect(
  shouldDropThinkingBlocksForModel({
    provider: "amazon-bedrock",
    modelId: "anthropic.claude-3-5-sonnet-20241022-v2:0",
  }),
).toBe(true);
```

The indirect coverage via the resolveProviderCapabilities snapshot checks dropThinkingBlockModelHints is present, but a direct behavioral test is clearer.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner.sanitize-session-history.test.ts
Line: 777-786

Comment:
**Missing Bedrock-specific sanitization test**

The `sanitizeAnthropicHistory` helper was designed with an optional `provider` parameter precisely to allow testing both `"anthropic"` and `"amazon-bedrock"`, but no test actually passes `provider: "amazon-bedrock"`. The PR description says "add Bedrock coverage," but at the session-sanitization integration level, Bedrock is only covered by the capability-registry snapshot in `provider-capabilities.test.ts`.

Given that Bedrock uses a different `modelApi` (`"bedrock-converse-stream"` vs `"anthropic-messages"`), a dedicated test would confirm that thinking blocks are also dropped for that code path:

```typescript
it("drops assistant thinking blocks for amazon-bedrock replay", async () => {
  setNonGoogleModelApi();

  const messages = makeThinkingAndTextAssistantMessages();

  const result = await sanitizeAnthropicHistory({
    messages,
    provider: "amazon-bedrock",
  });

  const assistant = getAssistantMessage(result);
  expect(assistant.content).toEqual([{ type: "text", text: "hi" }]);
});
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: d47d5ae

frankekn · 2026-03-13T08:59:49Z

Landed. Thanks @jmcte for the original implementation and fix direction.

Landed in: 5ca0233
Original author commit preserved in replacement flow: d997a3a

@jmcte

* agents: drop Anthropic thinking blocks on replay
* fix: extend anthropic replay sanitization openclaw#44429 thanks @jmcte
* fix: extend anthropic replay sanitization openclaw#44843 thanks @jmcte
* test: add bedrock replay sanitization coverage openclaw#44843
* test: cover anthropic provider drop-thinking hints openclaw#44843

Co-authored-by: johnmteneyckjr <[email protected]>

jmcte and others added 2 commits March 13, 2026 15:39

agents: drop Anthropic thinking blocks on replay

d997a3a

fix: extend anthropic replay sanitization #44429 thanks @jmcte

8a883dd

openclaw-barnacle bot added the labels agents (Agent runtime and tooling), size: S, and maintainer (Maintainer-authored PR) on Mar 13, 2026

fix: extend anthropic replay sanitization #44843 thanks @jmcte

d47d5ae

greptile-apps bot reviewed Mar 13, 2026

frankekn added 3 commits March 13, 2026 16:04

test: add bedrock replay sanitization coverage #44843

d03c4c1

test: cover anthropic provider drop-thinking hints #44843

3fee946

Merge remote-tracking branch 'origin/main' into frank/44387-anthropic-thinking-replay-replacement (Conflicts: CHANGELOG.md)

1bb3651

frankekn merged commit 5ca0233 into main Mar 13, 2026
8 checks passed

frankekn deleted the frank/44387-anthropic-thinking-replay-replacement branch March 13, 2026 08:57

This was referenced Mar 13, 2026

📡 Upstream Digest — 2026-03-13 10:28 UTC curtismercier/openclaw-mods#251

Open

Upstream update: v2026.3.13 - 16 P0 + 29 P1 pending merge jiulingyun/openclaw-cn#510

Open

github-actions bot mentioned this pull request Mar 15, 2026

📢 OpenClaw new version released: v2026.3.13-1 xianyu110/awesome-openclaw-tutorial#32

Open

5 tasks

frankekn merged 6 commits into main from frank/44387-anthropic-thinking-replay-replacement