feat(chatd): add provider-native web search tools to chats #22909

Merged
kylecarbs merged 13 commits into main from feat/provider-web-search-tools
Mar 11, 2026

@kylecarbs kylecarbs commented Mar 10, 2026

What

Adds provider-native web search tools to the chat system. Anthropic, OpenAI, and Google all offer server-side web search — this wires them up as opt-in per-model config options using the existing ChatModelProviderOptions JSONB column (no migration).

Web search is off by default.

Config

Set web_search_enabled: true in the model config provider options:

{
  "provider_options": {
    "anthropic": {
      "web_search_enabled": true,
      "allowed_domains": ["docs.coder.com", "github.com"]
    }
  }
}

Available options per provider:

  • Anthropic: web_search_enabled, allowed_domains, blocked_domains
  • OpenAI: web_search_enabled, search_context_size (low/medium/high), allowed_domains
  • Google: web_search_enabled
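
As a rough illustration of how these options might be decoded, here is a minimal Go sketch. The struct names, and the "openai" and "google" JSON keys, are assumptions made for this example; only the "anthropic" key and the option field names come from the description above. The real definitions live in codersdk/chats.go and may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AnthropicOptions, OpenAIOptions, GoogleOptions, ProviderOptions and
// ModelConfig are hypothetical names for this sketch. Only the JSON keys
// of the option fields are taken from the PR description.
type AnthropicOptions struct {
	WebSearchEnabled bool     `json:"web_search_enabled,omitempty"`
	AllowedDomains   []string `json:"allowed_domains,omitempty"`
	BlockedDomains   []string `json:"blocked_domains,omitempty"`
}

type OpenAIOptions struct {
	WebSearchEnabled  bool     `json:"web_search_enabled,omitempty"`
	SearchContextSize string   `json:"search_context_size,omitempty"` // "low", "medium" or "high"
	AllowedDomains    []string `json:"allowed_domains,omitempty"`
}

type GoogleOptions struct {
	WebSearchEnabled bool `json:"web_search_enabled,omitempty"`
}

// ProviderOptions uses pointers so that an absent provider decodes to nil.
// The "openai" and "google" keys are assumed by analogy with "anthropic".
type ProviderOptions struct {
	Anthropic *AnthropicOptions `json:"anthropic,omitempty"`
	OpenAI    *OpenAIOptions    `json:"openai,omitempty"`
	Google    *GoogleOptions    `json:"google,omitempty"`
}

type ModelConfig struct {
	ProviderOptions ProviderOptions `json:"provider_options"`
}

// ParseModelConfig decodes a model config document. Because every flag is
// a plain bool, a missing web_search_enabled decodes to false (off).
func ParseModelConfig(raw string) (ModelConfig, error) {
	var cfg ModelConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	cfg, err := ParseModelConfig(`{"provider_options":{"anthropic":{"web_search_enabled":true,"allowed_domains":["docs.coder.com","github.com"]}}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.ProviderOptions.Anthropic.WebSearchEnabled, cfg.ProviderOptions.Anthropic.AllowedDomains)
}
```

Leaving the flag as a zero-valued bool is one simple way to get the "off by default" behavior without a migration.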

Backend

  • codersdk/chats.go: new fields on the per-provider option structs
  • coderd/chatd/chatd.go: buildProviderTools() reads config and creates ProviderDefinedTool entries (uses the anthropic.WebSearchTool() helper from fantasy)
  • coderd/chatd/chatloop/chatloop.go: ProviderTools on RunOptions, merged into Call.Tools. Provider-executed tool calls skip local execution. StreamPartTypeToolResult with ProviderExecuted: true is accumulated inline (matching fantasy's own agent.go pattern) instead of post-stream synthesis.
  • coderd/chatd/chatprompt/: MarshalToolResult carries ProviderMetadata through DB persistence so multi-turn round-trips work (Anthropic needs encrypted_content back)

Frontend

  • Source citations render inline at the tool-call position (not bottom-of-message), using ToolCollapsible so they look like other tool cards — collapsed "Searched N results" with globe icon, expand to see source pills
  • Provider-executed tool calls/results are hidden from the normal tool card UI
  • Tool-role messages with only provider-executed results return null (no empty bubble)
  • Both persisted (messageParsing.ts) and streaming (streamState.ts) paths group consecutive source parts into a single { type: "sources" } render block

Fantasy changes

The fantasy fork (kylecarbs/fantasy, branch cj/go1.25) has the Anthropic tool code merged in. The hope is to upstream it via charmbracelet/fantasy#163.

coder-tasks bot commented Mar 10, 2026

Documentation Check

Updates Needed

  • docs/ai-coder/agents/models.md - The provider-specific options tables are missing the new web search fields. Needs updates to three sections:

    • Anthropic: Add Web Search Enabled, Allowed Domains, Blocked Domains
    • OpenAI: Add Web Search Enabled, Search Context Size (low/medium/high), Allowed Domains
    • Google: Add Web Search Enabled

    ⚠️ Still unaddressed — no documentation changes found in this PR


Automated review via Coder Tasks

@kylecarbs kylecarbs force-pushed the feat/provider-web-search-tools branch 2 times, most recently from 0cd7e0f to 8064fb5 Compare March 11, 2026 15:29
@kylecarbs kylecarbs requested a review from hugodutka March 11, 2026 16:00
@kylecarbs kylecarbs force-pushed the feat/provider-web-search-tools branch 2 times, most recently from 45f6048 to dccb764 Compare March 11, 2026 16:44
Adds configurable web search tool support for Anthropic, OpenAI, and
Google providers. When enabled via per-model configuration, the LLM
provider executes web searches server-side and returns citations.

Configuration is per-model via the existing ChatModelProviderOptions
JSONB column (no DB migration needed):

  - Anthropic: web_search_enabled, allowed_domains, blocked_domains
  - OpenAI: web_search_enabled, search_context_size, allowed_domains
  - Google: web_search_enabled

Implementation:
  - codersdk: Add WebSearchEnabled and related fields to provider
    option structs
  - chatloop: Add ProviderTools field to RunOptions, merge into
    Call.Tools alongside function tools. Filter provider-executed
    tool calls out of local execution — results are already in the
    stream content from the provider. Only continue the tool loop
    when local (non-provider-executed) tool calls exist.
  - chatd: buildProviderTools() reads model config and creates
    fantasy.ProviderDefinedTool entries, wired into chatloop.Run()
  - fantasy (kylecarbs/fantasy@c8d8996): Handle ProviderDefinedTool
    in all three provider adapters. Parse server-side tool results
    and emit SourceContent citations. Anthropic streaming emits
    Source parts for each web_search_tool_result citation.
  - make gen: Updated TypeScript types and chatModelOptions JSON

Web search is off by default (opt-in per model config).
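
The chatloop rule described above (skip provider-executed calls locally, and only continue the loop when local calls remain) can be sketched roughly as follows. ToolCall, LocalToolCalls and ShouldContinue are simplified stand-ins invented for this example, not the actual chatloop or fantasy types.

```go
package main

import "fmt"

// ToolCall is a simplified, hypothetical stand-in for a tool call
// returned by the model in one step.
type ToolCall struct {
	ID               string
	Name             string
	ProviderExecuted bool
}

// LocalToolCalls returns only the calls that still need local execution.
// Provider-executed calls are skipped because their results already
// arrived in the stream content from the provider.
func LocalToolCalls(calls []ToolCall) []ToolCall {
	local := make([]ToolCall, 0, len(calls))
	for _, c := range calls {
		if c.ProviderExecuted {
			continue
		}
		local = append(local, c)
	}
	return local
}

// ShouldContinue reports whether the tool loop needs another step:
// only when at least one local (non-provider-executed) call exists.
func ShouldContinue(calls []ToolCall) bool {
	return len(LocalToolCalls(calls)) > 0
}

func main() {
	calls := []ToolCall{
		{ID: "1", Name: "web_search", ProviderExecuted: true},
		{ID: "2", Name: "read_file"}, // "read_file" is an illustrative local tool name
	}
	fmt.Println(len(LocalToolCalls(calls)), ShouldContinue(calls))
}
```
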
Add frontend support for displaying web search citations when the LLM
uses provider-native web search tools. Source parts arrive from the
backend as { type: "source", url, title } and are now rendered as
compact citation pills with favicons below the response text.

Changes:
- types.ts: Add "source" variant to RenderBlock, add sources array
  to ParsedMessageContent and StreamState
- streamState.ts: Handle "source" parts in streaming accumulator,
  deduplicate by URL
- messageParsing.ts: Handle "source" blocks in message parser,
  deduplicate by URL
- WebSearchSources.tsx: New component rendering citation pills with
  Google S2 favicons, truncated titles, external links. Shows first
  4 with "+N more" expander for overflow.
- ConversationTimeline.tsx: Render WebSearchSources in both
  historical messages and streaming output
- Fix StreamState type in tests/stories to include sources field

Updates the fantasy fork to include filtering of ProviderExecuted tool
results from message history when converting back to provider-specific
API formats. Without this, Anthropic returns an error on the second
message in a conversation that used web search:

  'unexpected tool_use_id found in tool_result blocks'

This also fixes the equivalent issue for OpenAI and Google providers.

When provider-native tool results (web search) are stored in the
database and reloaded, the ProviderExecuted flag was being lost.
This caused Anthropic to reject multi-turn conversations with:

  'unexpected tool_use_id found in tool_result blocks'

The flag was lost at three points in the persistence path:

1. chatprompt: toolResultRaw struct lacked ProviderExecuted field,
   so it was dropped during JSON marshal/unmarshal. Added the field
   with omitempty for backward compatibility.

2. chatloop: toResponseMessages() did not copy ProviderExecuted
   from ToolResultContent to ToolResultPart when converting
   in-memory results.

3. chatd: MarshalToolResult() signature updated to accept the
   providerExecuted parameter, propagated from
   MarshalToolResultContent(). Existing callers pass false.
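
To make the first point concrete, here is a small sketch of how an omitempty field keeps old rows readable while letting new rows carry the flag. Only the toolResultRaw name and the ProviderExecuted field come from the commit message; the other fields, the JSON key names, and the Encode/Decode helpers are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolResultRaw mirrors the persisted shape in spirit only. The fields
// other than ProviderExecuted, and all JSON key names, are assumed.
type toolResultRaw struct {
	ToolCallID       string `json:"tool_call_id"`
	ToolName         string `json:"tool_name"`
	Result           string `json:"result"`
	ProviderExecuted bool   `json:"provider_executed,omitempty"`
}

// Encode serializes a tool result for storage.
func Encode(r toolResultRaw) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

// Decode reads a stored tool result. Rows written before the field
// existed simply decode with ProviderExecuted == false.
func Decode(s string) (toolResultRaw, error) {
	var r toolResultRaw
	err := json.Unmarshal([]byte(s), &r)
	return r, err
}

func main() {
	local, _ := Encode(toolResultRaw{ToolCallID: "a", ToolName: "read_file", Result: "ok"})
	remote, _ := Encode(toolResultRaw{ToolCallID: "b", ToolName: "web_search", ProviderExecuted: true})
	fmt.Println(local)  // no provider_executed key for local results
	fmt.Println(remote) // provider_executed is written only when true
}
```

With omitempty, locally executed results serialize exactly as they did before the change, so no backfill of existing rows is needed.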
Provider-native tools (web search) are executed server-side by the
LLM provider. The streaming response includes the tool call
(StreamPartTypeToolCall with ProviderExecuted=true) and source
citations (StreamPartTypeSource), but no StreamPartTypeToolResult.

This meant the tool call was persisted in the assistant message
with no corresponding tool result message. On the next turn, the
provider adapter's toPrompt() correctly filtered the
ProviderExecuted tool call from the assistant message, but Anthropic
still expected a tool_result for any tool_use it saw in the raw
history, causing:

  'unexpected tool_use_id found in tool_result blocks'

Fix: after streaming completes and local tools are executed,
iterate all tool calls and synthesize a ToolResultContent with
ProviderExecuted=true for any provider-executed call. This ensures
the persistence layer stores a matching tool result message. On
reload, both the tool call and tool result are filtered out by the
provider adapter.

Also propagate ProviderExecuted in persistInterruptedStep so
interrupted provider-executed tool calls get correct metadata.

Propagate the ProviderExecuted flag through the SDK type
(ChatMessagePart), chatprompt serialization (PartFromContent for
tool calls and toolResultContentToPart for tool results), SSE
streaming, and DB persistence.

Frontend filtering in both streamState.ts (live streaming) and
messageParsing.ts (page load/DB round-trip) skips tool-call and
tool-result parts with provider_executed=true, so web_search
results only render through the WebSearchSources citation pills.
…ltas

Three missing propagation points:
- db2sdk.contentBlockToPart: ToolCallContent cases were not
  setting ProviderExecuted on the SDK part, so page-load
  rendering of assistant tool-call blocks lost the flag.
- db2sdk.chatMessageParts (tool role): toolResultRow struct
  lacked ProviderExecuted, so it was dropped during JSON
  unmarshalling, and the SDK part omitted it.
- chatloop.processStepStream ToolInputDelta: published
  streaming deltas without ProviderExecuted, which could
  create a tool card before the full ToolCall arrived.

The Anthropic provider had server_tool_use and
web_search_tool_result case branches incorrectly nested inside
the tool_use case in the content_block_stop handler, making
them unreachable. This meant source citations and provider-
executed tool call/result events were never emitted during
streaming, so the frontend never received them.
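
A flattened dispatch of the kind this fix implies might look like the sketch below. The block type strings are the ones named above; EventsForBlock and the event names it returns are hypothetical and only illustrate that each block type needs its own top-level case to be reachable.

```go
package main

import "fmt"

// EventsForBlock returns the (hypothetical) event names emitted when a
// content block of the given type stops. Each block type has its own
// top-level case; nesting the last two inside the "tool_use" case would
// make them unreachable, because the switch value is "tool_use" there.
func EventsForBlock(blockType string) []string {
	switch blockType {
	case "tool_use":
		return []string{"tool_call"}
	case "server_tool_use":
		return []string{"provider_tool_call"}
	case "web_search_tool_result":
		return []string{"provider_tool_result", "source"}
	default:
		return nil
	}
}

func main() {
	for _, bt := range []string{"tool_use", "server_tool_use", "web_search_tool_result"} {
		fmt.Println(bt, EventsForBlock(bt))
	}
}
```
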
Sources now appear inline at the position of the web search tool call
in the block sequence, rather than at the bottom of the message. They
use the same ToolCollapsible pattern as other tool cards (globe icon,
'Searched N results' label, chevron expander, source pills on expand).

- types.ts: add 'sources' (plural) RenderBlock variant for grouped sources
- messageParsing.ts: group consecutive source parts into inline blocks
- streamState.ts: same grouping for streaming path
- ConversationTimeline.tsx: render 'sources' blocks via renderBlockList,
  remove bottom-of-message WebSearchSources rendering
- WebSearchSources.tsx: rewrite to use ToolCollapsible with consistent
  tool card styling
- chatd.go: fix unused parameter lint warning

Instead of capturing StreamPartTypeToolResult events into a side map
and synthesizing ToolResultContent after the stream finishes, handle
them directly in processStepStream — matching how fantasy's own
agent.go accumulates provider-executed tool results (lines 1373-1391).

This removes:
- providerToolResults field from stepResult
- The post-stream synthesize loop that iterated provider-executed
  tool calls and manually constructed ToolResultContent with copied
  ProviderMetadata
- The placeholder 'provider-executed tool result' text in Result

The ToolResultContent is now appended to result.content immediately
when the stream event arrives, with ProviderMetadata carried directly
from the StreamPart.
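
The inline accumulation described here can be sketched as below. The types are local stand-ins that borrow fantasy's names (StreamPart, ToolResultContent, StreamPartTypeToolResult) for readability; their fields and the CollectToolResults helper are assumptions, not the library's actual definitions.

```go
package main

import "fmt"

// StreamPartType, StreamPart and ToolResultContent are simplified local
// stand-ins; the real fantasy types carry more fields.
type StreamPartType string

const (
	StreamPartTypeToolCall   StreamPartType = "tool_call"
	StreamPartTypeToolResult StreamPartType = "tool_result"
	StreamPartTypeSource     StreamPartType = "source"
)

type StreamPart struct {
	Type             StreamPartType
	ID               string
	ToolCallName     string
	ProviderExecuted bool
	ProviderMetadata map[string]string
}

type ToolResultContent struct {
	ToolCallID       string
	ToolName         string
	ProviderExecuted bool
	ProviderMetadata map[string]string
}

// CollectToolResults appends a ToolResultContent as soon as a tool
// result part is seen, carrying the flag and metadata straight from the
// part instead of synthesizing results after the stream ends.
func CollectToolResults(parts []StreamPart) []ToolResultContent {
	var content []ToolResultContent
	for _, part := range parts {
		if part.Type != StreamPartTypeToolResult {
			continue
		}
		content = append(content, ToolResultContent{
			ToolCallID:       part.ID,
			ToolName:         part.ToolCallName,
			ProviderExecuted: part.ProviderExecuted,
			ProviderMetadata: part.ProviderMetadata,
		})
	}
	return content
}

func main() {
	parts := []StreamPart{
		{Type: StreamPartTypeToolCall, ID: "t1", ToolCallName: "web_search", ProviderExecuted: true},
		{Type: StreamPartTypeToolResult, ID: "t1", ToolCallName: "web_search", ProviderExecuted: true,
			ProviderMetadata: map[string]string{"encrypted_content": "opaque"}},
		{Type: StreamPartTypeSource, ID: "s1"},
	}
	fmt.Println(len(CollectToolResults(parts)))
}
```
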
Replace manual ProviderDefinedTool construction with the new
anthropic.WebSearchTool() helper. The helper handles arg construction
internally and will automatically pick up new options (max_uses,
user_location) as they're added upstream.
@kylecarbs kylecarbs force-pushed the feat/provider-web-search-tools branch from dccb764 to 376a4a7 Compare March 11, 2026 18:03
@kylecarbs kylecarbs force-pushed the feat/provider-web-search-tools branch from 376a4a7 to cbbc289 Compare March 11, 2026 18:07
@hugodutka hugodutka left a comment

please address the comments before merging, especially:

  • whether provider tools should bypass active tool filters, and
  • the integration test

otherwise LGTM

ProviderOptions fantasy.ProviderOptions

// ProviderTools are provider-native tools (like web search)
// that are executed server-side by the provider. These are

that are executed server-side by the provider

I don't think this is always true. The computer tool doesn't work this way.

tr := fantasy.ToolResultContent{
ToolCallID: part.ID,
ToolName: part.ToolCallName,
ProviderExecuted: true,

there's a ProviderExecuted field on the part, maybe just read that? hardcoding true feels like it might lead to some bug in the future

})
}

func TestWebSearchSourceCitations(t *testing.T) {

this test would be less janky if we generated golden files for the anthropic responses and served a mock http server that acts as the anthropic api and serves the stored responses. the way it's done now it'll flake in CI due to the dependency on an external service
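
One way to approach the suggestion above is a replay server built on net/http/httptest. This is only a sketch: the golden payload is a made-up placeholder rather than a recorded Anthropic response, the /v1/messages path is used purely for illustration, and NewReplayServer and Fetch are hypothetical helper names. A real test would load recorded responses from golden files and point the provider client's base URL at the server.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// goldenResponse stands in for a recorded provider response. A real
// test would read this from a golden file on disk.
const goldenResponse = `{"type":"message","content":[{"type":"text","text":"hello"}]}`

// NewReplayServer starts a local server that answers every request with
// the stored golden payload, so the test never reaches an external API.
func NewReplayServer(golden string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_, _ = io.WriteString(w, golden)
	}))
}

// Fetch posts to the server and returns the response body. It plays the
// role of the provider client in this sketch.
func Fetch(baseURL string) (string, error) {
	resp, err := http.Post(baseURL+"/v1/messages", "application/json", nil)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	srv := NewReplayServer(goldenResponse)
	defer srv.Close()
	body, err := Fetch(srv.URL)
	if err != nil {
		panic(err)
	}
	fmt.Println(body == goldenResponse)
}
```
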

go.mod Outdated
github.com/aquasecurity/trivy-checks v1.12.2-0.20251219190323-79d27547baf5 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.4 // indirect
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.17 // indirect
github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.5 // indirect; indirect github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.17 // indirect

looks like a malformed entry

/>
))}
</div>
</div>{" "}

looks like an unintended edit

Comment on lines +53 to +57
| {
type: "source";
url: string;
title: string;
}

this seems to be dead code, only the sources variant is used

return nil
}
var raw map[string]json.RawMessage
if err := json.Unmarshal(r.ProviderMetadata, &raw); err != nil {

we should probably log a warning if the error occurs. if the ProviderMetadata type ever changes old messages will silently lose their metadata
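
A minimal sketch of the suggested behavior follows. DecodeProviderMetadata and its warn callback are hypothetical; the actual code would presumably use the project's own logger rather than the standard log package.

```go
package main

import (
	"encoding/json"
	"log"
)

// DecodeProviderMetadata decodes stored metadata into a per-provider
// map. On failure it reports a warning through warn instead of silently
// dropping the data, then returns nil so callers can carry on.
func DecodeProviderMetadata(raw []byte, warn func(format string, args ...any)) map[string]json.RawMessage {
	if len(raw) == 0 {
		return nil
	}
	var out map[string]json.RawMessage
	if err := json.Unmarshal(raw, &out); err != nil {
		warn("dropping provider metadata that failed to decode: %v", err)
		return nil
	}
	return out
}

func main() {
	good := DecodeProviderMetadata([]byte(`{"anthropic":{"encrypted_content":"abc"}}`), log.Printf)
	bad := DecodeProviderMetadata([]byte(`["not","an","object"]`), log.Printf)
	log.Printf("decoded %d provider entries; malformed input yielded %d", len(good), len(bad))
}
```

Passing the logger in as a function keeps the helper easy to test without capturing log output.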

Comment on lines 841 to 843
// fantasy.Tool slice expected by fantasy.Call. When activeTools
// is non-empty, only tools whose name appears in the list are
// included. This mirrors fantasy's agent.prepareTools filtering.

that comment is inaccurate, providerTools bypass filters. was that intended?
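
For readers following this thread, the behavior under discussion can be sketched like this: function tools are filtered by the active list, while provider tools are appended unfiltered. Tool and PrepareTools are hypothetical names for this example and do not reflect the actual chatloop signature.

```go
package main

import "fmt"

// Tool is a simplified, hypothetical stand-in for fantasy.Tool.
type Tool struct {
	Name            string
	ProviderDefined bool
}

// PrepareTools filters function tools by activeTools (when non-empty)
// and then appends provider tools without filtering them.
func PrepareTools(functionTools, providerTools []Tool, activeTools []string) []Tool {
	active := make(map[string]struct{}, len(activeTools))
	for _, name := range activeTools {
		active[name] = struct{}{}
	}
	out := make([]Tool, 0, len(functionTools)+len(providerTools))
	for _, t := range functionTools {
		if len(active) > 0 {
			if _, ok := active[t.Name]; !ok {
				continue
			}
		}
		out = append(out, t)
	}
	// Provider tools bypass the active-tool filter.
	return append(out, providerTools...)
}

func main() {
	fn := []Tool{{Name: "read_file"}, {Name: "write_file"}} // illustrative local tools
	pt := []Tool{{Name: "web_search", ProviderDefined: true}}
	for _, t := range PrepareTools(fn, pt, []string{"read_file"}) {
		fmt.Println(t.Name)
	}
}
```
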

- Fix inaccurate comment on ProviderTools (not always server-side)
- Use part.ProviderExecuted instead of hardcoding true
- Document that provider tools bypass active tool filters
- Fix malformed go.mod entry (two deps on one line)
- Remove unintended {" "} edit in ConversationTimeline
- Remove dead 'source' (singular) type from RenderBlock union
- Add warning log for ProviderMetadata deserialization failures
- Remove integration test hitting real Anthropic API (per review)
@kylecarbs kylecarbs force-pushed the feat/provider-web-search-tools branch from b1bdacb to 73b42c2 Compare March 11, 2026 21:19
@kylecarbs kylecarbs enabled auto-merge (squash) March 11, 2026 21:33
@kylecarbs kylecarbs merged commit 57dc23f into main Mar 11, 2026
27 checks passed
@kylecarbs kylecarbs deleted the feat/provider-web-search-tools branch March 11, 2026 21:33
@github-actions github-actions bot locked and limited conversation to collaborators Mar 11, 2026