.NET: OpenTelemetry trace nesting broken when using Microsoft.Extensions.AI 10.3.0 (Activity.Current lost after streaming + tool calls) #4074

@flaviocdc

Description

After upgrading Microsoft.Agents.AI from 1.0.0-preview.260209.1 to 1.0.0-preview.260212.1, OpenTelemetry traces are no longer nesting correctly. Tool execution spans and subsequent LLM call spans become disconnected root traces instead of being children of the parent HTTP request trace.

The root cause is the upgrade of Microsoft.Extensions.AI from 10.2.0 to 10.3.0 (introduced in the agent-framework Directory.Packages.props between the 260209 and 260212 builds). With Microsoft.Extensions.AI 10.2.0, traces nest correctly. With 10.3.0, Activity.Current is lost after the first streaming LLM response and tool call cycle, so every subsequent span starts a new root trace.

Reproduction Steps

  1. Use Microsoft.Agents.AI 1.0.0-preview.260212.1 (which pulls in Microsoft.Extensions.AI >= 10.3.0)
  2. Configure OpenTelemetry tracing with the Aspire dashboard
  3. Build a ChatClientAgent with .UseOpenTelemetry() and tools (e.g., MCP tools)
  4. Send a message that triggers: LLM call → tool call → second LLM call
  5. Observe traces in the Aspire dashboard

Expected Behavior

All spans should be nested under a single trace ID:

HTTP POST /api/messages (trace: abc123)
  └── invoke_agent (trace: abc123)
       ├── chat gpt-4 [first call] (trace: abc123)
       ├── execute_tool copilot_chat (trace: abc123)
       └── chat gpt-4 [second call] (trace: abc123)

Actual Behavior

After the first LLM call + tool execution, spans become disconnected root traces:

HTTP POST /api/messages (trace: abc123)
  └── invoke_agent (trace: abc123)
       └── chat gpt-4 [first call] (trace: abc123)

execute_tool copilot_chat (trace: def456)  ← NEW ROOT TRACE, parent_span_id=""

chat gpt-4 [second call] (trace: ghi789)  ← NEW ROOT TRACE, parent_span_id=""
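
To confirm the symptom without relying on the dashboard, a small in-process listener can log each span's trace ID and parent as it starts. The following is a minimal sketch, assuming only the System.Diagnostics APIs; TraceNestingProbe and its Attach method are hypothetical names made up for this illustration and are not part of Microsoft.Agents.AI or Microsoft.Extensions.AI.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical diagnostic helper (not a library type): records every started
// Activity and marks the ones that have no parent, i.e. that begin a new trace.
public static class TraceNestingProbe
{
    public static ActivityListener Attach(List<string> log)
    {
        var listener = new ActivityListener
        {
            // Listen to every ActivitySource in the process.
            ShouldListenTo = source => true,

            // Request fully populated activities so trace and span IDs are set.
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllDataAndRecorded,

            ActivityStarted = activity =>
            {
                // ParentId is null only when the activity has neither an
                // in-process parent nor a remote parent context.
                string kind = activity.ParentId is null ? "ROOT" : "child";
                string line = kind + " " + activity.OperationName +
                              " trace=" + activity.TraceId +
                              " parent=" + activity.ParentSpanId;
                lock (log)
                {
                    log.Add(line);
                }
            }
        };

        ActivitySource.AddActivityListener(listener);
        return listener;
    }
}
```

Attach the probe once at startup and dispose the returned listener at shutdown. The incoming HTTP request span is expected to be logged as ROOT unless the caller sent a trace context; any tool-execution or chat span logged as ROOT would match the disconnected traces shown above.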

Bisection Results

Microsoft.Agents.AI (1.0.0-preview.*)    Microsoft.Extensions.AI    Traces nested?
260209.1                                 10.2.0                     ✅ Yes
260209.1                                 10.3.0                     ❌ No
260212.1                                 10.3.0                     ❌ No

The regression follows Microsoft.Extensions.AI 10.3.0, independent of the Microsoft.Agents.AI build. The OpenTelemetryChatClient.cs source code is identical between 10.2.0 and 10.3.0 (the same Activity.Current = activity workaround for dotnet/runtime#47802 is present in both), so the regression is likely in a transitive dependency.
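
For readers unfamiliar with that workaround, the general pattern is to re-assign Activity.Current inside an async iterator each time it resumes after a yield. The sketch below illustrates the pattern only; it is not the OpenTelemetryChatClient source, and StreamingActivitySketch, StreamAsync, and the source name "Sketch.Streaming" are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class StreamingActivitySketch
{
    // Hypothetical source name, used only for this illustration.
    private static readonly ActivitySource Source = new ActivitySource("Sketch.Streaming");

    // Yields the operation name of Activity.Current as observed inside the
    // iterator just before each item is handed to the consumer.
    public static async IAsyncEnumerable<string?> StreamAsync(int count)
    {
        // StartActivity returns null when no listener samples this source.
        using Activity? activity = Source.StartActivity("chat");

        for (int i = 0; i < count; i++)
        {
            // Stand-in for awaiting the next streamed chunk.
            await Task.Yield();

            yield return Activity.Current?.OperationName;

            // Re-assign the ambient activity after resuming from the yield,
            // so later work in this iterator is parented to "chat" again.
            Activity.Current = activity;
        }
    }
}
```

Removing the re-assignment line and comparing the yielded names is one way to check whether the ambient activity survives a yield in a given runtime and package combination.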

Impact

Environment

  • .NET 9.0.13
  • ASP.NET Core with Aspire AppHost
  • OpenTelemetry (OTLP exporter to Aspire Dashboard)
  • ChatClientAgent with UseOpenTelemetry() and UseFunctionInvocation() pipeline

Language/SDK

.NET

Metadata

Labels

  • .NET
  • v1.0: Features being tracked for the version 1.0 GA

Projects

Status

Done
