Description
After upgrading Microsoft.Agents.AI from 1.0.0-preview.260209.1 to 1.0.0-preview.260212.1, OpenTelemetry traces are no longer nesting correctly. Tool execution spans and subsequent LLM call spans become disconnected root traces instead of being children of the parent HTTP request trace.
The root cause is the upgrade of Microsoft.Extensions.AI from 10.2.0 to 10.3.0 (introduced in the agent-framework Directory.Packages.props between the 260209 and 260212 builds). With M.Extensions.AI 10.2.0, traces nest correctly. With 10.3.0, Activity.Current is lost after the first streaming LLM response + tool call cycle, causing all subsequent spans to start new root traces.
Reproduction Steps
- Use Microsoft.Agents.AI 1.0.0-preview.260212.1 (which pulls Microsoft.Extensions.AI >= 10.3.0)
- Configure OpenTelemetry tracing with the Aspire dashboard
- Build a ChatClientAgent with .UseOpenTelemetry() and tools (e.g., MCP tools)
- Send a message that triggers: LLM call → tool call → second LLM call
- Observe traces in the Aspire dashboard
Expected Behavior
All spans should be nested under a single trace ID:
HTTP POST /api/messages (trace: abc123)
└── invoke_agent (trace: abc123)
    ├── chat gpt-4 [first call] (trace: abc123)
    ├── execute_tool copilot_chat (trace: abc123)
    └── chat gpt-4 [second call] (trace: abc123)
Actual Behavior
After the first LLM call + tool execution, spans become disconnected root traces:
HTTP POST /api/messages (trace: abc123)
└── invoke_agent (trace: abc123)
    └── chat gpt-4 [first call] (trace: abc123)
execute_tool copilot_chat (trace: def456) ← NEW ROOT TRACE, parent_span_id=""
chat gpt-4 [second call] (trace: ghi789) ← NEW ROOT TRACE, parent_span_id=""
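To confirm which spans lose their parent without relying on the dashboard, a small in-process listener can record every activity that starts with no parent. This is a minimal sketch using only System.Diagnostics; `RootSpanProbe` and its members are hypothetical names, not part of Microsoft.Extensions.AI or Microsoft.Agents.AI.

```csharp
using System.Collections.Concurrent;
using System.Diagnostics;

// Hypothetical diagnostic helper: records every Activity that starts without
// a parent, i.e. every span that would show up as a new root trace.
public static class RootSpanProbe
{
    // Descriptions of parentless activities, in the order they started.
    public static readonly ConcurrentQueue<string> RootSpans = new();

    public static ActivityListener Attach()
    {
        var listener = new ActivityListener
        {
            // Listen to every ActivitySource in the process.
            ShouldListenTo = _ => true,
            // Ask for full data so activities are created even when no
            // other listener or exporter is registered.
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllDataAndRecorded,
            ActivityStarted = activity =>
            {
                // ParentId is null only when there is neither an in-process
                // parent nor a remote parent context.
                if (activity.ParentId is null)
                {
                    RootSpans.Enqueue(
                        $"{activity.Source.Name}/{activity.OperationName} trace={activity.TraceId}");
                }
            },
        };

        ActivitySource.AddActivityListener(listener);
        return listener; // dispose to stop listening
    }
}
```

Calling `RootSpanProbe.Attach()` once at startup and inspecting `RootSpanProbe.RootSpans` after a request should list the orphaned spans. Note that the incoming HTTP request span is itself a legitimate root when the caller sends no `traceparent` header, so only the additional entries are of interest.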
Bisection Results
| Microsoft.Agents.AI | Microsoft.Extensions.AI | Traces Nested? |
|---|---|---|
| 260209.1 | 10.2.0 | ✅ Yes |
| 260209.1 | 10.3.0 | ❌ No |
| 260212.1 | 10.3.0 | ❌ No |
The issue is caused by M.Extensions.AI 10.3.0. The OpenTelemetryChatClient.cs source code is identical between 10.2.0 and 10.3.0 (same Activity.Current = activity workaround for dotnet/runtime#47802 is present in both), so the regression is likely in a transitive dependency.
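For readers unfamiliar with the workaround referenced above, the pattern can be sketched as follows. This is an illustration of the general technique only, not the OpenTelemetryChatClient source: `StreamingSpanSketch` and `StreamAsync` are hypothetical, and the loop stands in for a real streaming response. Each value the iterator yields reports whether Activity.Current was still the iterator's own activity at that point.

```csharp
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical stand-in for a streaming client that starts a span inside an
// async iterator. Each MoveNextAsync call runs in the consumer's execution
// context, so an Activity.Current value set inside the iterator does not
// carry over a yield (see dotnet/runtime#47802).
public static class StreamingSpanSketch
{
    public static async IAsyncEnumerable<bool> StreamAsync(bool restoreCurrent)
    {
        using var activity = new Activity("chat-sketch").Start();

        for (var i = 0; i < 3; i++)
        {
            await Task.Yield(); // simulate waiting for the next chunk

            // Report whether the ambient activity is still the one started above.
            yield return ReferenceEquals(Activity.Current, activity);

            if (restoreCurrent)
            {
                // The workaround: re-assign Activity.Current after resuming,
                // so spans started later in this iterator nest under it.
                Activity.Current = activity;
            }
        }
    }
}
```

Enumerating `StreamAsync(restoreCurrent: false)` with `await foreach` and comparing the yielded values against `restoreCurrent: true` is one way to check whether the ambient activity survives a yield in a given runtime and dependency set.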
Impact
- Cannot upgrade to M.Agents.AI 260212.1 (which contains needed bug fixes such as #3798, ".NET: Fix RunStreamingAsync not including chat history in messages sent to chat client") without breaking distributed tracing
- 260212.1 hard-requires M.Extensions.AI >= 10.3.0, so pinning to 10.2.0 is not possible
Environment
- .NET 9.0.13
- ASP.NET Core with Aspire AppHost
- OpenTelemetry (OTLP exporter to Aspire Dashboard)
- ChatClientAgent with UseOpenTelemetry() and UseFunctionInvocation() pipeline
Language/SDK
.NET