Emit gen_ai.client.operation.exception via ILogger LoggerMessage on OpenTelemetry instrumentation classes #7379
Pull request overview
This PR extends the Microsoft.Extensions.AI OpenTelemetry instrumentation to emit a dedicated exception ActivityEvent (with exception.* tags) when operations fail, and adds unit tests across the AI client surface area to validate the new telemetry behavior.
Changes:
- Add a new OpenTelemetry event name constant and `exception.type`/`exception.message`/`exception.stacktrace` tag constants.
- Emit a `gen_ai.client.operation.exception` ActivityEvent on error for Chat, Image, Embeddings, Hosted Files, and Speech-to-Text telemetry wrappers.
- Add new tests asserting the exception event is emitted and existing error tags/status are preserved.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/Libraries/Microsoft.Extensions.AI/OpenTelemetryConsts.cs | Adds constants for exception event name and exception.* tag keys. |
| src/Libraries/Microsoft.Extensions.AI/ChatCompletion/OpenTelemetryChatClient.cs | Emits exception ActivityEvent when a chat operation throws. |
| src/Libraries/Microsoft.Extensions.AI/ChatCompletion/OpenTelemetryImageGenerator.cs | Emits exception ActivityEvent when image generation throws. |
| src/Libraries/Microsoft.Extensions.AI/Embeddings/OpenTelemetryEmbeddingGenerator.cs | Emits exception ActivityEvent when embedding generation throws. |
| src/Libraries/Microsoft.Extensions.AI/Files/OpenTelemetryHostedFileClient.cs | Emits exception ActivityEvent when hosted file operations throw. |
| src/Libraries/Microsoft.Extensions.AI/SpeechToText/OpenTelemetrySpeechToTextClient.cs | Emits exception ActivityEvent when speech-to-text operations throw. |
| test/Libraries/Microsoft.Extensions.AI.Tests/ChatCompletion/OpenTelemetryChatClientTests.cs | Adds tests validating exception ActivityEvent for streaming/non-streaming chat calls. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Image/OpenTelemetryImageGeneratorTests.cs | Adds test validating exception ActivityEvent for image generation. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Embeddings/OpenTelemetryEmbeddingGeneratorTests.cs | Adds test validating exception ActivityEvent for embedding generation. |
| test/Libraries/Microsoft.Extensions.AI.Tests/Files/OpenTelemetryHostedFileClientTests.cs | Adds test validating exception ActivityEvent for file upload. |
| test/Libraries/Microsoft.Extensions.AI.Tests/SpeechToText/OpenTelemetrySpeechToTextClientTests.cs | Adds tests validating exception ActivityEvent for streaming/non-streaming STT calls. |
@copilot should Activity.AddException be used instead of AddEvent? The difference is that any registered ActivityListener then gets notified when the exception is recorded.
@stephentoub I confirmed what we previously discussed: according to the OTel plan, the exception should be reported through a log event, not a span/activity event. I think we should follow that. open-telemetry/semantic-conventions#3436 (comment)
@tarekgh what does that look like? Are you saying this should use ILogger, and if so, can you share what the call site should look like?
I believe the following should do it. The OpenTelemetry exporter should handle the exception correctly and emit its parts according to the spec.

```csharp
[LoggerMessage(
    EventName = "gen_ai.client.operation.exception",
    Level = LogLevel.Warning,
    Message = "A GenAI client operation exception occurred.")]
private static partial void LogOperationException(ILogger logger, Exception error);
```

And the call site would look like:

```csharp
if (error is not null)
{
    _ = activity?
        .AddTag(OpenTelemetryConsts.Error.Type, error.GetType().FullName)
        .SetStatus(ActivityStatusCode.Error, error.Message);
    if (_logger is not null)
    {
        LogOperationException(_logger, error);
    }
}
```
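For anyone who wants to try this pattern outside the repo, here is a minimal sketch as a standalone console program. It is not the PR's code: it uses `LoggerMessage.Define` instead of the `[LoggerMessage]` source generator so it compiles without a partial class, and `RecordingLogger`, `Telemetry`, the event id `1`, and the `"error.type"` literal (standing in for `OpenTelemetryConsts.Error.Type`) are all assumptions made for illustration. It assumes a recent `Microsoft.Extensions.Logging.Abstractions` package.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using Microsoft.Extensions.Logging;

var logger = new RecordingLogger();
using var activity = new Activity("chat").Start();

// Simulate a failed operation being traced.
Telemetry.TraceError(activity, logger, new InvalidOperationException("boom"));

Console.WriteLine(activity.Status);            // prints "Error"
Console.WriteLine(logger.Records[0].Id.Name);  // prints "gen_ai.client.operation.exception"

// Hypothetical helper mirroring the call site discussed above.
static class Telemetry
{
    // Event id 1 is an arbitrary placeholder; only the event name matters here.
    private static readonly Action<ILogger, Exception?> s_operationException =
        LoggerMessage.Define(
            LogLevel.Warning,
            new EventId(1, "gen_ai.client.operation.exception"),
            "A GenAI client operation exception occurred.");

    public static void TraceError(Activity? activity, ILogger? logger, Exception error)
    {
        // Keep the span-level error tag and status.
        _ = activity?
            .AddTag("error.type", error.GetType().FullName)
            .SetStatus(ActivityStatusCode.Error, error.Message);

        // Report the exception itself through the logger, not as a span event.
        if (logger is not null)
        {
            s_operationException(logger, error);
        }
    }
}

// Hypothetical in-memory logger used only to observe what gets logged.
sealed class RecordingLogger : ILogger
{
    public List<(LogLevel Level, EventId Id, Exception? Error)> Records { get; } = new();

    public IDisposable? BeginScope<TState>(TState state) where TState : notnull => null;

    public bool IsEnabled(LogLevel logLevel) => true;

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state,
        Exception? exception, Func<TState, Exception?, string> formatter)
        => Records.Add((logLevel, eventId, exception));
}
```

The exception object travels with the log record, so whichever exporter is attached can decide how to render its type, message, and stack trace.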
Ok, thanks. I'd initially planned on going that route, but copilot suggested it didn't work:
@copilot, please redo this PR, revising it to store the ILogger in the ctor into a field in each of the implementations, and then following the approach that @tarekgh outlines in #7379 (comment).
Redone in 7aa2b78. All five OTel classes now store the ...

```csharp
[LoggerMessage(
    EventName = "gen_ai.client.operation.exception",
    Level = LogLevel.Warning,
    Message = "A GenAI client operation exception occurred.")]
private static partial void LogOperationException(ILogger logger, Exception error);
```

Call site:

```csharp
_ = activity?
    .AddTag(OpenTelemetryConsts.Error.Type, error.GetType().FullName)
    .SetStatus(ActivityStatusCode.Error, error.Message);
if (_logger is not null)
{
    LogOperationException(_logger, error);
}
```

Removed the ...

Implemented exactly this pattern in 7aa2b78 across all five OTel classes. The ...
@copilot please also update the corresponding OpenTelemetry support for ITextToSpeechClient, and add a line for the whole PR to the Microsoft.Extensions.AI changelog file under the not-yet-released section.
Done in fdedf65.
@copilot resolve the conflicts
…y instrumentation classes Co-authored-by: stephentoub <[email protected]>
…ent error tests Co-authored-by: stephentoub <[email protected]>
…reporting per OTel convention
Store ILogger in _logger field in all five OpenTelemetry classes, add [LoggerMessage] partial method with EventName="gen_ai.client.operation.exception", replace Activity.AddEvent with LogOperationException call. Remove unused exception.* constants from OpenTelemetryConsts. Update all tests to verify ILogger output via FakeLogCollector.
Co-authored-by: stephentoub <[email protected]>
Co-authored-by: tarekgh <[email protected]>
Co-authored-by: stephentoub <[email protected]>
c3ce851 to ca86a76
…penTelemetry instrumentation classes (dotnet#7379) Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: stephentoub <[email protected]> Co-authored-by: tarekgh <[email protected]>
- `OpenTelemetryLog` class with `[LoggerMessage]` method
- `_logger` field and `OpenTelemetryLog.OperationException` call to `OpenTelemetryChatClient`
- `_logger` field and `OpenTelemetryLog.OperationException` call to `OpenTelemetryEmbeddingGenerator`
- `_logger` field and `OpenTelemetryLog.OperationException` call to `OpenTelemetrySpeechToTextClient`
- `_logger` field and `OpenTelemetryLog.OperationException` call to `OpenTelemetryImageGenerator`
- `_logger` field and `OpenTelemetryLog.OperationException` call to `OpenTelemetryHostedFileClient`
- `_logger` field and `OpenTelemetryLog.OperationException` call to `OpenTelemetryRealtimeClientSession`
- `_logger` field and `OpenTelemetryLog.OperationException` call to `OpenTelemetryTextToSpeechClient`
- `OpenTelemetryLog.cs` to `Common` folder
- `exception.*` constants and `ExceptionEventName` from `OpenTelemetryConsts`

Original prompt
Background
The OpenTelemetry semantic conventions have added a new `gen_ai.client.operation.exception` event for recording exceptions during Generative AI client operations. This was merged in open-telemetry/semantic-conventions#3436.

The new convention specifies:
- `gen_ai.client.operation.exception`
- `exception.type` (conditionally required): fully-qualified exception class name
- `exception.message` (conditionally required): the exception message
- `exception.stacktrace` (recommended): stack trace as a string

This is part of a broader initiative to migrate exception recording from span events to log-based events. The convention notes that `exception.message` may contain sensitive information.

What needs to change
1. All four OpenTelemetry instrumentation classes need to emit the exception event
The following classes all have `ILogger? logger` parameters in their constructors that are currently unused (suppressed with `#pragma warning disable IDE0060`):
- `OpenTelemetryChatClient` in `src/Libraries/Microsoft.Extensions.AI/ChatCompletion/OpenTelemetryChatClient.cs`
- `OpenTelemetryEmbeddingGenerator<TInput, TEmbedding>` in `src/Libraries/Microsoft.Extensions.AI/Embeddings/OpenTelemetryEmbeddingGenerator.cs`
- `OpenTelemetrySpeechToTextClient` in `src/Libraries/Microsoft.Extensions.AI/SpeechToText/OpenTelemetrySpeechToTextClient.cs`
- `OpenTelemetryImageGenerator` in `src/Libraries/Microsoft.Extensions.AI/ChatCompletion/OpenTelemetryImageGenerator.cs`

For each class:
a. Store and use the `ILogger` parameter
- `private readonly ILogger? _logger;` field
- `logger` parameter in the constructor: `_logger = logger;`
- `#pragma warning disable/restore IDE0060` around the constructor since the parameter is now used

b. Emit the exception event in the `TraceResponse` method
In each class's `TraceResponse` method, when `error is not null`, in addition to the existing span status/tag logic, emit the exception event using `Activity.AddEvent`. Use `Activity.AddEvent` because the .NET `ILogger` does not natively support the OTel event model with event names in the way the convention requires (the event name must be `gen_ai.client.operation.exception`, and ILogger doesn't map cleanly to OTel log events with specific event names in the current .NET OTel SDK). Using `Activity.AddEvent` is the pragmatic choice that matches how .NET OTel instrumentation works today.

The event should be added to the activity like this:
This should be placed inside the existing `if (error is not null)` block in `TraceResponse`, alongside (not replacing) the existing `error.type` tag and `SetStatus(ActivityStatusCode.Error)` calls.

c. Add the event name constant to `OpenTelemetryConsts`
In `src/Libraries/Microsoft.Extensions.AI/OpenTelemetryConsts.cs`, add to the `GenAI.Client` class:
And add exception attribute constants:
Use these constants in all four classes instead of string literals.

2. Existing code reference for each class
OpenTelemetryChatClient - current error handling in `TraceResponse` (lines 710-715):
OpenTelemetryEmbeddingGenerator - similar pattern in its `TraceResponse` method:
OpenTelemetrySpeechToTextClient - similar pattern in its `TraceResponse` method.
OpenTelemetryImageGenerator - similar pattern in its `TraceResponse` method.

3. Tests
Add or update tests for each class to verify that when an exception occurs:
- `error.type` tag and error status are...

This pull request was created from Copilot chat.