Add OTLP output for OpenTelemetry Protocol integration #780
Merged
karimra merged 7 commits into openconfig:main on Jan 25, 2026
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Force-pushed from 45e1e97 to ea4f8d7
Implements native OTLP/gRPC output enabling direct gNMIC → OTEL Collector
integration without intermediate components (NATS/Prometheus/Kafka).
Features:
- gRPC transport with TLS support
- Automatic metric type detection (Counter/Gauge based on path heuristics)
- Proper gNMI path → OTLP metric name conversion (slash/hyphen to underscore)
- Configurable subscription name prepending for vendor-specific prefixes
- Event.Values iteration supporting any value key structure
- Comprehensive validation of OTLP message structure
- Response PartialSuccess checking for rejection detection
- Configurable batching with worker pools
- Retry logic with exponential backoff
- Prometheus metrics for observability
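The Counter/Gauge detection listed above could be sketched as follows. The PR text does not spell out the actual heuristics, so the hint list, the `metricKind` type, and the `detectKind` function are hypothetical names chosen for illustration only; the real output may use different rules.

```go
package main

import (
	"fmt"
	"strings"
)

// metricKind is a hypothetical enum for the two OTLP metric types mentioned in the PR.
type metricKind int

const (
	kindGauge metricKind = iota
	kindCounter
)

// counterHints is an assumed list of path fragments that suggest a
// monotonically increasing value. It is not taken from the PR.
var counterHints = []string{"counters", "octets", "packets", "errors", "discards"}

// detectKind returns kindCounter when the path contains any hint,
// and falls back to kindGauge otherwise.
func detectKind(path string) metricKind {
	p := strings.ToLower(path)
	for _, hint := range counterHints {
		if strings.Contains(p, hint) {
			return kindCounter
		}
	}
	return kindGauge
}

func main() {
	paths := []string{
		"/interfaces/interface/state/counters/in-octets",
		"/components/component/state/temperature/instant",
	}
	for _, p := range paths {
		fmt.Printf("%s counter=%v\n", p, detectKind(p) == kindCounter)
	}
}
```

Defaulting to a gauge is the conservative choice in this sketch: a value wrongly treated as a counter would be interpreted as cumulative by the backend.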
Configuration options:
- endpoint: OTLP collector endpoint (required)
- protocol: "grpc" or "http" (default: grpc)
- timeout: Request timeout (default: 10s)
- batch-size: Metrics per batch (default: 1000)
- interval: Max time before sending batch (default: 5s)
- num-workers: Worker pool size (default: 1)
- max-retries: Retry attempts (default: 3)
- append-subscription-name: Prepend subscription name to metrics (default: false)
- strings-as-attributes: Convert string values to gauge with attribute (default: false)
- metric-prefix: Global prefix for all metrics (optional)
- resource-attributes: Static resource attributes (optional)
- tls: TLS configuration (optional)
Example configuration:
outputs:
  otlp:
    type: otlp
    endpoint: otel-collector:4317
    protocol: grpc
    batch-size: 1000
    num-workers: 2
    append-subscription-name: true
    strings-as-attributes: true
    resource-attributes:
      telemetry.source: "gnmi"
Metric naming:
- [prefix_][subscription_]path_with_underscores
- Example: nvos_interfaces_interface_state_counters_in_octets
- Complies with OTLP naming conventions (a-z, 0-9, _, .)
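The naming scheme above can be sketched in a few lines. This is an illustration under assumptions, not the code from the PR: `metricName` is a hypothetical helper, the input path is assumed to carry no key selectors, and treating `nvos` as the subscription name is a guess based on the example.

```go
package main

import (
	"fmt"
	"strings"
)

// metricName joins an optional prefix, an optional subscription name, and a
// gNMI path into one metric name, replacing slashes and hyphens with
// underscores. Empty components are skipped.
func metricName(prefix, subscription, path string) string {
	replacer := strings.NewReplacer("/", "_", "-", "_")
	parts := make([]string, 0, 3)
	for _, p := range []string{prefix, subscription, path} {
		// Trim underscores left over from leading or trailing slashes.
		p = strings.Trim(replacer.Replace(p), "_")
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	name := metricName("", "nvos", "/interfaces/interface/state/counters/in-octets")
	fmt.Println(name) // nvos_interfaces_interface_state_counters_in_octets
}
```

The sketch does not lowercase the input or strip characters outside the allowed set; a fuller version would need both to guarantee the stated character set.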
This commit syncs all OTLP output improvements from the pylon-platform fork to the upstream GitHub repository:
1. Add add-event-tags-as-attributes config option
   - Makes event-tags appear as Prometheus labels in OTLP output
   - Critical for Panoptes integration with standard label names
2. Fix critical event.Values handling
   - Accept ANY value key format from gNMI devices
   - Improves compatibility with diverse network equipment
3. Comprehensive OTLP validation and error handling
   - PartialSuccess response handling
   - Data point validation before export
   - Debug logging for troubleshooting
4. Metric naming improvements
   - Convert path slashes to underscores
   - Prepend subscription name when configured
   - Ensures valid Prometheus metric names
All changes maintain backward compatibility and are production-tested in NVIDIA Pylon platform with 15-pod gNMIC clusters.
Force-pushed from 9a8d807 to 7a6d22f
Updated all.go to have distinct copyright blocks for Nokia and NVIDIA rather than combining them, as requested by the maintainers.
Force-pushed from 7a6d22f to e41e189
karimra reviewed on Jan 13, 2026
karimra requested changes on Jan 13, 2026
Updates the OTLP output to use grpc.NewClient instead of the deprecated grpc.DialContext. This change ensures the output can initialize successfully even when the OTLP collector endpoint is unreachable at startup.
Key changes:
- Use grpc.NewClient which creates client without dialing
- Connection now happens lazily on first RPC instead of during Init
- Allows OTLP output to start and retry when endpoint becomes available
- Remove unnecessary timeout context from initialization
Adds comprehensive unit tests to verify:
- Init succeeds with unreachable endpoints
- Connection established on first RPC call
- Reconnection works when endpoint becomes available
- Normal operation with reachable endpoints
Addresses PR feedback about unused ctx parameter in sendBatch. The context was being passed but never used, which would cause data loss during shutdown when the worker context is cancelled.
Changes:
- Worker creates fresh context with timeout for final batch flush on shutdown
- sendBatch now passes context through to sendGRPC
- sendGRPC accepts and uses the passed context parameter
- Prevents sending with cancelled context during graceful shutdown
Adds comprehensive tests:
- TestOTLP_GracefulShutdownFlushes - verifies final batch sent on Close()
- TestOTLP_ContextCancellationFlushes - ensures flush works after cancellation
- TestOTLP_ChannelCloseFlushes - validates flush when channel closes
- Updates mock server to reject cancelled contexts like real gRPC
This ensures no data loss during shutdown, restart, or scaling operations.
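The shutdown behaviour described in this commit can be illustrated with a standard-library-only sketch. It is not the PR's code: `runWorker`, the `sender` type, `flushedOnShutdown`, and the 5-second flush timeout are hypothetical, and the fake sender only imitates the way gRPC refuses an already cancelled context.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// sender stands in for the real export call; it must honor ctx.
type sender func(ctx context.Context, batch []string) error

// runWorker batches items from in and flushes when the batch is full,
// when the interval elapses, and once more on shutdown.
func runWorker(ctx context.Context, in <-chan string, batchSize int, interval time.Duration, send sender) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	batch := make([]string, 0, batchSize)

	flush := func(c context.Context) {
		if len(batch) == 0 {
			return
		}
		_ = send(c, batch) // error handling elided in this sketch
		batch = make([]string, 0, batchSize)
	}
	finalFlush := func() {
		// The worker context may already be cancelled, so the last batch
		// is sent with a fresh context that has its own timeout.
		c, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		flush(c)
	}

	for {
		select {
		case <-ctx.Done():
			finalFlush()
			return
		case item, ok := <-in:
			if !ok {
				finalFlush()
				return
			}
			batch = append(batch, item)
			if len(batch) >= batchSize {
				flush(ctx)
			}
		case <-ticker.C:
			flush(ctx)
		}
	}
}

// flushedOnShutdown feeds two items to a worker, cancels it, and reports
// how many items the fake sender accepted.
func flushedOnShutdown() int {
	var mu sync.Mutex
	var sent []string
	send := func(ctx context.Context, batch []string) error {
		if err := ctx.Err(); err != nil {
			return err // like gRPC, refuse a cancelled context
		}
		mu.Lock()
		defer mu.Unlock()
		sent = append(sent, batch...)
		return nil
	}

	ctx, cancel := context.WithCancel(context.Background())
	in := make(chan string) // unbuffered: a send returns once the worker has the item
	done := make(chan struct{})
	go func() {
		defer close(done)
		runWorker(ctx, in, 100, time.Hour, send)
	}()
	in <- "a"
	in <- "b"
	cancel()
	<-done
	return len(sent)
}

func main() {
	fmt.Println(flushedOnShutdown())
}
```

If `finalFlush` reused the worker's cancelled context, the fake sender would reject the batch and both items would be lost, which is the failure mode the commit message describes. The sketch does not drain items still queued in a buffered channel at cancellation time.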
karimra approved these changes on Jan 22, 2026
Thanks for the contribution!
Summary
Adds native OpenTelemetry Protocol (OTLP) output support to gNMIc, enabling direct export of telemetry data to OTLP-compatible backends.
🤖 Generated with Claude Code