Add OTLP output for OpenTelemetry Protocol integration #780

Merged
karimra merged 7 commits into openconfig:main from drewelliott:drew/sync-otlp-improvements on Jan 25, 2026
@drewelliott
Contributor

Summary

Adds native OpenTelemetry Protocol (OTLP) output support to gNMIc, enabling direct export of telemetry data to OTLP-compatible backends.

  • New OTLP output plugin with direct OTLP/gRPC export
  • Full metric conversion (gauges, counters, histograms) from gNMI to OTLP
  • Support for custom resource attributes and semantic conventions
  • Comprehensive test coverage included

🤖 Generated with Claude Code

google-cla bot commented Dec 4, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@drewelliott force-pushed the drew/sync-otlp-improvements branch from 45e1e97 to ea4f8d7 (December 4, 2025 19:52)

Implements native OTLP/gRPC output, enabling direct gNMIc → OTel Collector
integration without intermediate components (NATS/Prometheus/Kafka).

Features:
- gRPC transport with TLS support
- Automatic metric type detection (Counter/Gauge based on path heuristics)
- Proper gNMI path → OTLP metric name conversion (slash/hyphen to underscore)
- Configurable subscription name prepending for vendor-specific prefixes
- Event.Values iteration supporting any value key structure
- Comprehensive validation of OTLP message structure
- Response PartialSuccess checking for rejection detection
- Configurable batching with worker pools
- Retry logic with exponential backoff
- Prometheus metrics for observability
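
The retry item in the list above (bounded attempts with exponential backoff) could look roughly like the following stdlib-only Go sketch. The names `retry` and `demo`, the 10 ms base delay and the doubling factor are assumptions made for illustration; they are not taken from the plugin's code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retry calls send up to maxRetries+1 times, doubling the wait between
// attempts. Hypothetical helper; the plugin's real retry code may differ.
func retry(ctx context.Context, maxRetries int, base time.Duration, send func(context.Context) error) error {
	var err error
	delay := base
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err = send(ctx); err == nil {
			return nil
		}
		if attempt == maxRetries {
			break // no point sleeping after the last attempt
		}
		select {
		case <-time.After(delay):
			delay *= 2 // exponential backoff
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return fmt.Errorf("giving up after %d retries: %w", maxRetries, err)
}

// demo simulates a collector that rejects the first two export attempts.
func demo() (int, error) {
	calls := 0
	err := retry(context.Background(), 3, 10*time.Millisecond, func(context.Context) error {
		calls++
		if calls < 3 {
			return errors.New("collector unavailable")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err)
}
```

Selecting on `ctx.Done()` during the wait keeps a cancelled worker from sleeping through a long backoff.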

Configuration options:
- endpoint: OTLP collector endpoint (required)
- protocol: "grpc" or "http" (default: grpc)
- timeout: Request timeout (default: 10s)
- batch-size: Metrics per batch (default: 1000)
- interval: Max time before sending batch (default: 5s)
- num-workers: Worker pool size (default: 1)
- max-retries: Retry attempts (default: 3)
- append-subscription-name: Prepend subscription name to metrics (default: false)
- strings-as-attributes: Convert string values to gauge with attribute (default: false)
- metric-prefix: Global prefix for all metrics (optional)
- resource-attributes: Static resource attributes (optional)
- tls: TLS configuration (optional)
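
As an illustration of how the documented defaults might be applied, here is a minimal Go sketch. The `config` struct, its field names and `setDefaults` are hypothetical and chosen only to mirror the option list above. The sketch also treats a zero value as "unset", so an explicit `max-retries: 0` cannot be expressed; the actual plugin may handle that differently.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// config mirrors a subset of the options listed above.
// Hypothetical type: the plugin's real struct may differ.
type config struct {
	Endpoint   string
	Protocol   string
	Timeout    time.Duration
	BatchSize  int
	Interval   time.Duration
	NumWorkers int
	MaxRetries int
}

// setDefaults validates required fields and fills in the documented defaults.
func (c *config) setDefaults() error {
	if c.Endpoint == "" {
		return errors.New("otlp: endpoint is required")
	}
	switch c.Protocol {
	case "":
		c.Protocol = "grpc"
	case "grpc", "http":
		// accepted as-is
	default:
		return fmt.Errorf("otlp: unsupported protocol %q", c.Protocol)
	}
	if c.Timeout <= 0 {
		c.Timeout = 10 * time.Second
	}
	if c.BatchSize <= 0 {
		c.BatchSize = 1000
	}
	if c.Interval <= 0 {
		c.Interval = 5 * time.Second
	}
	if c.NumWorkers <= 0 {
		c.NumWorkers = 1
	}
	if c.MaxRetries <= 0 {
		c.MaxRetries = 3
	}
	return nil
}

func main() {
	c := config{Endpoint: "otel-collector:4317"}
	if err := c.setDefaults(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```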

Example configuration:
outputs:
  otlp:
    type: otlp
    endpoint: otel-collector:4317
    protocol: grpc
    batch-size: 1000
    num-workers: 2
    append-subscription-name: true
    strings-as-attributes: true
    resource-attributes:
      telemetry.source: "gnmi"

Metric naming:
- [prefix_][subscription_]path_with_underscores
- Example: nvos_interfaces_interface_state_counters_in_octets
- Complies with OTLP naming conventions (a-z, 0-9, _, .)
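
The naming rule above can be sketched in a few lines of Go. `metricName` is a hypothetical helper, not the plugin's function; it assumes the path has already been stripped of gNMI keys and only performs the slash/hyphen replacement described in this commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// metricName builds "[prefix_][subscription_]path_with_underscores".
// Hypothetical helper for illustration only.
func metricName(prefix, subscription, path string) string {
	parts := make([]string, 0, 3)
	if prefix != "" {
		parts = append(parts, prefix)
	}
	if subscription != "" {
		parts = append(parts, subscription)
	}
	// Drop leading/trailing slashes so the name has no stray underscores.
	parts = append(parts, strings.Trim(path, "/"))
	joined := strings.Join(parts, "_")
	// Slashes and hyphens both become underscores.
	return strings.NewReplacer("/", "_", "-", "_").Replace(joined)
}

func main() {
	fmt.Println(metricName("", "nvos", "/interfaces/interface/state/counters/in-octets"))
	// prints "nvos_interfaces_interface_state_counters_in_octets"
}
```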
This commit syncs all OTLP output improvements from the pylon-platform
fork to the upstream GitHub repository:

1. Add add-event-tags-as-attributes config option
   - Makes event-tags appear as Prometheus labels in OTLP output
   - Critical for Panoptes integration with standard label names

2. Fix critical event.Values handling
   - Accept ANY value key format from gNMI devices
   - Improves compatibility with diverse network equipment

3. Comprehensive OTLP validation and error handling
   - PartialSuccess response handling
   - Data point validation before export
   - Debug logging for troubleshooting

4. Metric naming improvements
   - Convert path slashes to underscores
   - Prepend subscription name when configured
   - Ensures valid Prometheus metric names

All changes maintain backward compatibility and are production-tested
on the NVIDIA Pylon platform with 15-pod gNMIc clusters.

@drewelliott force-pushed the drew/sync-otlp-improvements branch 3 times, most recently from 9a8d807 to 7a6d22f (January 8, 2026 18:51)

Updated all.go to have distinct copyright blocks for Nokia and NVIDIA
rather than combining them, as requested by the maintainers.
@drewelliott force-pushed the drew/sync-otlp-improvements branch from 7a6d22f to e41e189 (January 8, 2026 18:55)

Updates the OTLP output to use grpc.NewClient instead of the deprecated
grpc.DialContext. This change ensures the output can initialize successfully
even when the OTLP collector endpoint is unreachable at startup.

Key changes:
- Use grpc.NewClient which creates client without dialing
- Connection now happens lazily on first RPC instead of during Init
- Allows OTLP output to start and retry when endpoint becomes available
- Remove unnecessary timeout context from initialization

Adds comprehensive unit tests to verify:
- Init succeeds with unreachable endpoints
- Connection established on first RPC call
- Reconnection works when endpoint becomes available
- Normal operation with reachable endpoints
Addresses PR feedback about the unused ctx parameter in sendBatch. The context
was passed in but never used, which could cause data loss during shutdown
once the worker context is cancelled.

Changes:
- Worker creates fresh context with timeout for final batch flush on shutdown
- sendBatch now passes context through to sendGRPC
- sendGRPC accepts and uses the passed context parameter
- Prevents sending with cancelled context during graceful shutdown
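
The shutdown behaviour described in these changes can be sketched with the standard library alone. Everything here is an assumption made for illustration: the `worker` signature, the `sendFunc` type standing in for the gRPC export call, string-typed metrics, and the 5-second flush timeout. The key point is that the final flush uses a fresh context instead of the already-cancelled worker context.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sendFunc stands in for the gRPC export call (hypothetical).
type sendFunc func(ctx context.Context, batch []string) error

// worker batches items and flushes on size, on interval, and on shutdown.
func worker(ctx context.Context, in <-chan string, batchSize int, interval time.Duration, send sendFunc) {
	batch := make([]string, 0, batchSize)
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	flush := func(c context.Context) {
		if len(batch) == 0 {
			return
		}
		_ = send(c, batch) // error handling omitted in this sketch
		batch = make([]string, 0, batchSize)
	}
	// finalFlush uses a fresh context: ctx may already be cancelled, and a
	// real gRPC client would reject the call immediately.
	finalFlush := func() {
		fctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		flush(fctx)
	}

	for {
		select {
		case <-ctx.Done():
			finalFlush()
			return
		case m, ok := <-in:
			if !ok { // channel closed: flush what is left
				finalFlush()
				return
			}
			batch = append(batch, m)
			if len(batch) >= batchSize {
				flush(ctx)
			}
		case <-ticker.C:
			flush(ctx)
		}
	}
}

// demo cancels the worker with two items pending and reports how many
// items reached a sender that rejects cancelled contexts.
func demo() int {
	ctx, cancel := context.WithCancel(context.Background())
	in := make(chan string) // unbuffered: a send returns only once received
	sent := 0
	send := func(c context.Context, batch []string) error {
		if err := c.Err(); err != nil {
			return err
		}
		sent += len(batch)
		return nil
	}
	done := make(chan struct{})
	go func() {
		worker(ctx, in, 100, time.Hour, send)
		close(done)
	}()
	in <- "a"
	in <- "b"
	cancel()
	<-done
	return sent
}

func main() {
	fmt.Println(demo())
}
```

If `finalFlush` reused `ctx`, the sender in `demo` would reject the batch and both items would be lost, which is the failure mode this commit addresses.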

Adds comprehensive tests:
- TestOTLP_GracefulShutdownFlushes - verifies final batch sent on Close()
- TestOTLP_ContextCancellationFlushes - ensures flush works after cancellation
- TestOTLP_ChannelCloseFlushes - validates flush when channel closes
- Updates mock server to reject cancelled contexts like real gRPC

This ensures no data loss during shutdown, restart, or scaling operations.
@karimra merged commit d8581a4 into openconfig:main on Jan 25, 2026
3 checks passed
@karimra
Collaborator

karimra commented Jan 25, 2026

Thanks for the contribution!
