Skip to content

fix(rc): improve trace sampling rules parsing#232

Merged
dmehala merged 1 commit intomainfrom
dmehala/fix-rc-sampling-2
Sep 3, 2025
Merged

fix(rc): improve trace sampling rules parsing#232
dmehala merged 1 commit intomainfrom
dmehala/fix-rc-sampling-2

Conversation

@dmehala
Copy link
Copy Markdown
Collaborator

@dmehala dmehala commented Aug 15, 2025

Correctly parse tags from remote trace sampling rules and improves overall rule parsing reliability through stricter validation and clearer error reporting.

Changes:

  • Add support for tags in trace sampling rules.
  • Improve validation with explicit type checks and more descriptive error messages.
  • Update catch-all rule handling to replace an existing one instead of always prepending, ensuring consistent rule ordering.
  • Extended system test request handler to support span_tags from incoming requests.

APMAPI-863
APMAPI-866

Correctly parse `tags` from remote trace sampling rules and improves
overall rule parsing reliability through stricter validation and clearer
error reporting.

Changes:
  - Add support for `tags` in trace sampling rules.
  - Improve validation with explicit type checks and more descriptive
    error messages.
  - Update catch-all rule handling to replace an existing one instead of
    always prepending, ensuring consistent rule ordering.
  - Extended system test request handler to support `span_tags` from
    incoming requests.

[APMAPI-863]
[APMAPI-866]
@dmehala dmehala requested a review from a team as a code owner August 15, 2025 17:24
@dmehala dmehala requested review from cataphract and zacharycmontoya and removed request for a team August 15, 2025 17:24
@pr-commenter
Copy link
Copy Markdown

pr-commenter bot commented Aug 15, 2025

Benchmarks

Benchmark execution time: 2025-08-15 17:26:14

Comparing candidate commit e6e4b8f in PR branch dmehala/fix-rc-sampling-2 with baseline commit bb4891d in branch main.

Found 0 performance improvements and 1 performance regressions! Performance is the same for 0 metrics, 0 unstable metrics.

scenario:BM_TraceTinyCCSource

  • 🟥 execution_time [+2.252ms; +2.424ms] or [+3.036%; +3.269%]

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Aug 15, 2025

Codecov Report

❌ Patch coverage is 50.00000% with 61 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@bb4891d). Learn more about missing BASE report.
⚠️ Report is 60 commits behind head on main.

Files with missing lines Patch % Lines
src/datadog/config_manager.cpp 46.49% 61 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #232   +/-   ##
=======================================
  Coverage        ?   86.64%           
=======================================
  Files           ?       83           
  Lines           ?     5427           
  Branches        ?        0           
=======================================
  Hits            ?     4702           
  Misses          ?      725           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@zacharycmontoya zacharycmontoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice!

Expected<TraceSamplerRule> parse_rule(const nlohmann::json& json_rule) {
assert(json_rule.is_object());

auto make_error = [&json_rule](StringView field_name) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto make_error = [&json_rule](StringView field_name) {
auto make_missing_error = [&json_rule](StringView field_name) {

nit: as we specify for property it might be easier to read to specify here as well

Comment on lines +215 to +219
const auto old_sampler_cfg =
config_manager.trace_sampler()->config_json();
const auto err = config_manager.on_update(config_update);
const auto new_sampler_cfg =
config_manager.trace_sampler()->config_json();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check that the right values have been set as well ? Currently if an error happened, it might tell that old_sampler_cfg != new_sampler_cfg but maybe the values are not the right ones

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is your question related to the trace sampler values? If so, config_json() returns a JSON with said values, if there's a change, then the JSON from old/new_sampler_cfg will differ.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may not understand everything clearly but from my understanding it is not because there was a change that the right values were set

Copy link
Copy Markdown
Collaborator Author

@dmehala dmehala Sep 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding from your take is that, it would be preferable to check the rules in the JSON payload are equals to the rules in new_sampler_cfg? For example, we should make sure there's a rule for service foo, resource GET /hello and so on?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can check that there is indeed a rule for the resource get /hello with the right value/tags, it ensures the correctness of the apply. That may be what you are doing in #234 though

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Currently we are not validating the correctness because the TraceSampler interface doesn't provide a way to retrieve the rules, and I don't think there's any meaning except for testing purposes. I am planning to revamp the configuration part, one of the requirement is to be able to validate the config sent to telemetry. This would help to validate the correctness.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, perfect. Ping me if you need an additional hand.

@dmehala dmehala merged commit 274149a into main Sep 3, 2025
24 checks passed
@dmehala dmehala deleted the dmehala/fix-rc-sampling-2 branch September 3, 2025 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants