Skip to content

[1/2] feat: active configuration - rework configuration#97

Merged
dmehala merged 7 commits intomainfrom
dmehala/active-config-v2
Mar 13, 2024
Merged

[1/2] feat: active configuration - rework configuration#97
dmehala merged 7 commits intomainfrom
dmehala/active-config-v2

Conversation

@dmehala
Copy link
Copy Markdown
Collaborator

@dmehala dmehala commented Feb 9, 2024

Description

Preparatory work for active configuration. To enhance readability, I have split the work into several PRs.
This particular PR addresses the necessity of distinguishing between default configuration and user/code configuration.

Content:

  • All fields are now optional, enabling us to differentiate default configuration from user configuration.
  • Introduction of TracerDefaultConfig data structure, which defines default configuration settings.
  • Introduction of an overload for finalize_config with TracerDefaultConfig. This enables overwriting a default configuration as shown below:
TracerDefaultConfig new_default;
new_default.service = "nginx";

...

TracerConfig cfg;
if (auto service_name = infer_service_name()) {
   cfg.service = *service_name
}

auto finalized = finalize_configuration(cfg, new_default);

- All fields are now optional. This allows us to distinguish
  default configuration from user configuration.
@dmehala dmehala requested review from cgilmour and dgoffredo February 9, 2024 12:07
}
max_per_second = *maybe_max_per_second;
}
auto max_per_second = config.max_per_second.value_or(200);
Copy link
Copy Markdown
Collaborator Author

@dmehala dmehala Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to admit, that's weird. Three options:

  • Complex data type provide their own finalize free function. They should own the default value to set and also load env conf.
  • finalize_conf is the mastermind, set everything (also default value for complex types). sub finalize free function always trust the input. Risky because human make mistakes.
  • TracerConfig is flat like earth. Make sense for DatadogConfig because 3/4 of it has nothing datadog related.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points, not an easy decision. I will think about it and comment with my thoughts.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question had me reading this PR and its younger sibling a lot.

I think that the best fit here is to keep the same "factoring" in TracerDefaultConfig as is in TracerConfig.

So, that would mean new types TraceSamplerDefaultConfig, DatadogAgentDefaultConfig, etc., that would be members of TracerDefaultConfig.

Then there is the question of which code is responsible for applying environment variables.

How best to do this depends on the specifics of your "part 2." I'll suggest one possibility here. It resembles the ideas we were playing with in our discussions repo.

  • public class TracerConfig
  • public class TracerDefaultConfig
  • public class FinalizedTracerConfig
  • public Expected<FinalizedTracerConfig> finalize_config(const TracerConfig&, const TracerDefaultConfig&);
    • and its overloads
  • for each sub-config within TracerConfig, e.g. TraceSamplerConfig:
    • public class TraceSamplerConfig
    • public class TraceSamplerDefaultConfig
    • public class FinalizedTraceSamplerConfig
    • public Expected<TraceSamplerConfig> trace_sampler_config_from_env();
      • or maybe Expected<void> config_from_env(TraceSamplerConfig&);
    • public Expected<FinalizedTraceSamplerConfig> finalize_config_without_env(const TraceSamplerConfig&);
    • Then, the finalizer for TraceConfig can call trace_sampler_config_from_env() and finalize_config_without_env(...) as needed.

Ah, it burns!

Maybe you see the idea. I'm resisting the temptation to flatten the configuration, because I like the convention that each mechanism (e.g. TraceSampler, DatadogAgent) has its own "finalized config" used in the constructors. The trade-off is this proliferation of types and functions.

Anyway, back to 200. The default rate limit for the trace sampler would be 200 because that would be the default value of some member of a type TraceSamplerDefaultConfig, and that value would be integrated into a FinalizedTraceSamplerConfig by a function finalize_config_without_env, or whatever we choose to call it. finalize_config(const TracerConfig&) would then call this new function.

I have yet to convince myself. The requirement that we be able to identify the "real default" is quite intrusive, it turns out...

@dmehala dmehala changed the title [1/2] feat: active configuration - Rework configuration [1/2] feat: active configuration - rework configuration Feb 9, 2024
Copy link
Copy Markdown
Contributor

@dgoffredo dgoffredo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good.

These are enough thoughts for our next meeting.

I might revisit this more today. Regardless, let's discuss in our meeting tomorrow.

if (auto environment_env = lookup(environment::DD_ENV)) {
assign(result.defaults.environment, *environment_env);
final_cfg.clock = clock;
final_cfg.defaults.service = value_or(env->service, user_config.service, "");
Copy link
Copy Markdown
Contributor

@dgoffredo dgoffredo Feb 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to require service. Maybe we've discussed this before, but I like to force a service name instead of depending on whatever behavior an empty string results in. Since most users are using our proxies, the point is mostly moot. For the hypothetical programmatic user, though, why not enforce a context-dependent default? They're writing the code already.

Regardless, I think that we can put service into TracerDefaultConfig. Envoy and nginx can then set TracerDefaultConfig::service to "envoy" and "nginx", respectively, but might also set the value in TracerConfig per user configuration (e.g. datadog_service_name foo; or Envoy's service_name in the YAML config).

return error->with_prefix(prefix);
}
result.defaults.tags = std::move(*tags);
TraceSamplerConfig ts;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to avoid short names like ts unless it's going to be repeated multiple times per line.

On the one hand, Go has the convention of using small local names like ts, and I've heard that longer names do not reduce defect rates.

On the other hand, when I'm skimming the code, I see ts and wonder "what's that?" At least with trace_sampler, I know that it has something to do with the trace sampler. It's a small difference, though, because if I'm really interested in what's going on I have to find the definition of the variable. Well, maybe easier to grep if it's a longer name.

This is the sort of thing that we can put down in our (future) coding standards. I won't belabor it any more here.

}
max_per_second = *maybe_max_per_second;
}
auto max_per_second = config.max_per_second.value_or(200);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question had me reading this PR and its younger sibling a lot.

I think that the best fit here is to keep the same "factoring" in TracerDefaultConfig as is in TracerConfig.

So, that would mean new types TraceSamplerDefaultConfig, DatadogAgentDefaultConfig, etc., that would be members of TracerDefaultConfig.

Then there is the question of which code is responsible for applying environment variables.

How best to do this depends on the specifics of your "part 2." I'll suggest one possibility here. It resembles the ideas we were playing with in our discussions repo.

  • public class TracerConfig
  • public class TracerDefaultConfig
  • public class FinalizedTracerConfig
  • public Expected<FinalizedTracerConfig> finalize_config(const TracerConfig&, const TracerDefaultConfig&);
    • and its overloads
  • for each sub-config within TracerConfig, e.g. TraceSamplerConfig:
    • public class TraceSamplerConfig
    • public class TraceSamplerDefaultConfig
    • public class FinalizedTraceSamplerConfig
    • public Expected<TraceSamplerConfig> trace_sampler_config_from_env();
      • or maybe Expected<void> config_from_env(TraceSamplerConfig&);
    • public Expected<FinalizedTraceSamplerConfig> finalize_config_without_env(const TraceSamplerConfig&);
    • Then, the finalizer for TraceConfig can call trace_sampler_config_from_env() and finalize_config_without_env(...) as needed.

Ah, it burns!

Maybe you see the idea. I'm resisting the temptation to flatten the configuration, because I like the convention that each mechanism (e.g. TraceSampler, DatadogAgent) has its own "finalized config" used in the constructors. The trade-off is this proliferation of types and functions.

Anyway, back to 200. The default rate limit for the trace sampler would be 200 because that would be the default value of some member of a type TraceSamplerDefaultConfig, and that value would be integrated into a FinalizedTraceSamplerConfig by a function finalize_config_without_env, or whatever we choose to call it. finalize_config(const TracerConfig&) would then call this new function.

I have yet to convince myself. The requirement that we be able to identify the "real default" is quite intrusive, it turns out...

@dmehala dmehala force-pushed the dmehala/active-config-v2 branch from 8ac4403 to 254618f Compare February 15, 2024 11:11
@pr-commenter
Copy link
Copy Markdown

pr-commenter bot commented Feb 15, 2024

Benchmarks

Benchmark execution time: 2024-03-13 15:31:37

Comparing candidate commit 9f81de6 in PR branch dmehala/active-config-v2 with baseline commit b286465 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

@dmehala dmehala force-pushed the dmehala/active-config-v2 branch from 628141a to 04a350f Compare February 15, 2024 11:43
Copy link
Copy Markdown
Contributor

@dgoffredo dgoffredo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our discussion Friday, you can remove the new Optional<bool>, and instead do something in FinalizedTracerConfig in part 2 of these changes.

Other than that, this looks good. I'm glad we found a solution that isn't too intrusive.

:shipit:

@dmehala dmehala merged commit e9aa08a into main Mar 13, 2024
@dmehala dmehala deleted the dmehala/active-config-v2 branch March 13, 2024 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants