Conversation
mythical-fred
left a comment
There was a problem hiding this comment.
LGTM. Clean extraction. The move to Option<T> fields with explicit accessors is the right approach for a public, documented struct.
| use serde::{Deserialize, Serialize}; | ||
| use utoipa::ToSchema; | ||
|
|
||
| /// Optional settings for tweaking Feldera internals. |
There was a problem hiding this comment.
Do you want to say something about how these are not guaranteed to be a stable API?
| multihost: if runtime_config.hosts > 1 | ||
| || runtime_config.dev_tweaks.contains_key("multihost") | ||
| { | ||
| multihost: if runtime_config.hosts > 1 { |
There was a problem hiding this comment.
This was an experiment that I used for testing while developing the multihost feature. I forgot about it. It's not needed anymore.
| /// storage, such as object storage, but it could use excessive amounts of | ||
| /// memory if the number of keys fetched is very large. | ||
| #[serde(skip_serializing_if = "Option::is_none")] | ||
| pub fetch_join: Option<bool>, |
There was a problem hiding this comment.
The name does not suggest the same thing that the documentation describes, but it's probably too late to change that.
| /// storage, such as object storage, but it could use excessive amounts of | ||
| /// memory if the number of keys fetched is very large. | ||
| #[serde(skip_serializing_if = "Option::is_none")] | ||
| pub fetch_distinct: Option<bool>, |
| /// each step. | ||
| /// | ||
| /// The default value is 10,000 records. | ||
| // TODO: splitter_chunk_size_bytes, per-operator chunk size. |
There was a problem hiding this comment.
I don't understand this TODO.
There was a problem hiding this comment.
I don't know this either. It is a copy of a @ryzhyk comment, so he might chime in.
There was a problem hiding this comment.
The TODO suggests that we support configuring chunk size in bytes vs records, and that we may want to make it configurable per operator. It's not part of the doc comment. I don't think this is something we need to worry about now.
There was a problem hiding this comment.
I changed this to:
// TODO: It would be better if the value were denominated in bytes rather
// than records, and if it were configurable per-operator.
| #[serde(skip_serializing_if = "Option::is_none")] | ||
| pub adaptive_joins: Option<bool>, | ||
|
|
||
| /// The minimum relative improvement threshold for the join balancer. |
There was a problem hiding this comment.
I think "adaptive" and "balanced" are the same thing, but the names of the fields do not make it clear. Perhaps the comments should.
There was a problem hiding this comment.
I think I need @ryzhyk's feedback on that too.
There was a problem hiding this comment.
We can add this sentence to the doc comment:
"The join balancer is a component that dynamically chooses an optimal partitioning policy for adaptive join operators."
| self.adaptive_joins.unwrap_or(false) | ||
| } | ||
|
|
||
| pub fn balancer_min_relative_improvement_threshold(&self) -> f64 { |
There was a problem hiding this comment.
why sometimes empty lines and sometimes no?
anandbraman
left a comment
There was a problem hiding this comment.
Thanks for documenting!
db740b8 to
62f881d
Compare
Signed-off-by: Ben Pfaff <[email protected]>
Signed-off-by: feldera-bot <[email protected]>
Describe Manual Test Plan
I ran the unit tests.