[dbsp] Promote batches with negative weights more aggressively. #5951
Conversation
fn update_stats(&mut self, weight: &R) {
    if TypeId::of::<R>() == TypeId::of::<DynZWeight>()
        && unsafe { *weight.downcast::<ZWeight>() } < 0
    {
        self.stats.negative_weight_count += 1;
    }
This is kind of a sad special case. Maybe we should generalize Weight enough to allow testing for positive/negative as well as zero.
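One way the special case could be generalized is a sign-testing trait on weights. This is only a sketch with hypothetical names (`HasSign`, `is_negative`) — dbsp's actual weight traits differ:

```rust
// Hypothetical trait sketch: let any weight type report its sign,
// so stats code doesn't need a TypeId check plus an unsafe downcast.
trait HasSign {
    fn is_negative(&self) -> bool;
    fn is_zero(&self) -> bool;
}

impl HasSign for i64 {
    fn is_negative(&self) -> bool {
        *self < 0
    }
    fn is_zero(&self) -> bool {
        *self == 0
    }
}

// The stats update then works for any weight type with a sign.
fn update_negative_count(count: &mut u64, weight: &impl HasSign) {
    if weight.is_negative() {
        *count += 1;
    }
}

fn main() {
    let mut count = 0u64;
    for w in [3i64, -1, 0, -5] {
        update_negative_count(&mut count, &w);
    }
    // Two of the four weights are negative.
    assert_eq!(count, 2);
    println!("{count}");
}
```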
crates/dbsp/src/trace/spine_async.rs
8..=128,
I don't think we can do this because ListMerger has a hard limit of 64 batches at a time.
This wasn't meant to be in this PR. But I actually wanted to discuss this with you. We no longer limit merge size to 64 (since we don't rely on BitSets in the merger), so I was thinking of increasing the limit for at least level0-level1 batches.
Oh that's fine then.
You'll need to delete the ListMerger assertion then:
pub fn new(factories: &B::Factories, cursors: Vec<C>) -> Self {
    // [IndexSet] supports a maximum of 64 batches.
    assert!(cursors.len() <= 64);
Yep, already discovered these asserts. But I still won't mix this change into this PR.
Benchmark that simulates a pathological scenario where a window operator is followed by an aggregate that computes min over the timestamp column. As the window operator evicts tuples on the left side of the window, 0-weight tuples accumulate in the trace, slowing down the skip_zero_weight_vals_forward operation. Effectively, the O(1) min computation becomes O(n), where n is the number of zero-weight tuples in the trace.

Signed-off-by: Leonid Ryzhyk <[email protected]>
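The cost described above can be sketched in miniature. This is not dbsp's actual cursor code, just an illustration of why a "min" lookup degrades once cancelled tuples pile up: the cursor must step past every zero-weight entry before reaching the first live one.

```rust
// Toy model of skipping zero-weight entries: the position of the first
// nonzero weight determines how much work a single "min" lookup costs.
fn first_nonzero_weight(weights: &[i64]) -> Option<usize> {
    weights.iter().position(|&w| w != 0)
}

fn main() {
    // A window operator has evicted the first three tuples, leaving
    // zero-weight tombstones behind; the lookup scans past all of them.
    let weights = [0, 0, 0, 1, 2];
    assert_eq!(first_nonzero_weight(&weights), Some(3));

    // As evictions accumulate, the scan grows linearly with their count.
    let mut many = vec![0i64; 1_000];
    many.push(7);
    assert_eq!(first_nonzero_weight(&many), Some(1_000));
    println!("ok");
}
```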
Merge batches with many negative weights more aggressively. Negative updates are likely to cancel out during merging. Every time this happens, two records are eliminated from the spine, speeding up lookups and reducing the spine's storage footprint. Furthermore, this avoids performance anomalies where an operator spends a lot of time skipping over 0-weight records.

We implement this heuristic by accounting each record with a negative weight as N records when calculating the batch's effective length. This should push such batches to the next level more eagerly, until they get merged with batches that contain matching positive-weight records.

The value of N is controlled by a new dev tweak `negative_weight_multiplier`. It is set to 0 by default, which means that retractions are not given any additional weight.

This table shows how the value of `negative_weight_multiplier` affects the window_min benchmark, which represents the pathological case of 0-weight records affecting lookups.

| multiplier |      time(s) |
|-----------:|-------------:|
|          0 | 56.990349017 |
|          1 | 34.751515002 |
|          2 | 16.627565868 |
|          4 | 18.612214789 |
|          6 | 23.141154003 |
|          8 | 30.764389275 |

Signed-off-by: Leonid Ryzhyk <[email protected]>
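The effective-length heuristic described above can be sketched as follows. The function name is hypothetical, not dbsp's actual API; the assumption is that each negative-weight record contributes `multiplier` extra units on top of its normal count, so that a multiplier of 0 (the default) leaves sizing unchanged:

```rust
// Sketch: size a batch for merge-level selection by inflating the count
// of negative-weight (retraction) records by a configurable multiplier.
fn effective_len(len: usize, negative_weight_count: usize, multiplier: usize) -> usize {
    len + negative_weight_count * multiplier
}

fn main() {
    // multiplier = 0 (default): effective length equals true length,
    // i.e. no behavior change until the tweak is explicitly set.
    assert_eq!(effective_len(1000, 100, 0), 1000);

    // multiplier = 4: 100 retractions add 400 units, pushing the batch
    // to the next spine level sooner so cancellations happen earlier.
    assert_eq!(effective_len(1000, 100, 4), 1400);
    println!("ok");
}
```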
13c892e to d6af78b
Make sure that a file written by a version that doesn't support batch metadata is deserialized with default metadata values (specifically, a zero-weight count of 0). Signed-off-by: Leonid Ryzhyk <[email protected]>
mythical-fred left a comment
LGTM — default means no behavior change until explicitly tuned. The benchmark data in the PR description makes a compelling case for the approach.