[Data] Demote Sort from requiring preserve_order#60555
[Data] Demote Sort from requiring preserve_order#60555alexeykudinkin merged 2 commits intomasterfrom
Sort from requiring preserve_order#60555Conversation
Signed-off-by: Alexey Kudinkin <[email protected]>
There was a problem hiding this comment.
Code Review
The pull request successfully removes Sort from the list of logical operators that implicitly require order preservation. This change aligns with the stated goal of demoting Sort from always enforcing preserve_order=True in the execution plan. The modifications are clear and directly implement the described intent.
iamjustinhsu
left a comment
There was a problem hiding this comment.
do u know why Sort was included originally?
Signed-off-by: Alexey Kudinkin <[email protected]>
|
|
||
| for op in self._logical_plan.dag.post_order_iter(): | ||
| if isinstance(op, (Zip, Sort)): | ||
| if isinstance(op, Zip): |
There was a problem hiding this comment.
Sort output may return blocks out of order
Medium Severity
Removing Sort from require_preserve_order() can cause sorted data to be returned out of order when downstream MapOperators process blocks in parallel. The Sort operation establishes global ordering (block 0 has smallest values, block 1 has next range, etc.), but without preserve_order=True, downstream operators use FIFOBundleQueue which returns blocks in completion order rather than input order. If a later block completes before an earlier one, the sorted order is broken.
## Description This removes the requirement for pipelines having `Sort` operations to actually require `preserve_order=True`. This is an unnecessary strict requirement that has adversarial side-effects, and is strictly not required as there's no global ordering between the blocks established. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <[email protected]> Signed-off-by: jinbum-kim <[email protected]>
## Description This removes the requirement for pipelines having `Sort` operations to actually require `preserve_order=True`. This is an unnecessary strict requirement that has adversarial side-effects, and is strictly not required as there's no global ordering between the blocks established. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <[email protected]>
## Description This removes the requirement for pipelines having `Sort` operations to actually require `preserve_order=True`. This is an unnecessary strict requirement that has adversarial side-effects, and is strictly not required as there's no global ordering between the blocks established. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <[email protected]> Signed-off-by: 400Ping <[email protected]>
## Description This removes the requirement for pipelines having `Sort` operations to actually require `preserve_order=True`. This is an unnecessary strict requirement that has adversarial side-effects, and is strictly not required as there's no global ordering between the blocks established. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <[email protected]> Signed-off-by: Adel Nour <[email protected]>
## Description This removes the requirement for pipelines having `Sort` operations to actually require `preserve_order=True`. This is an unnecessary strict requirement that has adversarial side-effects, and is strictly not required as there's no global ordering between the blocks established. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <[email protected]> Signed-off-by: peterxcli <[email protected]>
## Description This removes the requirement for pipelines having `Sort` operations to actually require `preserve_order=True`. This is an unnecessary strict requirement that has adversarial side-effects, and is strictly not required as there's no global ordering between the blocks established. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <[email protected]> Signed-off-by: peterxcli <[email protected]>
Description
This removes the requirement for pipelines having
Sortoperations to actually requirepreserve_order=True.This is an unnecessary strict requirement that has adversarial side-effects, and is strictly not required as there's no global ordering between the blocks established.
Related issues
Additional information