fix: evaluate expressions in WHERE, UPDATE, and UPSERT; wire ALTER COLLECTION column ops#26
Merged
farhan-syah merged 6 commits intomainfrom Apr 14, 2026
Merged
Conversation
…dules Move the monolithic `expr.rs` into an `expr/` directory (eval, codec, binary, types submodules) so each concern lives in its own file. Extract `FilterOp` and the `ScanFilter`/`CompareValue` types from `scan_filter/mod.rs` into `op.rs` and `types.rs` respectively, leaving `mod.rs` as pure re-exports. Add `like` and `parse` as explicit submodule declarations now that the types they depend on are in separate files. `msgpack_scan/filter.rs` updated to import from the new paths.
…ssion evaluation Introduce `UpdateValue` on the bridge — a tagged union carrying either pre-encoded msgpack bytes (fast literal path) or a `SqlExpr` that the Data Plane evaluates against the existing row at apply time. Planner changes: - `SqlPlan::Upsert` gains an `on_conflict_updates` field for the SET assignments extracted from `ON CONFLICT (...) DO UPDATE SET`. - `dml::plan_upsert_with_on_conflict` routes `INSERT ... ON CONFLICT DO UPDATE SET` through the upsert path, carrying the assignments. - `assignments_to_update_values` converts planner `SqlExpr` to `UpdateValue`, choosing `Literal` for constant RHS and `Expr` for anything that must be evaluated per-row. - `ScanFilter` gains an `expr: Option<SqlExpr>` field so complex WHERE clauses (scalar functions, arithmetic, non-literal BETWEEN) that the planner cannot reduce to simple triples are shipped verbatim to the Data Plane for row-level evaluation. - `sql_expr_to_bridge_expr` extended to cover `UnaryOp`, `IsNull`, `Between`, `Substring`, `Trim`, and `Case` expressions. - `sql_plan_convert/filter.rs` populates the new `expr` field for general-purpose `FilterExpr::Expr` predicates. Executor changes: - `bulk_dml` and `upsert` handlers receive `&[(String, UpdateValue)]` instead of `&[(String, Vec<u8>)]`; literal arms are identical to the previous fast path, expression arms evaluate against a pre-update snapshot of the row (matching PostgreSQL semantics — assignments do not observe each other). - Empty `filter_bytes` in `bulk_dml` now means "no WHERE clause" instead of a deserialization error. - `check_generated_readonly` and `needs_recomputation` generified over the update-value type so both call sites compile without duplication. All existing call sites updated to pass `expr: None` on the `ScanFilter` constructor and supply the required `on_conflict_updates` slice.
value.rs (509 lines) → value/ with assignments, convert, defaults, msgpack_write, rows, and time_range submodules. alter.rs (441 lines) → alter/ with add_column, alter_type, drop_column, enforcement, materialized_sum, and rename_column submodules. DDL router schema.rs gains explicit routing for ADD COLUMN, DROP COLUMN, and RENAME COLUMN so requests reach the handlers in the new module layout. point.rs (696 lines) → point/ with apply_put, delete, get, put, and update submodules. All files remain under 200 lines. No behaviour changes.
…TE and WHERE sql_update_expressions.rs — covers UPDATE statements where the RHS is a non-literal expression (column arithmetic, scalar functions, NOW()). These catch the silent corruption bug where expression RHS was serialized as a debug string instead of being evaluated against the current row. sql_where_expressions.rs — covers WHERE clauses containing scalar function calls (LOWER, UPPER, LENGTH, arithmetic). These must be evaluated per-row by the scan filter, not silently dropped through a match-all fall-through. sql_transactions.rs — covers basic transaction semantics: COMMIT persists buffered writes and ROLLBACK discards them. Existing executor tests updated to pass the new `on_conflict_updates` and `expr` fields added to the physical plan types.
jieba-rs 0.8 pulled in include-flate → libflate → core2 0.4.0, which was yanked from crates.io and caused CI to fail with a version-yanked error during cargo clippy. jieba-rs 0.9 removes the libflate dependency entirely, eliminating core2 from the transitive graph.
core2 0.4.0 is yanked on crates.io and no 0.4.1+ release exists. The libflate / libflate_lz77 chain (pulled in transitively via nodedb-fts' optional jieba feature → include-flate → include-flate-compress) still requires core2 ^0.4, so any fresh dependency resolution — such as CI running without a committed lockfile — fails with "version 0.4.0 is yanked". Because NodeDB is a library workspace (nodedb-fts and others are published to crates.io), committing Cargo.lock is not appropriate. Patching directly to the upstream technocreatives/core2 git repo at the commit that corresponds to 0.4.0 lets cargo bypass the registry yank check while building bit-identical code. Verified: `cargo check --workspace --all-features` and `cargo clippy --workspace --all-targets --all-features -- -D warnings` both succeed with a fresh lockfile. Remove this patch once libflate releases a version that drops the core2 dependency.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes three reported silent-correctness bugs (#22, #23, #24) and expands coverage to the full class of failures they represent. The core change is wiring the existing
nodedb_query::expr::SqlExprevaluator into every place the planner previously gave up on non-trivial expressions.What was broken
WHERE(LOWER(col) = 'x',qty + 1 = 5,NOT (...),IN (2+1, 7),BETWEEN lo AND hi) silently fell through toFilterOp::MatchAll.DELETE WHERE LOWER(...)would wipe the table;UPDATE WHERE LOWER(...)would touch every row.UPDATE t SET n = n + 1(andNOW(),UPPER(col),CASE, concat,LEAST(...)) wroteformat!(\"{expr:?}\")into the column instead of evaluating. Strict errored on re-encode; schemaless silently stored the debug AST as a string.ALTER COLLECTION x ADD COLUMN y ...(and docs-advertisedDROP COLUMN/RENAME COLUMN/ALTER COLUMN TYPE) were rejected at parse time because the pgwire DDL router had no branch for them.What changed
Scan filter path (
nodedb-query/src/scan_filter,nodedb/src/control/planner/sql_plan_convert/filter.rs)FilterOp::Exprvariant carrying aSqlExprpayload onScanFilter.expr_filter(root)for any WHERE expression it cannot reduce to(field, op, value)instead of falling through to match-all. Every fall-through insql_expr_to_scan_filtersnow routes through the evaluator.matches_value,matches_binary, andmatches_binary_indexedall evaluateFilterOp::Expragainst the row viaSqlExpr::eval.NOT,BETWEEN,InList,Like,IsNull) into the evaluator's equivalent AST nodes duringsql_expr_to_bridge_expr.filter_bytesas "match every row" instead of a zerompk decode error.UPDATE assignment path (
bridge::physical_plan::UpdateValue,handlers/point/update.rs,handlers/bulk_dml.rs)UpdateValue::{Literal(Vec<u8>), Expr(SqlExpr)}enum carrying both pre-encoded literals and unevaluated expressions across the SPSC bridge.PointUpdate.updatesandBulkUpdate.updatestakeVec<(String, UpdateValue)>.assignments_to_update_values) converts non-literal RHS viasql_expr_to_bridge_expr; KVFieldSetnow errors loudly on non-literal RHS instead of silently dropping fields.Exprassignment against that snapshot (PostgreSQL-style single-pass semantics), and emits the result into the doc.check_generated_readonly/needs_recomputationhelpers made generic over the payload type so literal and expression updates share one implementation.UPSERT ON CONFLICT DO UPDATE (
plan_insert,SqlPlan::Upsert,DocumentOp::Upsert,handlers/upsert.rs)plan_insertdetectsON CONFLICT ... DO UPDATE SETin the AST and reroutes to a newplan_upsert_with_on_conflictpath that carries the assignments.UpsertParams,SqlPlan::Upsert, andDocumentOp::Upsertall gainon_conflict_updates: Vec<(String, UpdateValue)>.execute_upsertapplies those assignments against the existing row (evaluating expressions against a pre-update snapshot) instead of merging inserted fields.ALTER COLLECTION column ops (
pgwire/ddl/collection/alter/)ALTER COLLECTION ADD COLUMN,DROP COLUMN,RENAME COLUMN, andALTER COLUMN ... TYPE ....ADD COLUMNreuses the existing strict-schema handler.DROP/RENAME/ALTER TYPEare new handlers that mutateStrictSchema, bump version, propose via Raft, and refresh the Data Plane cache. Type changes are restricted to same-discriminant aliases (e.g.INT ↔ BIGINT); cross-type changes error loudly with an explanation rather than silently corrupting existing rows.Structural hygiene on the fix path
Four pre-existing size violations were on files this fix needed to touch. Each was split into a directory before new code was added:
sql_plan_convert/value.rsvalue/{mod,convert,msgpack_write,assignments,rows,defaults,time_range}.rs(max 152)nodedb-query/src/expr.rsexpr/{mod,types,codec,binary,eval}.rs(max 285)scan_filter/mod.rsscan_filter/{mod,op,types,like,parse}.rs(max 203)handlers/point.rspoint/{mod,get,put,delete,update,apply_put}.rs(max 265)collection/alter.rsalter/{mod,add_column,drop_column,rename_column,alter_type,enforcement,materialized_sum}.rsmod.rsfiles arepub mod+pub useonly per project convention.Shared evaluator hygiene (
nodedb-query/src/expr/eval.rs)eval()andeval_with_old()previously duplicated ~140 lines of identical AST walking with a different column resolver. Extracted a privateRowScopestruct and a sharedeval_scopewalker; public methods now just build a scope and delegate. One traversal, one place to fix..expect(\"list non-empty\")in theInListdesugaring with a.reduce(...).unwrap_or(...)that falls back to the already-computed empty-list sentinel.Test plan
All three originally-reported bugs have positive regression coverage asserting the correct behavior (not the current broken behavior). The coverage was widened along the same root-cause code paths — filter.rs fall-throughs, value.rs catch-all, router gaps — to catch every docs-advertised sibling of each bug.
nodedb/tests/sql_transactions.rs— ALTER ADD/DROP/RENAME/ALTER TYPE on strict collectionsnodedb/tests/sql_where_expressions.rs— LOWER/UPPER/NOT/BETWEEN/IN-with-expression/column-arithmetic in WHERE, including DELETE and UPDATE scopingnodedb/tests/sql_update_expressions.rs— column arithmetic, NOW(), UPPER, concat, CASE, LEAST on strict + schemaless, plus UPSERT DO UPDATE SETCloses