Skip to content

Fix #245: [Model] StackerCrane#730

Merged
GiggleLiu merged 5 commits intomainfrom
issue-245
Mar 21, 2026
Merged

Fix #245: [Model] StackerCrane#730
GiggleLiu merged 5 commits intomainfrom
issue-245

Conversation

@GiggleLiu
Copy link
Copy Markdown
Contributor

Summary

Fixes #245

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 21, 2026

Codecov Report

❌ Patch coverage is 95.37954% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.58%. Comparing base (a01930d) to head (e938f2d).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/models/misc/stacker_crane.rs 92.55% 14 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #730      +/-   ##
==========================================
- Coverage   97.60%   97.58%   -0.03%     
==========================================
  Files         409      411       +2     
  Lines       50893    51196     +303     
==========================================
+ Hits        49676    49958     +282     
- Misses       1217     1238      +21     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@GiggleLiu
Copy link
Copy Markdown
Contributor Author

Implementation Summary

Changes

  • Added the new StackerCrane model in src/models/misc/stacker_crane.rs, including validation, evaluation via shortest connector paths in the mixed graph, registry metadata, and a canonical example.
  • Wired StackerCrane into the library exports and example database, with unit tests covering construction, serialization, the issue witness, and the global B = 19 unsatisfiability claim.
  • Extended pred create support for StackerCrane, including examples, validation tests, and schema-driven help coverage for the documented --arcs, --graph, --arc-costs, --edge-lengths, and --bound flags.
  • Added a paper entry and bibliography references for Stacker Crane in docs/paper/reductions.typ and docs/paper/references.bib.

Deviations from Plan

  • During verification, the schema-driven pred create StackerCrane help path was still using generic flag names (--biedges, --arc-lengths, --edge-weights). I fixed that by adding StackerCrane-specific help-flag overrides and an integration test for the real CLI help path.
  • I also strengthened the test coverage beyond the original witness check by brute-forcing the issue instance at B = 19 to confirm it is globally unsatisfiable, as stated in issue [Model] MinimumStackerCrane #245.

Open Questions

  • No associated rule issue exists yet, so this model still enters the catalog as an orphan until a future reduction is added.

@GiggleLiu
Copy link
Copy Markdown
Contributor Author

Agentic Review Report

Structural Check

Structural Review: model StackerCrane

Structural Completeness

# Check Status
1 Model file exists PASS
2 inventory::submit! present PASS
3 Serialize / Deserialize derive on struct PASS
4 Problem trait impl PASS
5 SatisfactionProblem / OptimizationProblem impl PASS
6 #[cfg(test)] + #[path = ...] test link PASS
7 Test file exists PASS
8 Test file has >= 3 test functions PASS
9 Registered in category mod.rs PASS
10 Re-exported in models/mod.rs PASS
11 declare_variants! entry exists PASS
12 CLI resolve_alias entry PASS — generic registry-based resolution in problemreductions-cli/src/problem_name.rs:16 means no per-model arm is required
13 CLI create support PASS
14 Canonical model example registered PASS — wired through src/models/misc/mod.rs:103 and covered by src/unit_tests/example_db.rs:84
15 Paper display-name entry PASS
16 Paper problem-def block PASS
17 Blacklisted auto-generated files absent from diff PASS
18 Deterministic whitelist check FAIL — supplied implementation packet reports Whitelist: fail
19 Deterministic completeness check PASS — supplied implementation packet reports Completeness: pass

Build Status

  • make test: PASS
  • make clippy: PASS

Semantic Review

  • evaluate(): OK — the implemented decision model is internally coherent: non-permutations, unreachable connectors, and overflow are rejected before the bound check in src/models/misc/stacker_crane.rs:227.
  • dims(): OK for the implemented encoding — [num_arcs; num_arcs] enumerates arc-order candidates and evaluate() filters to valid permutations in src/models/misc/stacker_crane.rs:268.
  • Edge cases: ISSUE — try_new() allows num_vertices == 0, and then dims() is empty while closed_walk_length([]) returns Some(0), so the empty mixed graph is accepted as satisfiable even though there is no start vertex for a closed walk in src/models/misc/stacker_crane.rs:76 and src/config.rs:44.
  • Size getter consistency: OK — num_vertices, num_arcs, and num_edges align with the declared complexity fields in src/models/misc/stacker_crane.rs:121 and src/models/misc/stacker_crane.rs:279.
  • Brute-force assumptions: OK for the current encoding — brute force is sound over arc-order tuples, but it is a different search space from the linked issue’s walk-sequence model.
  • Complexity metadata: ISSUE — the model declares num_vertices * 2^num_arcs in src/models/misc/stacker_crane.rs:279, which diverges from the linked issue’s num_vertices^2 * 2^num_arcs.

Issue Compliance

# Check Status
1 Problem name matches issue ISSUE — linked issue specifies MinimumStackerCrane, but the PR registers StackerCrane in src/models/misc/stacker_crane.rs:15
2 Mathematical definition matches ISSUE — the issue defines a minimum-length closed-walk optimization problem, while the PR implements a bound-satisfaction problem over arc-order permutations in src/models/misc/stacker_crane.rs:34
3 Problem type (opt/sat) matches ISSUE — issue requires OptimizationProblem / SolutionSize<i32>, but code implements SatisfactionProblem<bool> in src/models/misc/stacker_crane.rs:260
4 Type parameters match OK — both are unparameterized mixed-graph models
5 Configuration space matches ISSUE — issue requires [num_vertices; num_arcs + num_edges], code uses [num_arcs; num_arcs] in src/models/misc/stacker_crane.rs:268
6 Feasibility check matches ISSUE — issue expects validation of an explicit walk sequence, code instead evaluates shortest connector paths induced by an arc permutation in src/models/misc/stacker_crane.rs:231
7 Objective function matches ISSUE — issue expects SolutionSize::Valid(total_length) / Invalid, code returns only a boolean bound check in src/models/misc/stacker_crane.rs:272
8 Complexity matches ISSUE — issue specifies num_vertices^2 * 2^num_arcs, while the PR declares num_vertices * 2^num_arcs in src/models/misc/stacker_crane.rs:279 and repeats that in docs/paper/reductions.typ:3592

Summary

  • 18/19 structural checks passed.
  • 1/8 issue compliance checks passed.
  • FAIL — supplied review packet reports deterministic whitelist failure.
  • ISSUE — the PR is internally consistent as the decision problem StackerCrane, but it does not comply with the saved linked-issue spec for MinimumStackerCrane.
  • ISSUE — empty mixed-graph instances with num_vertices == 0 are accepted as satisfiable.
  • ISSUE — complexity metadata does not match the linked issue and is not justified by an implemented exact algorithm.

Quality Check

Quality Review

Design Principles

  • DRY: ISSUE — the StackerCrane CLI contract is duplicated across the after-help table, example text, usage text, and parser defaults, and those copies already disagree about whether --arc-costs / --edge-lengths are required. See problemreductions-cli/src/cli.rs:269, problemreductions-cli/src/commands/create.rs:590, problemreductions-cli/src/commands/create.rs:1508, problemreductions-cli/src/commands/create.rs:3947, and problemreductions-cli/src/commands/create.rs:4836.
  • KISS: ISSUE — the create path composes parse_directed_graph and parse_graph, then rejects mismatched inferred vertex counts in problemreductions-cli/src/commands/create.rs:1513. A single mixed-graph vertex-count resolution would be simpler and would avoid the current valid-input failure mode.
  • HC/LC: ISSUE — the StackerCrane branch is tightly coupled to the inference behavior of two unrelated helpers instead of owning one mixed-graph input contract. That coupling is what makes otherwise valid inputs fail in problemreductions-cli/src/commands/create.rs:1520.

HCI

  • Error messages: ISSUE — the failure in problemreductions-cli/src/commands/create.rs:1520 says the directed and undirected inputs must “agree on --num-vertices”, but the real problem is that the CLI does not infer a combined vertex count from both sides.
  • Discoverability: ISSUE — the help line marks --num-vertices as optional in problemreductions-cli/src/cli.rs:269, but omission only works when arcs and edges happen to mention the same maximum vertex index.
  • Consistency: ISSUE — help/usage present --arc-costs and --edge-lengths as required in problemreductions-cli/src/cli.rs:269 and problemreductions-cli/src/commands/create.rs:1508, while the implementation silently defaults them in problemreductions-cli/src/commands/create.rs:3947 and problemreductions-cli/src/commands/create.rs:4836.
  • Least surprise: ISSUE — pred create StackerCrane --arcs "5>0" --graph "0-1" --arc-costs 1 --edge-lengths 1 --bound 3 fails today unless --num-vertices 6 is added, even though the mixed input already determines that size.
  • Feedback: OK — successful creation emits the expected JSON shape, as covered in problemreductions-cli/src/commands/create.rs:5994.

Test Quality

  • Naive test detection: ISSUE.
  • problemreductions-cli/tests/cli_tests.rs:136 only checks that help text contains flag names; it does not verify actual optional/required semantics, so the --arc-costs / --edge-lengths drift is invisible to tests.
  • problemreductions-cli/src/commands/create.rs:5994 covers one symmetric happy path and two parse errors, but it does not cover the valid asymmetric inference path where arcs and edges touch different highest-numbered vertices, which is where the new create logic breaks.
  • src/unit_tests/models/misc/stacker_crane.rs:27 reuses one canonical fixture for most assertions and never exercises constructor rejection or unreachable-connector branches in src/models/misc/stacker_crane.rs:76 and src/models/misc/stacker_crane.rs:192.

Issues

Critical (Must Fix)

  • None.

Important (Should Fix)

  • Valid StackerCrane instances are rejected unless users manually provide --num-vertices whenever the arc list and edge list infer different maxima. The bug is in the interaction between problemreductions-cli/src/commands/create.rs:1513, problemreductions-cli/src/commands/create.rs:3854, and problemreductions-cli/src/commands/create.rs:4772.
  • The added CLI tests miss that failing inference path, so the new user-facing bug slips through despite dedicated coverage. See problemreductions-cli/tests/cli_tests.rs:136 and problemreductions-cli/src/commands/create.rs:5994.

Minor (Nice to Have)

  • The help contract is internally inconsistent: problemreductions-cli/src/cli.rs:269 and problemreductions-cli/src/commands/create.rs:1508 imply required length flags, while problemreductions-cli/src/commands/create.rs:3947 and problemreductions-cli/src/commands/create.rs:4836 make them optional.

Summary

  • Important: pred create StackerCrane rejects valid mixed-graph inputs when arcs and edges infer different vertex counts unless --num-vertices is added manually.
  • Important: the new tests do not cover that inference path, so the main CLI bug is not being exercised.
  • Minor: the StackerCrane help/usage text disagrees with the implementation about whether --arc-costs and --edge-lengths are required.

Agentic Feature Tests

Verdict

partial fail for the requested end-to-end CLI path. pred list, pred show StackerCrane, and pred create --example StackerCrane work. Solving the shipped example works only with --solver brute-force; plain pred solve <instance> fails.

Test Results

  • cargo build -p problemreductions-cli --bin pred: passed.
  • ./target/debug/pred list: passed. Catalog includes StackerCrane * with complexity O(num_vertices * 2^num_arcs).
  • ./target/debug/pred show StackerCrane: partial pass. Description, complexity, fields, and zero incoming/outgoing reductions display correctly.
  • ./target/debug/pred create --example StackerCrane -o /tmp/.../stacker-crane.json: passed. Output JSON has "type": "StackerCrane" and the shipped canonical 6-vertex example.
  • ./target/debug/pred solve /tmp/.../stacker-crane.json: failed. Error: No reduction path from StackerCrane to ILP or ILP solver found no solution. Try --solver brute-force.
  • ./target/debug/pred solve /tmp/.../stacker-crane.json --solver brute-force: passed. Returned solution [0,2,1,4,3] with evaluation true.
  • ./target/debug/pred evaluate /tmp/.../stacker-crane.json --config 0,2,1,4,3: passed. Returned true.
  • Naming-friction check: ./target/debug/pred show MinimumStackerCrane and ./target/debug/pred create --example MinimumStackerCrane both fail with Unknown problem, while the linked issue still uses MinimumStackerCrane.

Findings

  1. Default pred solve <instance> does not work for the shipped StackerCrane example.
    Status: confirmed
    Severity: medium
    Reproduction: create the example, then run plain pred solve on it; it fails because StackerCrane has no ILP reduction path. The same instance succeeds with --solver brute-force.
    Recommended fix: either make plain pred solve fall back to brute-force when ILP is unavailable but brute-force is supported, or surface the brute-force requirement in the primary discovery path for this model so the documented happy path matches actual behavior.

  2. pred show StackerCrane omits size-field metadata.
    Status: confirmed
    Severity: low
    Reproduction: pred show StackerCrane --json returns "size_fields": [], and human-readable pred show StackerCrane shows no Size fields section.
    Recommended fix: register size-field metadata in src/models/misc/stacker_crane.rs for at least num_vertices and num_arcs, and likely num_edges, so the CLI can display them.

  3. Linked-issue naming still points users to MinimumStackerCrane, which the CLI does not recognize.
    Status: confirmed
    Severity: low
    Reproduction: pred show MinimumStackerCrane and pred create --example MinimumStackerCrane both return Unknown problem.
    Recommended fix: update the linked issue title/metadata and generated review context to use StackerCrane, matching the shipped model name and the CLI catalog.


Generated by review-pipeline

GiggleLiu and others added 2 commits March 21, 2026 22:37
# Conflicts:
#	docs/paper/reductions.typ
#	src/lib.rs
- Change complexity from num_vertices * 2^num_arcs to
  num_vertices^2 * 2^num_arcs (total DP time, not just state space)
- Update paper text to say "total time" instead of "state space"
- Add tests for try_new() validation errors, unreachable connectors,
  and invalid deserialization to bring coverage above 95%

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@GiggleLiu
Copy link
Copy Markdown
Contributor Author

Follow-up items (recorded during final review):

@GiggleLiu GiggleLiu mentioned this pull request Mar 21, 2026
3 tasks
@GiggleLiu GiggleLiu merged commit 51e49f5 into main Mar 21, 2026
3 checks passed
@GiggleLiu GiggleLiu deleted the issue-245 branch April 12, 2026 00:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Model] MinimumStackerCrane

1 participant