Fix #245: [Model] StackerCrane by GiggleLiu · Pull Request #730 · CodingThrust/problem-reductions

GiggleLiu · 2026-03-21T12:58:58Z

Summary

add an implementation plan for the StackerCrane model from issue [Model] MinimumStackerCrane #245
execute the plan in follow-up commits on this branch

Fixes #245

codecov · 2026-03-21T13:01:19Z

Codecov Report

❌ Patch coverage is 95.37954% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.58%. Comparing base (a01930d) to head (e938f2d).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
src/models/misc/stacker_crane.rs	92.55%	14 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #730      +/-   ##
==========================================
- Coverage   97.60%   97.58%   -0.03%     
==========================================
  Files         409      411       +2     
  Lines       50893    51196     +303     
==========================================
+ Hits        49676    49958     +282     
- Misses       1217     1238      +21

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

GiggleLiu · 2026-03-21T13:23:59Z

Implementation Summary

Changes

Added the new StackerCrane model in src/models/misc/stacker_crane.rs, including validation, evaluation via shortest connector paths in the mixed graph, registry metadata, and a canonical example.
Wired StackerCrane into the library exports and example database, with unit tests covering construction, serialization, the issue witness, and the global B = 19 unsatisfiability claim.
Extended pred create support for StackerCrane, including examples, validation tests, and schema-driven help coverage for the documented --arcs, --graph, --arc-costs, --edge-lengths, and --bound flags.
Added a paper entry and bibliography references for Stacker Crane in docs/paper/reductions.typ and docs/paper/references.bib.

Deviations from Plan

During verification, the schema-driven pred create StackerCrane help path was still using generic flag names (--biedges, --arc-lengths, --edge-weights). I fixed that by adding StackerCrane-specific help-flag overrides and an integration test for the real CLI help path.
I also strengthened the test coverage beyond the original witness check by brute-forcing the issue instance at B = 19 to confirm it is globally unsatisfiable, as stated in issue [Model] MinimumStackerCrane #245.

Open Questions

No associated rule issue exists yet, so this model still enters the catalog as an orphan until a future reduction is added.

GiggleLiu · 2026-03-21T14:02:46Z

Agentic Review Report

Structural Check

Structural Review: model StackerCrane

Structural Completeness

#	Check	Status
1	Model file exists	PASS
2	`inventory::submit!` present	PASS
3	`Serialize` / `Deserialize` derive on struct	PASS
4	`Problem` trait impl	PASS
5	`SatisfactionProblem` / `OptimizationProblem` impl	PASS
6	`#[cfg(test)]` + `#[path = ...]` test link	PASS
7	Test file exists	PASS
8	Test file has >= 3 test functions	PASS
9	Registered in category `mod.rs`	PASS
10	Re-exported in `models/mod.rs`	PASS
11	`declare_variants!` entry exists	PASS
12	CLI `resolve_alias` entry	PASS — generic registry-based resolution in `problemreductions-cli/src/problem_name.rs:16` means no per-model arm is required
13	CLI `create` support	PASS
14	Canonical model example registered	PASS — wired through `src/models/misc/mod.rs:103` and covered by `src/unit_tests/example_db.rs:84`
15	Paper `display-name` entry	PASS
16	Paper `problem-def` block	PASS
17	Blacklisted auto-generated files absent from diff	PASS
18	Deterministic whitelist check	FAIL — supplied implementation packet reports `Whitelist: fail`
19	Deterministic completeness check	PASS — supplied implementation packet reports `Completeness: pass`

Build Status

make test: PASS
make clippy: PASS

Semantic Review

evaluate(): OK — the implemented decision model is internally coherent: non-permutations, unreachable connectors, and overflow are rejected before the bound check in src/models/misc/stacker_crane.rs:227.
dims(): OK for the implemented encoding — [num_arcs; num_arcs] enumerates arc-order candidates and evaluate() filters to valid permutations in src/models/misc/stacker_crane.rs:268.
Edge cases: ISSUE — try_new() allows num_vertices == 0, and then dims() is empty while closed_walk_length([]) returns Some(0), so the empty mixed graph is accepted as satisfiable even though there is no start vertex for a closed walk in src/models/misc/stacker_crane.rs:76 and src/config.rs:44.
Size getter consistency: OK — num_vertices, num_arcs, and num_edges align with the declared complexity fields in src/models/misc/stacker_crane.rs:121 and src/models/misc/stacker_crane.rs:279.
Brute-force assumptions: OK for the current encoding — brute force is sound over arc-order tuples, but it is a different search space from the linked issue’s walk-sequence model.
Complexity metadata: ISSUE — the model declares num_vertices * 2^num_arcs in src/models/misc/stacker_crane.rs:279, which diverges from the linked issue’s num_vertices^2 * 2^num_arcs.

Issue Compliance

#	Check	Status
1	Problem name matches issue	ISSUE — linked issue specifies `MinimumStackerCrane`, but the PR registers `StackerCrane` in `src/models/misc/stacker_crane.rs:15`
2	Mathematical definition matches	ISSUE — the issue defines a minimum-length closed-walk optimization problem, while the PR implements a bound-satisfaction problem over arc-order permutations in `src/models/misc/stacker_crane.rs:34`
3	Problem type (opt/sat) matches	ISSUE — issue requires `OptimizationProblem` / `SolutionSize<i32>`, but code implements `SatisfactionProblem<bool>` in `src/models/misc/stacker_crane.rs:260`
4	Type parameters match	OK — both are unparameterized mixed-graph models
5	Configuration space matches	ISSUE — issue requires `[num_vertices; num_arcs + num_edges]`, code uses `[num_arcs; num_arcs]` in `src/models/misc/stacker_crane.rs:268`
6	Feasibility check matches	ISSUE — issue expects validation of an explicit walk sequence, code instead evaluates shortest connector paths induced by an arc permutation in `src/models/misc/stacker_crane.rs:231`
7	Objective function matches	ISSUE — issue expects `SolutionSize::Valid(total_length)` / `Invalid`, code returns only a boolean bound check in `src/models/misc/stacker_crane.rs:272`
8	Complexity matches	ISSUE — issue specifies `num_vertices^2 * 2^num_arcs`, while the PR declares `num_vertices * 2^num_arcs` in `src/models/misc/stacker_crane.rs:279` and repeats that in `docs/paper/reductions.typ:3592`

Summary

18/19 structural checks passed.
1/8 issue compliance checks passed.
FAIL — supplied review packet reports deterministic whitelist failure.
ISSUE — the PR is internally consistent as the decision problem StackerCrane, but it does not comply with the saved linked-issue spec for MinimumStackerCrane.
ISSUE — empty mixed-graph instances with num_vertices == 0 are accepted as satisfiable.
ISSUE — complexity metadata does not match the linked issue and is not justified by an implemented exact algorithm.

Quality Check

Quality Review

Design Principles

DRY: ISSUE — the StackerCrane CLI contract is duplicated across the after-help table, example text, usage text, and parser defaults, and those copies already disagree about whether --arc-costs / --edge-lengths are required. See problemreductions-cli/src/cli.rs:269, problemreductions-cli/src/commands/create.rs:590, problemreductions-cli/src/commands/create.rs:1508, problemreductions-cli/src/commands/create.rs:3947, and problemreductions-cli/src/commands/create.rs:4836.
KISS: ISSUE — the create path composes parse_directed_graph and parse_graph, then rejects mismatched inferred vertex counts in problemreductions-cli/src/commands/create.rs:1513. A single mixed-graph vertex-count resolution would be simpler and would avoid the current valid-input failure mode.
HC/LC: ISSUE — the StackerCrane branch is tightly coupled to the inference behavior of two unrelated helpers instead of owning one mixed-graph input contract. That coupling is what makes otherwise valid inputs fail in problemreductions-cli/src/commands/create.rs:1520.

HCI

Error messages: ISSUE — the failure in problemreductions-cli/src/commands/create.rs:1520 says the directed and undirected inputs must “agree on --num-vertices”, but the real problem is that the CLI does not infer a combined vertex count from both sides.
Discoverability: ISSUE — the help line marks --num-vertices as optional in problemreductions-cli/src/cli.rs:269, but omission only works when arcs and edges happen to mention the same maximum vertex index.
Consistency: ISSUE — help/usage present --arc-costs and --edge-lengths as required in problemreductions-cli/src/cli.rs:269 and problemreductions-cli/src/commands/create.rs:1508, while the implementation silently defaults them in problemreductions-cli/src/commands/create.rs:3947 and problemreductions-cli/src/commands/create.rs:4836.
Least surprise: ISSUE — pred create StackerCrane --arcs "5>0" --graph "0-1" --arc-costs 1 --edge-lengths 1 --bound 3 fails today unless --num-vertices 6 is added, even though the mixed input already determines that size.
Feedback: OK — successful creation emits the expected JSON shape, as covered in problemreductions-cli/src/commands/create.rs:5994.

Test Quality

Naive test detection: ISSUE.
problemreductions-cli/tests/cli_tests.rs:136 only checks that help text contains flag names; it does not verify actual optional/required semantics, so the --arc-costs / --edge-lengths drift is invisible to tests.
problemreductions-cli/src/commands/create.rs:5994 covers one symmetric happy path and two parse errors, but it does not cover the valid asymmetric inference path where arcs and edges touch different highest-numbered vertices, which is where the new create logic breaks.
src/unit_tests/models/misc/stacker_crane.rs:27 reuses one canonical fixture for most assertions and never exercises constructor rejection or unreachable-connector branches in src/models/misc/stacker_crane.rs:76 and src/models/misc/stacker_crane.rs:192.

Issues

Critical (Must Fix)

None.

Important (Should Fix)

Valid StackerCrane instances are rejected unless users manually provide --num-vertices whenever the arc list and edge list infer different maxima. The bug is in the interaction between problemreductions-cli/src/commands/create.rs:1513, problemreductions-cli/src/commands/create.rs:3854, and problemreductions-cli/src/commands/create.rs:4772.
The added CLI tests miss that failing inference path, so the new user-facing bug slips through despite dedicated coverage. See problemreductions-cli/tests/cli_tests.rs:136 and problemreductions-cli/src/commands/create.rs:5994.

Minor (Nice to Have)

The help contract is internally inconsistent: problemreductions-cli/src/cli.rs:269 and problemreductions-cli/src/commands/create.rs:1508 imply required length flags, while problemreductions-cli/src/commands/create.rs:3947 and problemreductions-cli/src/commands/create.rs:4836 make them optional.

Summary

Important: pred create StackerCrane rejects valid mixed-graph inputs when arcs and edges infer different vertex counts unless --num-vertices is added manually.
Important: the new tests do not cover that inference path, so the main CLI bug is not being exercised.
Minor: the StackerCrane help/usage text disagrees with the implementation about whether --arc-costs and --edge-lengths are required.

Agentic Feature Tests

Verdict

partial fail for the requested end-to-end CLI path. pred list, pred show StackerCrane, and pred create --example StackerCrane work. Solving the shipped example works only with --solver brute-force; plain pred solve <instance> fails.

Test Results

cargo build -p problemreductions-cli --bin pred: passed.
./target/debug/pred list: passed. Catalog includes StackerCrane * with complexity O(num_vertices * 2^num_arcs).
./target/debug/pred show StackerCrane: partial pass. Description, complexity, fields, and zero incoming/outgoing reductions display correctly.
./target/debug/pred create --example StackerCrane -o /tmp/.../stacker-crane.json: passed. Output JSON has "type": "StackerCrane" and the shipped canonical 6-vertex example.
./target/debug/pred solve /tmp/.../stacker-crane.json: failed. Error: No reduction path from StackerCrane to ILP or ILP solver found no solution. Try --solver brute-force.
./target/debug/pred solve /tmp/.../stacker-crane.json --solver brute-force: passed. Returned solution [0,2,1,4,3] with evaluation true.
./target/debug/pred evaluate /tmp/.../stacker-crane.json --config 0,2,1,4,3: passed. Returned true.
Naming-friction check: ./target/debug/pred show MinimumStackerCrane and ./target/debug/pred create --example MinimumStackerCrane both fail with Unknown problem, while the linked issue still uses MinimumStackerCrane.

Findings

Default pred solve <instance> does not work for the shipped StackerCrane example.
Status: confirmed
Severity: medium
Reproduction: create the example, then run plain pred solve on it; it fails because StackerCrane has no ILP reduction path. The same instance succeeds with --solver brute-force.
Recommended fix: either make plain pred solve fall back to brute-force when ILP is unavailable but brute-force is supported, or surface the brute-force requirement in the primary discovery path for this model so the documented happy path matches actual behavior.
pred show StackerCrane omits size-field metadata.
Status: confirmed
Severity: low
Reproduction: pred show StackerCrane --json returns "size_fields": [], and human-readable pred show StackerCrane shows no Size fields section.
Recommended fix: register size-field metadata in src/models/misc/stacker_crane.rs for at least num_vertices and num_arcs, and likely num_edges, so the CLI can display them.
Linked-issue naming still points users to MinimumStackerCrane, which the CLI does not recognize.
Status: confirmed
Severity: low
Reproduction: pred show MinimumStackerCrane and pred create --example MinimumStackerCrane both return Unknown problem.
Recommended fix: update the linked issue title/metadata and generated review context to use StackerCrane, matching the shipped model name and the CLI catalog.

Generated by review-pipeline

# Conflicts: # docs/paper/reductions.typ # src/lib.rs

- Change complexity from num_vertices * 2^num_arcs to num_vertices^2 * 2^num_arcs (total DP time, not just state space) - Update paper text to say "total time" instead of "state space" - Add tests for try_new() validation errors, unreachable connectors, and invalid deserialization to bring coverage above 95% Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

GiggleLiu · 2026-03-21T14:51:25Z

Follow-up items (recorded during final review):

Update issue [Rule] HamiltonianCircuit to MinimumStackerCrane #261 title from "HamiltonianCircuit to MinimumStackerCrane" to "HamiltonianCircuit to StackerCrane" to match the shipped model name
Update issue [Model] MinimumStackerCrane #245 title from "[Model] MinimumStackerCrane" to "[Model] StackerCrane" for consistency

Add plan for #245: [Model] StackerCrane

d8cfc5b

GiggleLiu added 2 commits March 21, 2026 21:23

Fix #245: add StackerCrane model

6ad7cf1

chore: remove plan file after implementation

2545611

GiggleLiu and others added 2 commits March 21, 2026 22:37

Merge remote-tracking branch 'origin/main' into issue-245

c7c1bd6

# Conflicts: # docs/paper/reductions.typ # src/lib.rs

GiggleLiu mentioned this pull request Mar 21, 2026

[Model] MinimumStackerCrane #245

Closed

3 tasks

GiggleLiu merged commit 51e49f5 into main Mar 21, 2026
3 checks passed

GiggleLiu deleted the issue-245 branch April 12, 2026 00:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #245: [Model] StackerCrane#730

Fix #245: [Model] StackerCrane#730
GiggleLiu merged 5 commits intomainfrom
issue-245

GiggleLiu commented Mar 21, 2026

Uh oh!

codecov bot commented Mar 21, 2026 •

edited

Loading

Uh oh!

GiggleLiu commented Mar 21, 2026

Uh oh!

GiggleLiu commented Mar 21, 2026

Uh oh!

GiggleLiu commented Mar 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

GiggleLiu commented Mar 21, 2026

Summary

Uh oh!

codecov bot commented Mar 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

GiggleLiu commented Mar 21, 2026

Implementation Summary

Changes

Deviations from Plan

Open Questions

Uh oh!

GiggleLiu commented Mar 21, 2026

Agentic Review Report

Structural Check

Structural Review: model StackerCrane

Structural Completeness

Build Status

Semantic Review

Issue Compliance

Summary

Quality Check

Quality Review

Design Principles

HCI

Test Quality

Issues

Critical (Must Fix)

Important (Should Fix)

Minor (Nice to Have)

Summary

Agentic Feature Tests

Uh oh!

GiggleLiu commented Mar 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov bot commented Mar 21, 2026 •

edited

Loading