Add local AI review skill (/ai-review-local) by igerber · Pull Request #227 · igerber/diff-diff

igerber · 2026-03-21T22:37:15Z

Summary

Add /ai-review-local skill that calls the OpenAI Chat Completions API (gpt-5.4) to review code changes locally before opening a PR
Add .claude/scripts/openai_review.py — standalone Python script (stdlib only) that compiles the review prompt, selectively extracts REGISTRY.md sections via prefix-based matching, and calls the API
Enables iterative review/revision cycles with Claude Code to catch methodology, edge-case, and pattern issues before CI review
Supports re-review mode (automatically includes previous review for follow-up), --full-registry, --dry-run, and --model override
Update .gitignore to allow .claude/scripts/ to be committed
Update CLAUDE.md workflow to include /ai-review-local in pre-submission steps

Methodology references (required if estimator / math changes)

Method name(s): N/A — no methodology changes
Paper / source link(s): N/A
Any intentional deviations from the source (and why): None

Validation

Tests added/updated: No test changes (tooling/infrastructure only)
Dry-run verified: prompt compilation, REGISTRY.md section extraction, re-review mode
Live tested: end-to-end API call with gpt-5.4, re-review cycle confirmed previous findings tracked
Backtest / simulation / notebook evidence (if applicable): N/A

Security / privacy

Confirm no secrets/PII in this PR: Yes
Note: The script reads OPENAI_API_KEY from the environment and never logs it. Diffs and selected registry text are transmitted to OpenAI when not using --dry-run.

Generated with Claude Code

New skill that calls the OpenAI Chat Completions API to review code changes locally before opening a PR. Enables iterative review/revision cycles with Claude Code to catch methodology, edge-case, and pattern issues before CI review. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

- Fix missing f-string on timeout error message - Clarify docs: skill creates review artifacts, may auto-commit - Step 7: offer to address findings at any severity, not just P0/P1 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

github-actions · 2026-03-21T22:46:40Z

PR Review

Overall assessment: ⛔ Blocker

Executive Summary

No estimator, weighting, variance/SE, or identification logic changed in this PR, so the core Methodology Registry deviation rules are not directly implicated.
Severity P0: /ai-review-local uploads the branch diff to OpenAI but does not run the repo’s canonical secret scan first, creating a new accidental-secret exfiltration path.
Severity P1: base-branch detection uses @{upstream} as if it were the PR base; on a normal pushed feature branch that commonly resolves to the current branch name, so the generated diff can be empty.
Severity P1: CLAUDE.md now tells users to run /ai-review-local before /pre-merge-check, but /ai-review-local may auto-commit everything and /pre-merge-check only inspects uncommitted/staged/untracked changes, so the second safeguard can silently do nothing.

Methodology

No estimator/math/SE changes were introduced. One tooling issue still affects methodology review coverage:

Severity P2 — Selective registry extraction misses visualization submodules. Impact: ._needed_sections() in .claude/scripts/openai_review.py:L64-L75 strips diff_diff/visualization/_event_study.py down to _event_study.py, so the existing "visualization" mapping in .claude/scripts/openai_review.py:L21-L50 never fires. In default selective mode, local reviews of plotting changes omit the Event Study Plotting registry section. Concrete fix: match on repo-relative path (or directory + basename) instead of basename only, and add a regression test for diff_diff/visualization/_event_study.py.

Code Quality

I did not treat use of v1/chat/completions itself as a defect; Chat Completions remains a documented OpenAI API endpoint. (platform.openai.com)

Severity P1 — Incorrect comparison base in .claude/commands/ai-review-local.md:L66-L97. Impact: git rev-parse --abbrev-ref @{upstream} | sed 's|origin/||' turns a common upstream like origin/feature/foo into feature/foo, and the later git diff "${base_branch}...HEAD" then compares the branch to itself. Users on a pushed feature branch can get a false “No committed changes” result even when there are commits to review. The repo already uses a safer comparison-ref pattern in .claude/commands/push-pr-update.md:L57-L83. Concrete fix: resolve a real comparison ref from the repo default branch / PR base, or use @{u}..HEAD only when you intentionally mean “ahead of upstream”; add regression coverage for a branch tracking origin/<feature>.

Performance

No material performance findings in the diff.

Maintainability

Severity P1 — The new documented workflow disables the next safeguard. Impact: CLAUDE.md now says to run /ai-review-local before /pre-merge-check in .CLAUDE.md:L146-L150, but /ai-review-local may auto-commit all changes in .claude/commands/ai-review-local.md:L77-L92, while /pre-merge-check only inspects git diff HEAD, git diff --cached, and untracked files in .claude/commands/pre-merge-check.md:L12-L20. Following the documented sequence can therefore make /pre-merge-check inspect an empty working tree. Concrete fix: either make /ai-review-local read-only/non-committing, or move /pre-merge-check back before it and teach /pre-merge-check to handle committed diff ranges.
Severity P3 — Dead commit-log path / misleading prompt adaptation. Impact: Step 3 writes /tmp/ai-review-commits.txt in .claude/commands/ai-review-local.md:L87-L92, and _adapt_review_criteria() says commit messages are used in .claude/scripts/openai_review.py:L158-L163, but compile_prompt() only injects branch_info in .claude/scripts/openai_review.py:L206-L214. Concrete fix: either remove the unused commit-log artifact and wording, or pass commit messages explicitly and delimit them as untrusted context.

Tech Debt

No relevant mitigation was added in TODO.md, and none of the P0/P1 issues above are tracked there, so they remain unmitigated.

Security

Severity P0 — No secret scan before external API upload. Impact: /ai-review-local stages/commits all local changes in .claude/commands/ai-review-local.md:L77-L90, then passes the unified diff to .claude/scripts/openai_review.py in .claude/commands/ai-review-local.md:L117-L127. The script appends that diff to the prompt in .claude/scripts/openai_review.py:L206-L214 and posts it to OpenAI in .claude/scripts/openai_review.py:L234-L252 and .claude/scripts/openai_review.py:L460-L466. Unlike /submit-pr and /push-pr-update, which explicitly run the canonical secret scan from .claude/commands/pre-merge-check.md:L126-L140 before proceeding in .claude/commands/submit-pr.md:L169-L201 and .claude/commands/push-pr-update.md:L90-L110, this new path uploads unscanned diffs. Concrete fix: run the same filename/content secret scan on the exact diff range that will be sent to OpenAI, show only filenames, and abort or require explicit confirmation before the API call.

Documentation/Tests

Severity P3 — The command documentation contradicts actual side effects. Impact: .claude/commands/ai-review-local.md:L19-L23 says steps 1-4 are read-only and no project files are modified, but the command creates .claude/reviews/ in .claude/commands/ai-review-local.md:L59-L62, may commit in .claude/commands/ai-review-local.md:L77-L85, copies review files in .claude/commands/ai-review-local.md:L103-L108, writes a new review artifact in .claude/commands/ai-review-local.md:L117-L123, and even lists “User aborts due to uncommitted changes” in .claude/commands/ai-review-local.md:L195-L202 without an actual abort path. Concrete fix: rewrite the constraints/error-handling text to describe the real side effects and add an explicit confirm/abort step before any commit.
Severity P3 — No automated coverage for the new script/command. Impact: there are no tests referencing openai_review or ai-review-local, and the branch-resolution and registry-path bugs above are exactly the sort of regressions lightweight unit tests would catch. Concrete fix: add tests for _needed_sections() on nested visualization paths and for comparison-ref selection on a feature branch tracking origin/<feature>.

Path to Approval

Add a pre-upload secret scan using the canonical /pre-merge-check patterns against the exact diff range to be sent to OpenAI, and abort or require explicit confirmation on matches.
Fix comparison-ref selection so local review compares against the PR base/default branch rather than the current branch’s upstream name; add a regression test for a pushed feature branch tracking origin/<feature>.
Remove the workflow contradiction between /ai-review-local and /pre-merge-check: either make /ai-review-local non-committing/read-only, or document/run /pre-merge-check first and extend it to inspect committed diff ranges.

- P0: Add secret scan (canonical patterns) before API upload - P1: Fix base branch detection using gh repo view instead of @{upstream} - P1: Swap workflow order: /pre-merge-check before /ai-review-local - P2: Fix visualization submodule path matching in _needed_sections() - P3: Remove dead commit-log artifact, fix constraints section wording, remove misleading commit-message text from prompt adaptation Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

- Step 7: show full findings summary grouped by severity before prompting - Add data-transmission privacy note to skill docs - Add maintenance comment on PREFIX_TO_SECTIONS mapping Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

- Add 29 unit tests for openai_review.py (registry extraction, path matching, prompt adaptation, compilation, mapping coverage) - Validate prompt substitutions warn on source drift - Document skill/script contract in module docstring - Add maintenance comment on PREFIX_TO_SECTIONS mapping Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

igerber · 2026-03-21T23:18:44Z

/ai-review

github-actions · 2026-03-21T23:25:42Z

🔁 AI review rerun (requested by @igerber)

Head SHA: a15caf369aa758bb28417d4df4f664bef7c77733

Overall assessment: ⚠️ Needs changes

Executive summary

No estimator, weighting, variance/SE, identification, or default-behavior code changed here, so I found no methodology defects in the library itself.
The prior P0 about uploading an unscanned diff is addressed: Step 3b now adds a pre-upload secret scan in .claude/commands/ai-review-local.md:L104-L129.
The prior registry-selection gap for visualization files also looks fixed: _needed_sections() now maps diff_diff/visualization/*, with regression coverage in tests/test_openai_review.py:L65-L76 and tests/test_openai_review.py:L283-L292.
Severity P1: /ai-review-local still assumes the repo default branch exists as a local ref; on clones that only have origin/<default-branch>, the diff/scan path can fail or short-circuit instead of reviewing the branch.
Severity P1: the command still tells users to run /pre-merge-check after /ai-review-local, even though /ai-review-local may auto-commit first and /pre-merge-check only inspects working-tree changes.

Methodology
No findings. No estimator/math/SE/assumption changes were introduced. The methodology-review helper’s prior visualization omission appears fixed in .claude/scripts/openai_review.py:L85-L103.

Code Quality

Severity P1 — Incomplete comparison-ref resolution in .claude/commands/ai-review-local.md:L71-L114. Impact: the command records only the default branch name, then runs git diff "${base_branch}...HEAD" directly. Unlike the existing safer pattern in .claude/commands/push-pr-update.md:L63-L67, it never falls back to origin/<default-branch> or fetches the ref first. On shallow/single-branch clones, that can make diff generation fail; Step 3b suppresses diff errors, and the Python script treats an empty diff file as “No changes to review” in .claude/scripts/openai_review.py:L439-L442. Concrete fix: resolve a validated <comparison-ref> exactly as /push-pr-update does, reuse it for both diff generation and secret scanning, and fail on non-zero git diff instead of treating the result as an empty diff.

Performance
No findings.

Maintainability

Severity P1 — The post-review guidance still points users at /pre-merge-check after /ai-review-local in .claude/commands/ai-review-local.md:L201-L205 and .claude/commands/ai-review-local.md:L216-L220. Impact: Step 3 may auto-commit all changes first in .claude/commands/ai-review-local.md:L81-L89, but /pre-merge-check only inspects uncommitted, staged, and untracked files in .claude/commands/pre-merge-check.md:L12-L20. Following the command’s own next-step text can therefore run the safeguard against an empty tree again. Concrete fix: remove /pre-merge-check from the post-review next steps, or extend /pre-merge-check to accept the committed diff range produced in Step 3 and have /ai-review-local pass that range explicitly.

Tech Debt
No findings. TODO.md does not track the P1 issues above, so they remain unmitigated.

Security
No separate unmitigated finding beyond the comparison-ref P1 above. The direct “upload diff without any pre-upload scan” blocker from the previous review is addressed by .claude/commands/ai-review-local.md:L104-L129.

Documentation/Tests

Severity P3 — The new tests cover the Python helper well, but not the command-level git/ref flow where the remaining failures live: tests/test_openai_review.py:L60-L307 versus .claude/commands/ai-review-local.md:L71-L114. Impact: the broken comparison-ref resolution and the post-review workflow regression can slip through without automated detection. Concrete fix: extract comparison-ref resolution and next-step selection into a testable helper, then add a regression case for a checkout with only origin/<default-branch> and a case where /ai-review-local has already auto-committed the work.

Path to Approval

Replace base_branch with a validated <comparison-ref> (<default-branch> if present locally, else origin/<default-branch>, else fetch), use that ref for both diff generation and secret scanning, and error out on failed git diff instead of converting it into “No changes to review.”
Remove the Step 7 recommendation to run /pre-merge-check after /ai-review-local, or extend /pre-merge-check so it can inspect the committed diff range created by Step 3 and wire /ai-review-local to call it that way.

- Use validated comparison-ref with fallback chain (local → origin → fetch) matching push-pr-update pattern, instead of bare branch name - Remove /pre-merge-check from post-review next steps (it runs before /ai-review-local per CLAUDE.md workflow order) Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

igerber · 2026-03-21T23:29:11Z

/ai-review

github-actions · 2026-03-21T23:35:34Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 085720822930dbad146dfb5c76259099801f6c32

Overall Assessment

✅ Looks good

Executive Summary

Re-review scope: both previously reported P1s appear addressed. .claude/commands/ai-review-local.md now resolves a validated comparison_ref and reuses it for both diff generation and pre-upload secret scanning (.claude/commands/ai-review-local.md:L71-L147).
The prior workflow regression is also fixed: post-review guidance no longer sends users back to /pre-merge-check, and the repo workflow now places /pre-merge-check before /ai-review-local (.claude/commands/ai-review-local.md:L217-L245, CLAUDE.md:L146-L150).
No estimator/math/weighting/variance/SE/identification/default-behavior code changed in diff_diff/, so I found no methodology defects. I spot-checked the helper’s section mapping against existing headings in docs/methodology/REGISTRY.md (.claude/scripts/openai_review.py:L45-L70, docs/methodology/REGISTRY.md:L33-L1829).
The new test file is well targeted for the Python helper: it covers registry mapping, prompt adaptation against the current reviewer prompt, and re-review prompt assembly (tests/test_openai_review.py:L60-L307).
One informational gap remains: the markdown-command orchestration itself still lacks direct regression coverage.

Methodology

No findings. This PR adds local-review tooling and tests only; it does not modify any estimator, weighting, variance/SE, identification assumption, or default behavior in diff_diff/.

Code Quality

No findings. The prior comparison-ref issue appears resolved by the new fallback/validation flow and consistent reuse of comparison_ref throughout the diff/scan steps (.claude/commands/ai-review-local.md:L71-L147).

Performance

No findings.

Maintainability

No findings. The prior /pre-merge-check handoff regression appears resolved: the local-review command now routes clean and non-blocking outcomes forward instead of back into the working-tree-only check (.claude/commands/ai-review-local.md:L217-L245, CLAUDE.md:L146-L150).

Tech Debt

No findings.

Security

No findings. The previous unscanned-upload concern is addressed by the added pre-upload secret scan in .claude/commands/ai-review-local.md:L122-L147.

Documentation/Tests

Severity P3 — The added tests cover .claude/scripts/openai_review.py, but not the command-layer git/ref orchestration in .claude/commands/ai-review-local.md:L71-L147. Impact: future regressions in comparison-ref resolution, fetch-failure handling, or secret-scan sequencing can slip through without automated detection. Concrete fix: extract the comparison-ref/diff-range logic into a small helper or add command-integration coverage for at least the origin/<default_branch>-only and failed-fetch-abort paths.
Validation note: I could not execute tests/test_openai_review.py locally because pytest is not installed in this environment.

CI installs the package via pip into a temp directory, so .claude/scripts/openai_review.py isn't at the expected path. Tests now discover the script via git repo root fallback and skip the entire module when the script is not found. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

igerber and others added 2 commits March 21, 2026 18:36

Fix P2 findings from local AI review

ae4527e

- Fix missing f-string on timeout error message - Clarify docs: skill creates review artifacts, may auto-commit - Step 7: offer to address findings at any severity, not just P0/P1 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

igerber and others added 3 commits March 21, 2026 18:55

igerber merged commit 2f3e2a9 into main Mar 22, 2026
13 checks passed

igerber deleted the pr-improvement branch March 22, 2026 01:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add local AI review skill (/ai-review-local)#227

Add local AI review skill (/ai-review-local)#227
igerber merged 7 commits intomainfrom
pr-improvement

igerber commented Mar 21, 2026

Uh oh!

github-actions bot commented Mar 21, 2026

Uh oh!

igerber commented Mar 21, 2026

Uh oh!

github-actions bot commented Mar 21, 2026

Uh oh!

igerber commented Mar 21, 2026

Uh oh!

github-actions bot commented Mar 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented Mar 21, 2026

Summary

Methodology references (required if estimator / math changes)

Validation

Security / privacy

Uh oh!

github-actions bot commented Mar 21, 2026

PR Review

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Path to Approval

Uh oh!

igerber commented Mar 21, 2026

Uh oh!

github-actions bot commented Mar 21, 2026

Executive summary

Path to Approval

Uh oh!

igerber commented Mar 21, 2026

Uh oh!

github-actions bot commented Mar 21, 2026

Overall Assessment

Executive Summary

Methodology

Code Quality

Performance

Maintainability

Tech Debt

Security

Documentation/Tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant