Skip to content

Add local AI review skill (/ai-review-local)#227

Merged
igerber merged 7 commits intomainfrom
pr-improvement
Mar 22, 2026
Merged

Add local AI review skill (/ai-review-local)#227
igerber merged 7 commits intomainfrom
pr-improvement

Conversation

@igerber
Copy link
Owner

@igerber igerber commented Mar 21, 2026

Summary

  • Add /ai-review-local skill that calls the OpenAI Chat Completions API (gpt-5.4) to review code changes locally before opening a PR
  • Add .claude/scripts/openai_review.py — standalone Python script (stdlib only) that compiles the review prompt, selectively extracts REGISTRY.md sections via prefix-based matching, and calls the API
  • Enables iterative review/revision cycles with Claude Code to catch methodology, edge-case, and pattern issues before CI review
  • Supports re-review mode (automatically includes previous review for follow-up), --full-registry, --dry-run, and --model override
  • Update .gitignore to allow .claude/scripts/ to be committed
  • Update CLAUDE.md workflow to include /ai-review-local in pre-submission steps

Methodology references (required if estimator / math changes)

  • Method name(s): N/A — no methodology changes
  • Paper / source link(s): N/A
  • Any intentional deviations from the source (and why): None

Validation

  • Tests added/updated: No test changes (tooling/infrastructure only)
  • Dry-run verified: prompt compilation, REGISTRY.md section extraction, re-review mode
  • Live tested: end-to-end API call with gpt-5.4, re-review cycle confirmed previous findings tracked
  • Backtest / simulation / notebook evidence (if applicable): N/A

Security / privacy

  • Confirm no secrets/PII in this PR: Yes
  • Note: The script reads OPENAI_API_KEY from the environment and never logs it. Diffs and selected registry text are transmitted to OpenAI when not using --dry-run.

Generated with Claude Code

igerber and others added 2 commits March 21, 2026 18:36
New skill that calls the OpenAI Chat Completions API to review code
changes locally before opening a PR. Enables iterative review/revision
cycles with Claude Code to catch methodology, edge-case, and pattern
issues before CI review.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Fix missing f-string on timeout error message
- Clarify docs: skill creates review artifacts, may auto-commit
- Step 7: offer to address findings at any severity, not just P0/P1

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@github-actions
Copy link

PR Review

Overall assessment: ⛔ Blocker

Executive Summary

  • No estimator, weighting, variance/SE, or identification logic changed in this PR, so the core Methodology Registry deviation rules are not directly implicated.
  • Severity P0: /ai-review-local uploads the branch diff to OpenAI but does not run the repo’s canonical secret scan first, creating a new accidental-secret exfiltration path.
  • Severity P1: base-branch detection uses @{upstream} as if it were the PR base; on a normal pushed feature branch that commonly resolves to the current branch name, so the generated diff can be empty.
  • Severity P1: CLAUDE.md now tells users to run /ai-review-local before /pre-merge-check, but /ai-review-local may auto-commit everything and /pre-merge-check only inspects uncommitted/staged/untracked changes, so the second safeguard can silently do nothing.

Methodology

No estimator/math/SE changes were introduced. One tooling issue still affects methodology review coverage:

  • Severity P2 — Selective registry extraction misses visualization submodules. Impact: ._needed_sections() in .claude/scripts/openai_review.py:L64-L75 strips diff_diff/visualization/_event_study.py down to _event_study.py, so the existing "visualization" mapping in .claude/scripts/openai_review.py:L21-L50 never fires. In default selective mode, local reviews of plotting changes omit the Event Study Plotting registry section. Concrete fix: match on repo-relative path (or directory + basename) instead of basename only, and add a regression test for diff_diff/visualization/_event_study.py.

Code Quality

I did not treat use of v1/chat/completions itself as a defect; Chat Completions remains a documented OpenAI API endpoint. (platform.openai.com)

  • Severity P1 — Incorrect comparison base in .claude/commands/ai-review-local.md:L66-L97. Impact: git rev-parse --abbrev-ref @{upstream} | sed 's|origin/||' turns a common upstream like origin/feature/foo into feature/foo, and the later git diff "${base_branch}...HEAD" then compares the branch to itself. Users on a pushed feature branch can get a false “No committed changes” result even when there are commits to review. The repo already uses a safer comparison-ref pattern in .claude/commands/push-pr-update.md:L57-L83. Concrete fix: resolve a real comparison ref from the repo default branch / PR base, or use @{u}..HEAD only when you intentionally mean “ahead of upstream”; add regression coverage for a branch tracking origin/<feature>.

Performance

  • No material performance findings in the diff.

Maintainability

  • Severity P1 — The new documented workflow disables the next safeguard. Impact: CLAUDE.md now says to run /ai-review-local before /pre-merge-check in .CLAUDE.md:L146-L150, but /ai-review-local may auto-commit all changes in .claude/commands/ai-review-local.md:L77-L92, while /pre-merge-check only inspects git diff HEAD, git diff --cached, and untracked files in .claude/commands/pre-merge-check.md:L12-L20. Following the documented sequence can therefore make /pre-merge-check inspect an empty working tree. Concrete fix: either make /ai-review-local read-only/non-committing, or move /pre-merge-check back before it and teach /pre-merge-check to handle committed diff ranges.
  • Severity P3 — Dead commit-log path / misleading prompt adaptation. Impact: Step 3 writes /tmp/ai-review-commits.txt in .claude/commands/ai-review-local.md:L87-L92, and _adapt_review_criteria() says commit messages are used in .claude/scripts/openai_review.py:L158-L163, but compile_prompt() only injects branch_info in .claude/scripts/openai_review.py:L206-L214. Concrete fix: either remove the unused commit-log artifact and wording, or pass commit messages explicitly and delimit them as untrusted context.

Tech Debt

  • No relevant mitigation was added in TODO.md, and none of the P0/P1 issues above are tracked there, so they remain unmitigated.

Security

Documentation/Tests

Path to Approval

  1. Add a pre-upload secret scan using the canonical /pre-merge-check patterns against the exact diff range to be sent to OpenAI, and abort or require explicit confirmation on matches.
  2. Fix comparison-ref selection so local review compares against the PR base/default branch rather than the current branch’s upstream name; add a regression test for a pushed feature branch tracking origin/<feature>.
  3. Remove the workflow contradiction between /ai-review-local and /pre-merge-check: either make /ai-review-local non-committing/read-only, or document/run /pre-merge-check first and extend it to inspect committed diff ranges.

igerber and others added 3 commits March 21, 2026 18:55
- P0: Add secret scan (canonical patterns) before API upload
- P1: Fix base branch detection using gh repo view instead of @{upstream}
- P1: Swap workflow order: /pre-merge-check before /ai-review-local
- P2: Fix visualization submodule path matching in _needed_sections()
- P3: Remove dead commit-log artifact, fix constraints section wording,
  remove misleading commit-message text from prompt adaptation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Step 7: show full findings summary grouped by severity before prompting
- Add data-transmission privacy note to skill docs
- Add maintenance comment on PREFIX_TO_SECTIONS mapping

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Add 29 unit tests for openai_review.py (registry extraction, path
  matching, prompt adaptation, compilation, mapping coverage)
- Validate prompt substitutions warn on source drift
- Document skill/script contract in module docstring
- Add maintenance comment on PREFIX_TO_SECTIONS mapping

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@igerber
Copy link
Owner Author

igerber commented Mar 21, 2026

/ai-review

@github-actions
Copy link

🔁 AI review rerun (requested by @igerber)

Head SHA: a15caf369aa758bb28417d4df4f664bef7c77733


Overall assessment: ⚠️ Needs changes

Executive summary

  • No estimator, weighting, variance/SE, identification, or default-behavior code changed here, so I found no methodology defects in the library itself.
  • The prior P0 about uploading an unscanned diff is addressed: Step 3b now adds a pre-upload secret scan in .claude/commands/ai-review-local.md:L104-L129.
  • The prior registry-selection gap for visualization files also looks fixed: _needed_sections() now maps diff_diff/visualization/*, with regression coverage in tests/test_openai_review.py:L65-L76 and tests/test_openai_review.py:L283-L292.
  • Severity P1: /ai-review-local still assumes the repo default branch exists as a local ref; on clones that only have origin/<default-branch>, the diff/scan path can fail or short-circuit instead of reviewing the branch.
  • Severity P1: the command still tells users to run /pre-merge-check after /ai-review-local, even though /ai-review-local may auto-commit first and /pre-merge-check only inspects working-tree changes.

Methodology
No findings. No estimator/math/SE/assumption changes were introduced. The methodology-review helper’s prior visualization omission appears fixed in .claude/scripts/openai_review.py:L85-L103.

Code Quality

  • Severity P1 — Incomplete comparison-ref resolution in .claude/commands/ai-review-local.md:L71-L114. Impact: the command records only the default branch name, then runs git diff "${base_branch}...HEAD" directly. Unlike the existing safer pattern in .claude/commands/push-pr-update.md:L63-L67, it never falls back to origin/<default-branch> or fetches the ref first. On shallow/single-branch clones, that can make diff generation fail; Step 3b suppresses diff errors, and the Python script treats an empty diff file as “No changes to review” in .claude/scripts/openai_review.py:L439-L442. Concrete fix: resolve a validated <comparison-ref> exactly as /push-pr-update does, reuse it for both diff generation and secret scanning, and fail on non-zero git diff instead of treating the result as an empty diff.

Performance
No findings.

Maintainability

Tech Debt
No findings. TODO.md does not track the P1 issues above, so they remain unmitigated.

Security
No separate unmitigated finding beyond the comparison-ref P1 above. The direct “upload diff without any pre-upload scan” blocker from the previous review is addressed by .claude/commands/ai-review-local.md:L104-L129.

Documentation/Tests

  • Severity P3 — The new tests cover the Python helper well, but not the command-level git/ref flow where the remaining failures live: tests/test_openai_review.py:L60-L307 versus .claude/commands/ai-review-local.md:L71-L114. Impact: the broken comparison-ref resolution and the post-review workflow regression can slip through without automated detection. Concrete fix: extract comparison-ref resolution and next-step selection into a testable helper, then add a regression case for a checkout with only origin/<default-branch> and a case where /ai-review-local has already auto-committed the work.

Path to Approval

  1. Replace base_branch with a validated <comparison-ref> (<default-branch> if present locally, else origin/<default-branch>, else fetch), use that ref for both diff generation and secret scanning, and error out on failed git diff instead of converting it into “No changes to review.”
  2. Remove the Step 7 recommendation to run /pre-merge-check after /ai-review-local, or extend /pre-merge-check so it can inspect the committed diff range created by Step 3 and wire /ai-review-local to call it that way.

- Use validated comparison-ref with fallback chain (local → origin →
  fetch) matching push-pr-update pattern, instead of bare branch name
- Remove /pre-merge-check from post-review next steps (it runs before
  /ai-review-local per CLAUDE.md workflow order)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@igerber
Copy link
Owner Author

igerber commented Mar 21, 2026

/ai-review

@github-actions
Copy link

🔁 AI review rerun (requested by @igerber)

Head SHA: 085720822930dbad146dfb5c76259099801f6c32


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review scope: both previously reported P1s appear addressed. .claude/commands/ai-review-local.md now resolves a validated comparison_ref and reuses it for both diff generation and pre-upload secret scanning (.claude/commands/ai-review-local.md:L71-L147).
  • The prior workflow regression is also fixed: post-review guidance no longer sends users back to /pre-merge-check, and the repo workflow now places /pre-merge-check before /ai-review-local (.claude/commands/ai-review-local.md:L217-L245, CLAUDE.md:L146-L150).
  • No estimator/math/weighting/variance/SE/identification/default-behavior code changed in diff_diff/, so I found no methodology defects. I spot-checked the helper’s section mapping against existing headings in docs/methodology/REGISTRY.md (.claude/scripts/openai_review.py:L45-L70, docs/methodology/REGISTRY.md:L33-L1829).
  • The new test file is well targeted for the Python helper: it covers registry mapping, prompt adaptation against the current reviewer prompt, and re-review prompt assembly (tests/test_openai_review.py:L60-L307).
  • One informational gap remains: the markdown-command orchestration itself still lacks direct regression coverage.

Methodology

  • No findings. This PR adds local-review tooling and tests only; it does not modify any estimator, weighting, variance/SE, identification assumption, or default behavior in diff_diff/.

Code Quality

  • No findings. The prior comparison-ref issue appears resolved by the new fallback/validation flow and consistent reuse of comparison_ref throughout the diff/scan steps (.claude/commands/ai-review-local.md:L71-L147).

Performance

  • No findings.

Maintainability

  • No findings. The prior /pre-merge-check handoff regression appears resolved: the local-review command now routes clean and non-blocking outcomes forward instead of back into the working-tree-only check (.claude/commands/ai-review-local.md:L217-L245, CLAUDE.md:L146-L150).

Tech Debt

  • No findings.

Security

  • No findings. The previous unscanned-upload concern is addressed by the added pre-upload secret scan in .claude/commands/ai-review-local.md:L122-L147.

Documentation/Tests

  • Severity P3 — The added tests cover .claude/scripts/openai_review.py, but not the command-layer git/ref orchestration in .claude/commands/ai-review-local.md:L71-L147. Impact: future regressions in comparison-ref resolution, fetch-failure handling, or secret-scan sequencing can slip through without automated detection. Concrete fix: extract the comparison-ref/diff-range logic into a small helper or add command-integration coverage for at least the origin/<default_branch>-only and failed-fetch-abort paths.
  • Validation note: I could not execute tests/test_openai_review.py locally because pytest is not installed in this environment.

CI installs the package via pip into a temp directory, so
.claude/scripts/openai_review.py isn't at the expected path.
Tests now discover the script via git repo root fallback and
skip the entire module when the script is not found.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@igerber igerber merged commit 2f3e2a9 into main Mar 22, 2026
13 checks passed
@igerber igerber deleted the pr-improvement branch March 22, 2026 01:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant