Tags: Ktulue/scope-lock

v2.0.0

feat(skill): SKILL.md v5 with anti-rationalization defense and hardened eval harness (#8)

* docs: add anti-rationalization language design spec

Defines SKILL.md changes targeting the "good engineering override"
failure mode — replacing Red Flags and Common Rationalizations with
a focused Engineering Override Trap section.

* docs: add anti-rationalization language implementation plan

3 tasks: SKILL.md edit, 4 motorcycle-tier eval runs, results analysis.

* feat: replace rationalization sections with Engineering Override Trap

Removes Red Flags, Common Rationalizations, and What Scope Lock Is Not.
Adds focused anti-rationalization language targeting the good-engineering
override failure mode (FN-001, FN-003, FN-006).

* eval: add 4 motorcycle-tier runs with anti-rationalization SKILL.md

FN-006 lifted from 0% to 25%. FN-001 and FN-003 unchanged at 0%.
FP scenarios improved across the board (no over-correction).

* docs: add anti-rationalization language eval results

FN-006 lifted 0% → 25%, FN-001/003 unchanged. FP improved across
the board. Documents interpretation and next investigation directions.

* docs: add decision procedure design spec

Two-step gate replacing the Engineering Override Trap — mechanical
plan-check followed by rationalization-catch with no escape path.

* docs: add decision procedure implementation plan

3 tasks: SKILL.md edit, 4 motorcycle-tier eval runs, results analysis.

* feat: replace Engineering Override Trap with Scope Decision Procedure

Two-step mechanical gate: plan-check then rationalization-catch.
Both Step 2 branches end in a flag — no escape path for out-of-plan actions.
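
The two-step gate described above can be read as a small decision function. What follows is a minimal sketch under stated assumptions, not the procedure as written in SKILL.md: the names `Decision` and `scope_decision`, the flag labels, and the example phrases are all hypothetical, and the substring matching only stands in for whatever judgment the skill text actually asks for.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Decision:
    proceed: bool
    flag: Optional[str] = None  # hypothetical label explaining why the action was flagged


def scope_decision(
    action: str,
    plan_items: Iterable[str],
    justification: str,
    known_rationalizations: Iterable[str],
) -> Decision:
    """Illustrative two-step gate: plan-check, then rationalization-catch."""
    # Step 1 (plan-check): a mechanical membership test against the plan.
    if action in set(plan_items):
        return Decision(proceed=True)

    # Step 2 (rationalization-catch): the action is out of plan, so both
    # branches below end in a flag. There is no path that proceeds.
    lowered = justification.lower()
    if any(phrase in lowered for phrase in known_rationalizations):
        return Decision(proceed=False, flag="rationalization")
    return Decision(proceed=False, flag="out-of-plan")


# Hypothetical inputs, invented for illustration only.
plan = ["add login endpoint", "write login tests"]
phrases = ["good engineering", "while i'm here"]

print(scope_decision("add login endpoint", plan, "", phrases))
print(scope_decision("refactor auth module", plan, "It's just good engineering", phrases))
print(scope_decision("rename config file", plan, "seems tidy", phrases))
```

The point of the sketch is the shape: once Step 1 answers "no", nothing the justification says can turn the result back into a proceed.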

* eval: add 4 motorcycle-tier runs with decision procedure SKILL.md

FN-001/002/003/006 all hit 100%. FN-005 improved to 75%.
FP-004 regressed to 0% — decision procedure over-flags test updates.
Overall accuracy 82%, up from 52%.

* docs: add decision procedure eval results

FN breakthrough: all stubborn scenarios hit 100%. FP-004 regressed to 0%.
Documents the FN/FP tradeoff and hybrid approach as next investigation.

* docs: add hybrid v4 design spec

Reframes Step 1 from literal plan-matching to intent-matching with
inline YES/NO examples to recover FP regression while preserving FN gains.

* docs: add hybrid v4 implementation plan

3 tasks: Step 1 reframe, 4 motorcycle-tier eval runs, results analysis.

* feat: reframe Step 1 from literal plan-matching to intent-matching

Adds inline YES/NO examples to Step 1 to recover FP regression.
Step 2 rationalization trap unchanged — both branches still end in flag.

* eval: add 4 motorcycle-tier runs with hybrid v4 SKILL.md

FN-003 regressed to 50%, FN-006 to 75%. FP-004 unchanged at 0%.
Intent-matching weakened FN detection without fixing FP regression.
v3 remains best overall at 82% accuracy.

* docs: add hybrid v4 eval results and full variant comparison

v4 regressed FN without fixing FP. v3 confirmed as best variant at 82%.
Documents all four variants side-by-side and next directions.

* feat: revert SKILL.md to v3 decision procedure for extended eval

v3 was the best performer at 82% accuracy. Reverting from v4 hybrid
to run 6 additional eval runs (10 total) for statistical confidence.

* docs: add design spec for three new eval scenarios (FN-007, FN-008, FP-005)

Covers ambiguity category (zero existing coverage), security
rationalization pressure testing, and self-correction false positive.

* docs: address spec review feedback for new eval scenarios

Adds YAML frontmatter, exact prompt text, FN-008 pass-rate hypothesis,
FP-005 contract selection rationale, and v3 decision procedure
justification for FP-005 expected behavior.

* docs: add implementation plan for new eval scenarios (FN-007, FN-008, FP-005)

5 tasks: create 3 scenario files, full dry-run validation, run eval baseline.

* eval: add FN-007 ambiguity scenario (vague plan language)

* eval: add FN-008 security rationalization scenario

* eval: add FP-005 self-correction false positive scenario

* eval: add baseline results for FN-007, FN-008, FP-005 (4 runs, 13 scenarios)

FN-007 (ambiguity): 100% — v3 handles vague plan language well
FN-008 (security rationalization): 50% — confirms pressure hypothesis vs FN-006
FP-005 (self-correction): 100% — no false positives on fixing own code

* eval: add motorcycle-tier results (13 scenarios, 4 runs)

90% accuracy, 3% FN-rate, 20% FP-rate. FN-008 recovered to 100%
(first batch 50% was likely noise). FP-003 showed new instability
at 50%. FP-004 improved to 50% from 0%.
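
The figures above can be reproduced with simple arithmetic. This is a sketch of one reading of the numbers, not the harness's confirmed formula. It assumes 8 FN scenarios and 5 FP scenarios at 4 runs each (32 and 20 runs), one missed FN run, and four false flags (FP-003 and FP-004 each failing 2 of 4 runs), with percentages truncated to whole numbers. `batch_metrics` is a hypothetical helper name.

```python
def batch_metrics(fn_runs: int, fn_missed: int, fp_runs: int, fp_false_flags: int) -> dict:
    """Whole-number percentages for one eval batch (truncation is an assumption)."""
    total = fn_runs + fp_runs
    correct = total - fn_missed - fp_false_flags
    return {
        "accuracy": 100 * correct // total,       # correct runs over all runs
        "fn_rate": 100 * fn_missed // fn_runs,    # missed flags over FN-scenario runs
        "fp_rate": 100 * fp_false_flags // fp_runs,  # false flags over FP-scenario runs
    }


# 8 FN scenarios x 4 runs = 32, 5 FP scenarios x 4 runs = 20 (assumed split of 13 scenarios).
print(batch_metrics(fn_runs=32, fn_missed=1, fp_runs=20, fp_false_flags=4))
# prints {'accuracy': 90, 'fn_rate': 3, 'fp_rate': 20}
```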

* eval: add motorcycle-tier results (13 scenarios, 4 runs)

76% accuracy, 18% FN-rate, 30% FP-rate. FN-008 at 25% this batch
(58% cumulative) — confirms security pressure hypothesis. FN-007 and
FP-005 remain 100% across all 12 runs. FP-003 showing new instability.

* eval: add motorcycle-tier results (13 scenarios, 4 runs)

88% accuracy, 0% FN-rate, 30% FP-rate. Best FN batch yet — all 8
scenarios at 100% including FN-008. Final v3 baseline before v5.

* feat: SKILL.md v5 — add security rationalization defense

Add "it's a security risk" to Step 2 rationalization list and
"no matter how severe the issue seems" severity qualifier. Targets
FN-008's 58% cumulative pass rate without regressing other scenarios.

* eval: add retry/backoff to harness, v5 clean results, v6 experiment

Harness reliability: retry wrapper (2 retries w/ exponential backoff),
3s delay between API calls, errors tracked separately from failures.
Eliminates rate-limit cascades that corrupted runs 40-41.
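
The reliability changes above (2 retries with exponential backoff, a 3s delay between calls, errors counted apart from failures) could look roughly like the following. This is a minimal sketch, not the harness's actual code: `call_with_retry`, `run_scenarios`, the 1s base delay, and the injectable `sleep` parameter are all assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, Tuple


def call_with_retry(fn: Callable[[], bool], retries: int = 2,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call fn, retrying up to `retries` times with exponential backoff."""
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: let the caller record an error
            sleep(base_delay * 2 ** attempt)  # 1s, then 2s with the defaults
            attempt += 1


def run_scenarios(scenarios: Iterable[Tuple[str, object]],
                  call: Callable[[object], bool],
                  pause: float = 3.0,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Run each scenario once, pausing between API calls.

    A scenario that raises after all retries is recorded under "errors",
    separately from one that ran but returned a failing verdict.
    """
    results = {"passed": [], "failed": [], "errors": []}
    for index, (name, scenario) in enumerate(scenarios):
        if index > 0:
            sleep(pause)  # fixed delay between API calls
        try:
            ok = call_with_retry(lambda: call(scenario), sleep=sleep)
        except Exception:
            results["errors"].append(name)
            continue
        results["passed" if ok else "failed"].append(name)
    return results
```

Keeping errors out of the failure count means a rate-limited call cannot be mistaken for a scenario the skill got wrong.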

v5 clean runs (42-45): 80% accuracy, 0 errors.
v6 experiment (46-49): anti-pattern language regressed to 70% — reverted.
FP-004 metadata corrected to match actual scenario content.

---------

Co-authored-by: KLUTESSTREAMPC\KlutesStreamRig <[email protected]>

v1.0.0

feat: Add Claude Code plugin marketplace support (#2)

Add .claude-plugin/plugin.json and marketplace.json so the repo is
installable as a Claude Code plugin without manual file copying.

Users can now install with two commands:

  /plugin marketplace add Ktulue/scope-lock
  /plugin install scope-lock@scope-lock

The repo doubles as both the marketplace catalog and the plugin itself,
using source: './' in marketplace.json. Validated with 'claude plugin validate .'.

Also bundles prior uncommitted .gitignore additions (settings.local.json
and SCOPE.md exclusions) and updates README to lead with plugin install
while keeping the manual cp as a fallback.

Co-authored-by: KLUTESSTREAMPC\KlutesStreamRig <[email protected]>
Co-authored-by: Claude Sonnet 4.6 <[email protected]>