- uv installed
- Python 3.12+
- claude CLI available on PATH
cd cc-experiment-runner
uv sync

# Via the installed console script
cc-run-experiments <directory> <prefix> [baseline-branch]
# Or via uv run
uv run cc-run-experiments <directory> <prefix> [baseline-branch]
# Or as a module
uv run python -m cc_experiment_runner <directory> <prefix> [baseline-branch]

Examples:
# Run with plugin enabled (default)
cc-run-experiments ../byopl24-02 2025-01-30--perf main
# Run without plugin
cc-run-experiments --no-plugin ../byopl24-02 2025-01-30--perf main

- directory: Path to the project directory where Claude and git operations run
- prefix: Unique identifier for this analysis run (used in branch names)
- baseline-branch: (Optional) Git branch to use as baseline (default: main)
- --no-plugin: Run analysis without the plugin (default: plugin enabled)
- Creates isolated branches for each iteration: <prefix>-run-<N>-iteration-<M>
- Runs Claude autonomously with your prompt for multiple iterations
- Each iteration builds upon the previous one's improvements
- Commits changes automatically with descriptive messages
- Runs benchmarks at the end of each run
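
The branch naming scheme above can be sketched as a small helper. The function name `iteration_branch` is hypothetical and not taken from the project's source; it only shows how the prefix, run number, and iteration number combine, assuming both numbers are 1-based.

```python
def iteration_branch(prefix: str, run: int, iteration: int) -> str:
    """Build a branch name of the form <prefix>-run-<N>-iteration-<M>.

    Hypothetical helper for illustration; the real runner may build
    branch names differently.
    """
    if run < 1 or iteration < 1:
        raise ValueError("run and iteration are assumed to be 1-based")
    return f"{prefix}-run-{run}-iteration-{iteration}"


print(iteration_branch("2025-01-30--perf", 1, 3))
# prints: 2025-01-30--perf-run-1-iteration-3
```

Because the prefix is part of every branch name, a dated and descriptive prefix keeps branches from different analysis runs easy to tell apart.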
flowchart TD
Start([cc-run-experiments]) --> RunLoop[/"For each run (1..N)"/]
RunLoop --> IterLoop[/"For each iteration (1..M)"/]
IterLoop --> CheckTime{"Enough time<br>remaining?"}
CheckTime -- No --> RunBenchmarks
CheckTime -- Yes --> RunClaude[Run Claude with<br>iteration prompt]
RunClaude --> ExitCode{Exit code?}
ExitCode -- "0 (success)" --> Commit[Commit changes]
Commit --> NextIter{"More<br>iterations?"}
ExitCode -- "2 (rate limit)" --> Stop([Stop])
ExitCode -- "124 (timeout)" --> RetryCheck{"Retries<br>exhausted?"}
RetryCheck -- No --> Revert[Revert changes]
Revert --> RunClaude
RetryCheck -- Yes --> Stop
ExitCode -- "other error" --> ConsecCheck{"Too many consecutive<br>failures?"}
ConsecCheck -- Yes --> Stop
ConsecCheck -- No --> SkipIter[Skip iteration]
SkipIter --> NextIter
NextIter -- Yes --> IterLoop
NextIter -- No --> RunBenchmarks[Run benchmarks]
RunBenchmarks --> NextRun{More runs?}
NextRun -- Yes --> RunLoop
NextRun -- No --> Done([Done])
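
The exit-code branch of the flowchart can be read as a small decision function. This is a minimal sketch, not the runner's actual code: the names `Action` and `next_action`, the parameter names, and the convention that `consecutive_failures` already includes the current failure are all assumptions. The default threshold of 2 follows the error-handling notes in this README.

```python
from enum import Enum


class Action(Enum):
    COMMIT = "commit changes"
    REVERT_AND_RETRY = "revert changes and retry"
    SKIP_ITERATION = "skip iteration"
    STOP = "stop"


# Exit codes as named in the flowchart.
EXIT_SUCCESS = 0
EXIT_RATE_LIMIT = 2
EXIT_TIMEOUT = 124


def next_action(
    exit_code: int,
    retries_left: int,
    consecutive_failures: int,
    max_consecutive_failures: int = 2,
) -> Action:
    """Map a Claude exit code to the next step shown in the flowchart."""
    if exit_code == EXIT_SUCCESS:
        return Action.COMMIT
    if exit_code == EXIT_RATE_LIMIT:
        return Action.STOP
    if exit_code == EXIT_TIMEOUT:
        # Retry from a clean state until the retry budget is used up.
        return Action.REVERT_AND_RETRY if retries_left > 0 else Action.STOP
    # Any other error: give up only if failures keep repeating.
    if consecutive_failures >= max_consecutive_failures:
        return Action.STOP
    return Action.SKIP_ITERATION


print(next_action(124, retries_left=1, consecutive_failures=0).value)
# prints: revert changes and retry
```

Keeping the decision separate from the code that actually runs git or Claude makes each branch of the flowchart easy to check in isolation.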
Default settings in src/cc_experiment_runner/config.py:
- Iterations per run: 10
- Total runs: 5
- Timeout per run: 2 hours
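
To get a feel for what these defaults imply, the sketch below restates them as constants and derives two rough upper bounds. The constant names are illustrative and may not match those in config.py; the wall-clock figure assumes the timeout bounds each whole run.

```python
from datetime import timedelta

# Values from the defaults listed above; names are hypothetical.
ITERATIONS_PER_RUN = 10
TOTAL_RUNS = 5
RUN_TIMEOUT = timedelta(hours=2)

# One branch per iteration, so a full experiment creates at most this many.
max_branches = TOTAL_RUNS * ITERATIONS_PER_RUN

# Rough upper bound on total time if every run uses its full timeout.
max_wall_clock = TOTAL_RUNS * RUN_TIMEOUT

print(max_branches)    # prints: 50
print(max_wall_clock)  # prints: 10:00:00
```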
- CC_PLUGIN_DIR: Path to the Truffle performance plugin directory. Required when running with the plugin enabled (the default). Example:

export CC_PLUGIN_DIR=~/Projects/cc-truffle-performance-plugin
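
A check like the following could catch a missing or mistyped CC_PLUGIN_DIR before any run starts. It is a sketch under assumptions: the function name `resolve_plugin_dir` and its error messages are hypothetical, and the runner's real validation may differ.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_plugin_dir(use_plugin: bool = True) -> Optional[Path]:
    """Return the plugin directory, or None when the plugin is disabled.

    Hypothetical helper: reads CC_PLUGIN_DIR, expands a leading "~",
    and fails early if the directory does not exist.
    """
    if not use_plugin:
        return None
    raw = os.environ.get("CC_PLUGIN_DIR")
    if not raw:
        raise RuntimeError("CC_PLUGIN_DIR must be set unless --no-plugin is used")
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"CC_PLUGIN_DIR is not a directory: {path}")
    return path
```

Failing before the first iteration avoids creating branches for a run that cannot use the plugin.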
The runner invokes claude with the --dangerously-skip-permissions flag. This is required because the experiment runner operates autonomously across multiple iterations without user interaction. The flag allows Claude to execute tools (file edits, shell commands, git operations) without prompting for confirmation at each step. Only use this tool in sandboxed or trusted environments, as it gives Claude unrestricted access to the project directory.
- Branches: One per iteration (<prefix>-run-<N>-iteration-<M>)
- Benchmark results: CSV files in benchmark-results/
- Rate limit: Exits immediately
- 2 consecutive failures: Assumes persistent issue and stops
- Timeout: Commits partial changes and moves to next run
- Single error: Skips iteration and continues
cc-experiment-runner/
├── pyproject.toml
├── README.md
└── src/
    └── cc_experiment_runner/
        ├── __init__.py
        ├── __main__.py      # python -m entry point
        ├── cli.py           # argument parsing and main orchestration
        ├── config.py        # configuration constants
        ├── git.py           # git helper functions
        ├── process.py       # process termination utilities
        ├── claude.py        # Claude CLI invocation and error detection
        └── benchmarks.py    # benchmark runner
- Use a descriptive prefix with date: 2025-01-30--feature-name
- Keep prompt files focused on specific optimization goals
- The script creates many branches - clean up old ones periodically
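
For the periodic cleanup, a script along these lines could find and delete the branches of one analysis run. It is a sketch, not part of the project: the function names are hypothetical, it matches only names of the form <prefix>-run-<N>-iteration-<M>, and it defaults to a dry run so nothing is deleted by accident.

```python
import re
import subprocess


def runner_branches(branch_names, prefix):
    """Keep only names that match <prefix>-run-<N>-iteration-<M>."""
    pattern = re.compile(rf"^{re.escape(prefix)}-run-\d+-iteration-\d+$")
    return [name for name in branch_names if pattern.match(name)]


def delete_runner_branches(repo_dir, prefix, dry_run=True):
    """List local branches in repo_dir and delete those created for prefix.

    With dry_run=True (the default) the matches are only printed.
    """
    listed = subprocess.run(
        ["git", "branch", "--format=%(refname:short)"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout.splitlines()
    targets = runner_branches(listed, prefix)
    for name in targets:
        if dry_run:
            print(f"would delete {name}")
        else:
            # -D force-deletes unmerged branches; git refuses to delete
            # the currently checked-out branch, which raises here.
            subprocess.run(["git", "branch", "-D", name], cwd=repo_dir, check=True)
    return targets
```

For example, `delete_runner_branches("../byopl24-02", "2025-01-30--perf")` would print the branches that match, and passing `dry_run=False` would remove them.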