Reusable GitHub workflow for benchmarking Rust PRs with iai-callgrind, criterion, or both (backend: all), and posting base-vs-head reports.
- Runs configured benchmark targets for a matrix of feature sets.
- Supports `iai-callgrind`, `criterion`, or `all` via the `backend` input.
- Compares `head` (`github.sha`) against PR base (`pull_request.base.sha`) in the same matrix job.
- Publishes a sticky PR comment with grouped markdown tables and per-benchmark metric breakdowns.
- Optionally fails CI when regressions exceed a threshold.
Use:
terjekv/github-action-iai-callgrind/.github/workflows/rust-pr-bench.yml@v2
Compatibility path (deprecated but supported):
terjekv/github-action-iai-callgrind/.github/workflows/iai-callgrind-pr-bench.yml@v2
Most consumers will use one of these patterns at a time.
Best when you want one consolidated PR comment and one benchmark status check.
```yaml
name: PR Bench

on:
  pull_request:

jobs:
  bench:
    uses: terjekv/github-action-iai-callgrind/.github/workflows/rust-pr-bench.yml@v2
    with:
      backend: all
      auto_discover: true
      criterion_statistic: median
      feature_sets_json: >-
        [
          {"name":"default","features":""},
          {"name":"simd","features":"simd"}
        ]
      regression_threshold_pct_iai_callgrind: 3
      regression_threshold_pct_criterion: 10
      fail_on_regression: true
```

Best when wall-clock benchmarking is what you care about and you want to tune Criterion's CLI settings.
```yaml
name: Criterion Bench

on:
  pull_request:

jobs:
  bench:
    uses: terjekv/github-action-iai-callgrind/.github/workflows/rust-pr-bench.yml@v2
    with:
      backend: criterion
      auto_discover: true
      criterion_cli_args: "--noplot --sample-size 80 --measurement-time 6"
      criterion_statistic: median
      regression_threshold_pct_criterion: 10
      fail_on_regression: true
```

Best when you want stricter deterministic gating on callgrind event counts.
```yaml
name: Callgrind Bench

on:
  pull_request:

jobs:
  bench:
    uses: terjekv/github-action-iai-callgrind/.github/workflows/rust-pr-bench.yml@v2
    with:
      backend: iai-callgrind
      auto_discover: true
      feature_sets_json: >-
        [
          {"name":"default","features":""},
          {"name":"simd","features":"simd"}
        ]
      regression_threshold_pct_iai_callgrind: 3
      fail_on_regression: true
```

Best when autodiscovery is not enough, or when each backend should target a different bench binary.
```yaml
name: Explicit Bench Setup

on:
  pull_request:

jobs:
  bench:
    uses: terjekv/github-action-iai-callgrind/.github/workflows/rust-pr-bench.yml@v2
    with:
      backend: all
      working_directory: crates/engine
      benchmarks_json: >-
        [
          {"name":"parser_callgrind","bench":"parser_callgrind","backend":"iai-callgrind"},
          {"name":"parser_criterion","bench":"parser_criterion","backend":"criterion","criterion_args":"--noplot --sample-size 50"}
        ]
      feature_sets_json: >-
        [
          {"name":"default","features":""},
          {"name":"serde","features":"serde"}
        ]
      regression_threshold_pct_criterion: 10
```

- `backend` (`iai-callgrind|criterion|all`, default `iai-callgrind`)
  - Selects benchmark backend(s) and reporting mode.
- `benchmarks_json` (string, default `[]`)
  - JSON array of benchmark specs.
  - A string entry is a bench target name, e.g. `"parser_bench"`.
  - An object entry supports:
    - `name`: display name
    - `bench`: cargo bench target name (for `cargo bench --bench ...` mode)
    - `command`: full command override
    - `manifest_path`, `package`, `args`: optional command helpers
    - `backend`: optional (`iai-callgrind` or `criterion`) to include the spec for only one backend
    - `criterion_args`: optional Criterion bench-binary args for this benchmark
- `auto_discover` (boolean, default `true`)
  - When `benchmarks_json` is empty, discovers benchmarks from `benches/*.rs`.
  - Name-based backend routing for discovery:
    - contains `criterion` (and not `callgrind`) => Criterion only
    - contains `callgrind` (and not `criterion`) => IAI-Callgrind only
    - otherwise => included for both backends
- `feature_sets_json` (string)
  - JSON array of feature-set objects: `name`, `features`, `no_default_features`.
- `working_directory` (string, default `.`)
- `toolchain` (string, default `stable`)
- `cargo_args` (string, appended to all commands)
- `criterion_cli_args` (string, default `--noplot`)
  - Added after `--` for default Criterion commands.
  - This action does not override Criterion's sampling defaults unless you pass additional CLI args.
- `criterion_statistic` (`mean|median`, default `mean`)
  - Statistic used for Criterion base-vs-head comparison deltas.
- `base_sha` (string, optional override)
- `regression_threshold_pct` (number, default `3`)
- `regression_threshold_pct_iai_callgrind` (number, default `-1`)
  - Optional backend-specific threshold override for `iai-callgrind`.
  - `-1` means "use `regression_threshold_pct`".
- `regression_threshold_pct_criterion` (number, default `-1`)
  - Optional backend-specific threshold override for `criterion`.
  - `-1` means "use `regression_threshold_pct`".
- `fail_on_regression` (boolean, default `false`)
- `comment_mode` (`always|on-regression|never`, default `always`)
- `action_repository` (string, default `terjekv/github-action-iai-callgrind`)
  - Repository containing this reusable workflow and its scripts.
- `action_ref` (string, default empty)
  - Ref (sha/tag/branch) for `action_repository`. Required when `action_repository` is not the default.
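The way the three threshold inputs interact can be sketched in a few lines. This is a minimal illustration, not the workflow's actual script: the function names are hypothetical, and the percent-change formula (head relative to base) is an assumption about how deltas are computed.

```python
def effective_threshold(general_pct: float, backend_pct: float = -1) -> float:
    """Return the backend override, or the general threshold when the override is -1."""
    return general_pct if backend_pct == -1 else backend_pct


def pct_change(base: float, head: float) -> float:
    """Percent change of head relative to base (assumed formula)."""
    if base == 0:
        raise ValueError("base value must be non-zero")
    return (head - base) / base * 100.0


def is_regression(base: float, head: float, threshold_pct: float) -> bool:
    """Lower is better, so a regression is an increase that exceeds the threshold."""
    return pct_change(base, head) > threshold_pct


# regression_threshold_pct: 3, regression_threshold_pct_criterion: 10
criterion_threshold = effective_threshold(3, 10)
# regression_threshold_pct_iai_callgrind left at its default of -1
callgrind_threshold = effective_threshold(3)
```

With these values, a 5% increase would count as a regression for `iai-callgrind` (threshold 3) but not for `criterion` (threshold 10).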
By default, benchmarks are expected in Rust's standard `benches/` folder.
You can override this by either:
- Setting `working_directory` for workspace/member layouts.
- Providing explicit `benchmarks_json` entries.
- Using `command` in a benchmark spec for custom invocation.
When `benchmarks_json` is empty and `auto_discover: true`, the workflow scans `benches/*.rs`.
Backend routing is based on the benchmark filename:
- contains `criterion` and not `callgrind` => Criterion only
- contains `callgrind` and not `criterion` => IAI-Callgrind only
- contains neither => included for both backends

Examples:
- `parser_callgrind.rs` => IAI-Callgrind only
- `parser_criterion.rs` => Criterion only
- `parser.rs` => both backends

This lets a repo keep both benchmark styles in one `benches/` directory while still routing them predictably.
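The routing rules above amount to two substring checks on the filename. The sketch below is illustrative only: `route_backends` is a hypothetical name, and case-sensitive matching on the file stem is an assumption. A name containing both markers falls through to "both backends", following the "otherwise" wording in the input reference.

```python
from pathlib import Path

BACKENDS = ("iai-callgrind", "criterion")


def route_backends(bench_file: str) -> tuple:
    """Map a benches/*.rs filename to the backend(s) it would run under."""
    stem = Path(bench_file).stem
    has_criterion = "criterion" in stem
    has_callgrind = "callgrind" in stem
    if has_criterion and not has_callgrind:
        return ("criterion",)
    if has_callgrind and not has_criterion:
        return ("iai-callgrind",)
    # Neither marker is present (or both are): run under every backend.
    return BACKENDS


for name in ("parser_callgrind.rs", "parser_criterion.rs", "parser.rs"):
    print(name, "=>", route_backends(f"benches/{name}"))
```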
Use explicit `benchmarks_json` instead of autodiscovery when:
- the bench target names should not follow the filename convention
- different backends need different command lines
- benchmarks live outside the default `benches/` layout
- only a subset of benches should run in CI
- Lower values are treated as better for both backends:
  - `iai-callgrind`: callgrind summary event counts
  - `criterion`: selected estimate statistic (`mean` or `median`, unit `ns`)
- With `backend: all`, the workflow posts a single consolidated PR comment with one section per backend.
- The workflow installs `valgrind` and `iai-callgrind-runner` only for `iai-callgrind`.
- Markdown layout is template-driven for easier iteration:
  - `scripts/templates/report_single.md.tmpl`
  - `scripts/templates/report_single_summary.md.tmpl`
  - `scripts/templates/report_single_history.md.tmpl`
  - `scripts/templates/report_combined.md.tmpl`
  - `scripts/templates/report_combined_backend_section.md.tmpl`
- Benchmark command overrides can use placeholders:
  - `{features}`
  - `{no_default_features_flag}`
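As a rough illustration of the placeholders, the sketch below substitutes them into a command template for one feature set. The expansion rules are assumptions, not confirmed behavior: `{features}` is taken to be the feature set's `features` string, and `{no_default_features_flag}` is taken to be cargo's `--no-default-features` flag or an empty string. `expand_command` and the template are hypothetical.

```python
def expand_command(template: str, feature_set: dict) -> str:
    """Substitute the two documented placeholders into a command template."""
    # Assumed expansion: the flag appears only when no_default_features is set.
    flag = "--no-default-features" if feature_set.get("no_default_features") else ""
    command = template.replace("{features}", feature_set.get("features", ""))
    command = command.replace("{no_default_features_flag}", flag)
    # Collapse the extra space left behind when the flag is empty.
    return " ".join(command.split())


template = "cargo bench --bench parser_bench {no_default_features_flag} --features {features}"
print(expand_command(template, {"name": "simd", "features": "simd"}))
print(expand_command(template, {"name": "serde", "features": "serde", "no_default_features": True}))
```

Whitespace is collapsed naively here, so a template that relies on quoted arguments containing repeated spaces would need more careful handling.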
By default, this action passes only `--noplot` to Criterion.
That means Criterion's own defaults still apply unless you override them:
- sample size: `100`
- warm-up time: `3s`
- measurement time: `5s`
- noise threshold: `1%`
On shared CI runners, seeing about 1-3% variation on unchanged code is not unusual. If you see that level of noise, treat Criterion as a higher-variance signal than iai-callgrind and tune it explicitly.
Recommended ways to reduce noise:
- Prefer `criterion_statistic: median` over `mean` for PR comparisons.
- Increase `--sample-size` and `--measurement-time`.
- Raise `regression_threshold_pct_criterion` above your observed noise floor.
- Use explicit `benchmarks_json` entries with per-benchmark `criterion_args` if only some benches are noisy.
- Prefer dedicated or less contended runners if you want tighter regression gates.
Example tuned Criterion setup:
```yaml
jobs:
  bench:
    uses: terjekv/github-action-iai-callgrind/.github/workflows/rust-pr-bench.yml@v2
    with:
      backend: criterion
      auto_discover: true
      criterion_cli_args: "--noplot --sample-size 120 --measurement-time 8"
      criterion_statistic: median
      regression_threshold_pct_criterion: 5
      fail_on_regression: true
```

Per-benchmark overrides are also supported:
```yaml
benchmarks_json: >-
  [
    {
      "name":"parser_criterion",
      "bench":"parser_criterion",
      "backend":"criterion",
      "criterion_args":"--noplot --sample-size 200 --measurement-time 15"
    }
  ]
```

This repository includes a sample Rust project at `examples/sample-rust-app`.
- It has an `iai-callgrind` benchmark target: `sample_callgrind_bench`.
- It has a `criterion` benchmark target: `sample_criterion_bench`.
- It defines two feature sets: `default` and `alt-impl`.
- The workflow `.github/workflows/sample-self-test.yml` runs fast script/unit tests first, then validates explicit callgrind, explicit criterion, combined `backend: all`, and autodiscovery modes against the sample fixture on pull requests.