Static leaderboard for completed PromptSecurityEval experiments.
From repo root:
python leaderboard_site/scripts/build_leaderboard_data.py \
--input-dir experiments/placeholders \
--output leaderboard_site/data/leaderboard.jsonFor static hosting (GitHub Pages), bundle completed run payload JSONs too:
python leaderboard_site/scripts/build_leaderboard_data.py \
--input-dir experiments/placeholders \
--output leaderboard_site/data/leaderboard.json \
--bundle-runs-dir leaderboard_site/data/runsFrom repo root:
python -m http.server 8080Open:
http://localhost:8080/leaderboard_site/
The page includes a live traffic row (Total / This Page / Last 30 Days).
Configure in leaderboard_site/index.html:
<body data-goatcounter-code="your-goatcounter-code">Examples:
data-goatcounter-code="promptsecurityeval"data-goatcounter-code="https://promptsecurityeval.goatcounter.com"
Notes:
- The site auto-loads GoatCounter
count.jsand refreshes counters every 30s. - If counters fail, check GoatCounter site settings for visitor count visibility/API access.
Workflow file:
/.github/workflows/deploy_leaderboard_pages.yml
This workflow builds from the private main repo and publishes the static site to a separate public repo branch.
- Create a public repo (example:
your-org/promptsecurityeval-leaderboard). - In that public repo, enable GitHub Pages:
- Settings -> Pages
- Source:
Deploy from a branch - Branch:
gh-pages(or the branch you choose), folder:/ (root)
In Settings -> Secrets and variables -> Actions:
- Add repository variable:
LEADERBOARD_PUBLIC_REPO=owner/repoof the public target repo.
- Optional repository variable:
LEADERBOARD_PUBLIC_BRANCH= publish branch (defaultgh-pages).
- Add repository secret:
LEADERBOARD_PUBLISH_TOKEN= PAT that can push to the target public repo.
PAT recommendation:
- Fine-grained PAT with
Contents: Read and writeon the target public repo only. - Classic PAT with
reposcope also works, but is broader.
- Push to
main(auto trigger), or - GitHub -> Actions ->
Publish Leaderboard To Public Repo->Run workflow.
The workflow rebuilds leaderboard_site/ and force-publishes it as an orphan commit to the target branch.
- Reads only completed-like runs (
success,completed,complete). ASRis derived from sample-level judger outputs:0means safe.1means unsafe.- multi-judger dict/list values are averaged to
[0,1].
- Matrix prefers
no_defenseruns; if missing, it falls back to all-defense averages. - With
--bundle-runs-dir, run payload files are copied and run paths are rewritten todata/runs/*.json. - Publishing this site means run payload data is publicly accessible in the target repo Pages site.