soma-evals

Schema-ablation evals for SOMA, measuring how progressively richer LinkML schema context improves LLM-based structured extraction from scientific literature.

Prerequisites

  • uv (Python package manager)
  • just (task runner)
  • Python 3.12+

Setup

git clone https://github.com/EHS-Data-Standards/soma-evals.git
cd soma-evals
just setup

API keys

Set keys via the llm key store or environment variables. Use whichever method you prefer — you only need keys for the providers whose models you plan to run.

Option A — key store (recommended):

uv run llm keys set openai       # paste your OpenAI key
uv run llm keys set anthropic    # paste your Anthropic key
uv run llm keys set gemini       # paste your Gemini key
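
To check which keys are already stored, the llm CLI provides a keys list subcommand:

uv run llm keys list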

Option B — .env file:

cp .env.example .env

Then edit .env:

OPENAI_API_KEY=sk-your-key-here
ANTHROPIC_API_KEY=sk-ant-your-key-here
GEMINI_API_KEY=AIza-your-key-here

CBORG users (LBNL staff): Models prefixed with cborg/ route through the CBORG proxy and are free for lab staff. Authentication is handled by CBORG — no extra API key is needed beyond your CBORG access.
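
As a quick smoke test of the proxy route (a sketch, assuming the project registers CBORG models with the llm CLI; the model ID below is hypothetical, so substitute one from just list-models):

uv run llm -m cborg/openai/gpt-4o "Reply with OK"

If the model replies, routing through the CBORG proxy is working.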

Running evals

just list-models        # show available models & tiers
just run-all            # run all four ablation levels (standard tier)
just run-baseline       # run a single level

Run a specific tier or override the default paper:

just run-all cheap
EVAL_PDF=my-paper.pdf EVAL_SLUG=my-slug just run-all
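
The tier argument and the env-var overrides compose (a sketch, assuming the justfile passes both through), so you can run every level on the cheap tier against your own paper in one command:

EVAL_PDF=my-paper.pdf EVAL_SLUG=my-slug just run-all cheap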

Ablation levels

Level          Schema context provided
baseline       None — LLM relies on training knowledge only
class_names    Class names, descriptions, and mappings
full_classes   + slot definitions with ranges & cardinality
with_enums     + enumeration values and ontology meanings
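
To make the levels concrete, here is a minimal LinkML fragment; the class, slot, and enum names and the ontology IDs are invented for illustration, not SOMA's actual schema. baseline sends none of it; class_names sends the class name, description, and mappings; full_classes adds the slot definitions with their ranges and cardinality; with_enums adds the enum values and their ontology meanings:

classes:
  ChemicalExposure:
    description: A single measured exposure to a chemical agent
    exact_mappings:
      - ExO:0000002            # hypothetical ontology mapping
    slots:
      - agent
      - route

slots:
  agent:
    description: The chemical agent involved
    range: string
  route:
    description: How the exposure occurred
    range: ExposureRouteEnum
    required: true             # cardinality: exactly one route per exposure

enums:
  ExposureRouteEnum:
    permissible_values:
      inhalation:
        meaning: NCIT:C38300   # hypothetical ontology meaning
      dermal:
        meaning: NCIT:C38305   # hypothetical ontology meaning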

Results are written to results/<level>/<model>/<paper>.yaml.
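
For example, after just run-all with two models (the model names and paper slug below are illustrative), the results tree looks like:

results/
├── baseline/
│   ├── claude-sonnet-4/
│   │   └── my-slug.yaml
│   └── gpt-4o/
│       └── my-slug.yaml
├── class_names/
├── full_classes/
└── with_enums/

The other three levels follow the same model/paper layout as baseline.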

Tests & QC

just test       # run tests (no API calls)
just coverage   # tests with coverage report
just fix        # auto-fix lint/format (ruff)

License

MIT
