Updated Mar. 20, 2026

AI Capabilities

Our database of benchmark results tracks the performance of leading AI models on challenging tasks. It includes results from benchmarks evaluated internally by Epoch AI as well as data collected from external sources. Explore trends in AI capabilities across time, by benchmark, or by model.

Benchmarking updates

Mar. 9, 2026

We added APEX-Agents, ARC-AGI-2, and HLE to the Epoch Capabilities Index. GPT-5.4 Pro now leads, narrowly ahead of Gemini 3.1 Pro.

See the update

Mar. 5, 2026

GPT-5.4 Pro set a new record on FrontierMath, scoring 50% on Tiers 1–3 and 38% on Tier 4. We also evaluated it on FrontierMath: Open Problems.

Read the thread

Jan. 27, 2026

We released FrontierMath: Open Problems, which tests AI on unsolved math research problems.

Discover the benchmark

Trusted by leaders at OpenAI, DeepMind, and governments worldwide

Need deeper insights? Our team offers custom research and advisory services.