AI agents interact with external services through two dominant paradigms today: CLIs that were originally built for humans, and structured tool protocols like MCP. Both impose significant overhead.
AXI is a new paradigm: agent-native CLI tools built on 10 design principles that treat token budget as a first-class constraint.
Browser benchmark, evaluated across 490 runs (14 tasks × 7 conditions × 5 repeats) using Claude Sonnet 4.6:
| Condition | Success | Avg Cost | Avg Duration | Avg Turns |
|---|---|---|---|---|
| chrome-devtools-axi | 100% | $0.074 | 21.5s | 4.5 |
| dev-browser | 99% | $0.078 | 28.6s | 4.9 |
| agent-browser | 99% | $0.088 | 24.6s | 4.8 |
| chrome-devtools-mcp-compressed | 100% | $0.091 | 29.7s | 7.6 |
| chrome-devtools-mcp-search | 99% | $0.096 | 29.4s | 7.5 |
| chrome-devtools-mcp | 99% | $0.101 | 26.0s | 6.2 |
| chrome-devtools-mcp-code | 100% | $0.120 | 36.2s | 6.4 |
GitHub benchmark, evaluated across 425 runs (17 tasks × 5 conditions × 5 repeats) using Claude Sonnet 4.6:
| Condition | Success | Avg Cost | Avg Duration | Avg Turns |
|---|---|---|---|---|
| gh-axi | 100% | $0.050 | 15.7s | 3 |
| gh (CLI) | 86% | $0.054 | 17.4s | 3 |
| GitHub MCP | 87% | $0.148 | 34.2s | 6 |
| GitHub MCP + ToolSearch | 82% | $0.147 | 41.1s | 8 |
| MCP + Code Mode | 84% | $0.101 | 43.4s | 7 |
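
To make the cost gap in the GitHub table concrete, the following minimal sketch computes each condition's average cost relative to `gh-axi` and checks the reported run count. The figures are copied from the table; the names `avg_cost` and `cost_ratio` are illustrative and not part of any AXI tool.

```python
# Average cost per run (USD), copied from the GitHub benchmark table above.
avg_cost = {
    "gh-axi": 0.050,
    "gh (CLI)": 0.054,
    "GitHub MCP": 0.148,
    "GitHub MCP + ToolSearch": 0.147,
    "MCP + Code Mode": 0.101,
}

# Benchmark matrix dimensions as stated in the text.
TASKS, CONDITIONS, REPEATS = 17, 5, 5
total_runs = TASKS * CONDITIONS * REPEATS  # 17 * 5 * 5 = 425


def cost_ratio(condition: str, baseline: str = "gh-axi") -> float:
    """Return how many times the baseline's average cost a condition incurs."""
    return round(avg_cost[condition] / avg_cost[baseline], 2)


if __name__ == "__main__":
    print(f"total runs: {total_runs}")
    for name in avg_cost:
        print(f"{name}: {cost_ratio(name)}x the cost of gh-axi")
```

On these numbers, GitHub MCP costs roughly 2.96 times as much per run as `gh-axi`, while the plain `gh` CLI is close on cost (about 1.08 times) but trails on success rate.
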
Reference AXI implementations:
- gh-axi: GitHub operations
- chrome-devtools-axi: Browser automation
npm install -g gh-axi
npm install -g chrome-devtools-axi

Add to your CLAUDE.md or AGENTS.md:
Use `gh-axi` for GitHub and `chrome-devtools-axi` for browser automation.
These principles define what makes a CLI tool "an AXI":
| # | Principle | Summary |
|---|---|---|
| 1 | Token-efficient output | Use TOON format for ~40% token savings over JSON |
| 2 | Minimal default schemas | 3–4 fields per list item, not 10 |
| 3 | Content truncation | Truncate large text with size hints and --full escape hatch |
| 4 | Pre-computed aggregates | Include aggregated counts and statuses that eliminate round trips |
| 5 | Definitive empty states | Explicit "0 results" rather than ambiguous empty output |
| 6 | Structured errors & exit codes | Idempotent mutations, structured errors, no interactive prompts |
| 7 | Ambient context | Self-install into session hooks so agents see state before invoking |
| 8 | Content first | Running with no arguments shows live data, not help text |
| 9 | Contextual disclosure | Include next-step suggestions after each output |
| 10 | Consistent way to get help | Concise per-subcommand reference when agents need it |
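
As a rough illustration of how several of these principles might shape a command's output, here is a minimal sketch covering principles 2, 3, 4, 5, and 9. It is not taken from `gh-axi` or any reference implementation: the field names, the 200-character limit, the `next:` hint wording, and the pipe-separated layout are all assumptions, and the output is plain text rather than TOON.

```python
from collections import Counter

# Principle 2: a small default schema (hypothetical field names).
DEFAULT_FIELDS = ("number", "title", "state")
MAX_BODY_CHARS = 200  # hypothetical truncation limit


def truncate(text: str, limit: int = MAX_BODY_CHARS) -> str:
    """Principle 3: cut long text, report its full size, and name the escape hatch."""
    if len(text) <= limit:
        return text
    return f"{text[:limit]}... [truncated, {len(text)} chars total; rerun with --full]"


def render_issues(issues: list[dict]) -> str:
    """Render issue records as compact, agent-oriented text."""
    if not issues:
        # Principle 5: state "0 results" explicitly instead of printing nothing.
        return "issues: 0 results\nnext: try a broader filter"
    # Principle 4: pre-computed counts spare the agent a follow-up call.
    counts = Counter(issue["state"] for issue in issues)
    summary = ", ".join(f"{n} {state}" for state, n in sorted(counts.items()))
    lines = [f"issues: {len(issues)} total ({summary})"]
    for issue in issues:
        # Only the default fields are shown; other keys in the record are omitted.
        lines.append(" | ".join(str(issue[field]) for field in DEFAULT_FIELDS))
    # Principle 9: suggest a plausible next step after the data.
    lines.append("next: view <number> for details")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        {"number": 1, "title": "Fix bug", "state": "open", "body": "long text"},
        {"number": 2, "title": "Docs", "state": "closed", "body": "more text"},
    ]
    print(render_issues(sample))
    print(render_issues([]))
```

The design choice throughout is to spend a few extra tokens on a summary line and a hint so the agent can avoid an additional turn, while keeping each record to a handful of fields.
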
Install the AXI skill to get the design guidelines and scaffolding for building an AXI-compliant CLI:
npx skills add kunchenguid/axi

This installs the AXI skill, a detailed guide with examples for each principle that your coding agent can reference while building.
The browser benchmark harness lives in bench-browser/. It compares browser automation tools across 16 browsing tasks.
cd bench-browser
npm install
# Run a single condition × task
npm run bench -- run --condition chrome-devtools-axi --task read_static_page
# Run the full matrix
npm run bench -- matrix --repeat 5
# Generate summary report
npm run bench -- report

Published results (490 runs): bench-browser/published-results/report.md
The GitHub benchmark harness lives in bench-github/. It runs agent tasks across different interface conditions and grades results with an LLM judge.
cd bench-github
npm install
# Run a single condition × task
npm run bench -- run --condition axi --task merged_pr_ci_audit --repeat 5 --agent claude
# Run the full matrix
npm run bench -- matrix --repeat 5 --agent claude
# Generate summary report
npm run bench -- report

Published results (425 runs): bench-github/published-results/STUDY.md
