agent-browser
Browser automation CLI designed for AI agents. Compact text output minimizes context usage. 100% native Rust.
npm install -g agent-browser # all platforms
brew install agent-browser # macOS
agent-browser install # Download Chrome (first time)
# or try without installing
npx agent-browser open example.comFeatures
- Agent-first - Compact text output uses fewer tokens than JSON, designed for AI context efficiency
- Ref-based - Snapshot returns accessibility tree with refs for deterministic element selection
- Fast - Native Rust CLI for instant command parsing
- Complete - 50+ commands for navigation, forms, screenshots, network, storage
- Sessions - Multiple isolated browser instances with separate auth
- Cross-platform - macOS, Linux, Windows with native binaries
Works with
Claude Code, Cursor, GitHub Copilot, OpenAI Codex, Google Gemini, opencode, and any agent that can run shell commands.
Example
# Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot -i
# Output:
# - heading "Example Domain" [ref=e1]
# - link "More information..." [ref=e2]
# Interact using refs
agent-browser click @e2
agent-browser screenshot page.png
agent-browser closeWhy refs?
The snapshot command returns a compact accessibility tree where each element
has a unique ref like @e1, @e2. This provides:
- Context-efficient - Text output uses ~200-400 tokens vs ~3000-5000 for full DOM
- Deterministic - Ref points to exact element from snapshot
- Fast - No DOM re-query needed
- AI-friendly - LLMs parse text output naturally
Architecture
Client-daemon architecture for optimal performance:
- Rust CLI - Parses commands, communicates with daemon
- Native Daemon - Pure Rust daemon using direct CDP, manages Chrome via Chrome DevTools Protocol
Daemon starts automatically and persists between commands.
Platforms
Native Rust binaries for macOS (ARM64, x64), Linux (ARM64, x64), and Windows (x64).