Multi-language browser automation plugin enabling AI agents to browse websites, interact with elements, and extract data. Available in TypeScript, Python, and Rust with full feature parity.
```
plugin-browser/
├── protocol/      # Shared protocol definitions (JSON Schema)
├── typescript/    # TypeScript/Node.js implementation
├── python/        # Python implementation
├── rust/          # Rust implementation
└── README.md      # This file
```
All implementations support:
- Navigation: Navigate to URLs, go back/forward, refresh pages
- AI-Powered Interactions: Click, type, and select elements using natural language
- Data Extraction: Extract structured data from web pages
- Screenshots: Capture page screenshots
- CAPTCHA Solving: Automatic CAPTCHA solving (Turnstile, reCAPTCHA, hCaptcha)
- Session Management: Handle multiple browser sessions
- Security: URL validation, domain filtering, rate limiting
- Retry Logic: Exponential backoff for reliability
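To illustrate the retry behaviour named in the list above, here is a minimal sketch of exponential backoff. The function names, the default delays, and the growth factor are assumptions chosen for illustration; the plugin's actual retry parameters are not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(
    max_retries: int,
    base_delay: float = 0.5,
    factor: float = 2.0,
    max_delay: float = 30.0,
) -> list[float]:
    """Delay before each retry: base, base*factor, base*factor^2, ... capped at max_delay."""
    return [min(base_delay * factor**attempt, max_delay) for attempt in range(max_retries)]


def retry_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation; on failure wait an exponentially growing delay and try again."""
    delays = backoff_delays(max_retries, base_delay)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the helper can be exercised in tests without real waiting.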
TypeScript (install and use):

```bash
cd typescript
npm install
npm run build
```

```typescript
import { browserPlugin } from "@elizaos/plugin-browser";

const agent = {
  plugins: [browserPlugin],
};
```

Python (install and use):

```bash
cd python
pip install -e .
```

```python
import asyncio

from elizaos_browser import create_browser_plugin


async def main() -> None:
    plugin = create_browser_plugin()
    await plugin.init()
    await plugin.handle_action("BROWSER_NAVIGATE", "Go to google.com")


asyncio.run(main())
```

Rust (install and use):

```bash
cd rust
cargo build
```

```rust
use elizaos_browser::create_browser_plugin;

// Run inside an async function that returns a Result.
let mut plugin = create_browser_plugin(None);
plugin.init().await?;
plugin.handle_action("BROWSER_NAVIGATE", "Go to google.com").await?;
```

All implementations support these actions:

| Action | Description | Example |
|---|---|---|
| `BROWSER_NAVIGATE` | Navigate to a URL | "Go to google.com" |
| `BROWSER_BACK` | Go back in history | "Go back" |
| `BROWSER_FORWARD` | Go forward in history | "Go forward" |
| `BROWSER_REFRESH` | Refresh the page | "Refresh the page" |
| `BROWSER_CLICK` | Click on an element | "Click the search button" |
| `BROWSER_TYPE` | Type text into a field | "Type 'hello' in the search box" |
| `BROWSER_SELECT` | Select a dropdown option | "Select 'US' from the country dropdown" |
| `BROWSER_EXTRACT` | Extract data from the page | "Extract the main heading" |
| `BROWSER_SCREENSHOT` | Take a screenshot | "Take a screenshot" |
| `BROWSER_SOLVE_CAPTCHA` | Solve a CAPTCHA | "Solve the captcha" |
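
As a rough sketch of how the actions above might be chained, the helper below runs a list of (action, prompt) steps through any object exposing an async `handle_action(action, message)` method, as in the Python usage shown earlier. `run_steps`, `ActionHandler`, and `EchoPlugin` are hypothetical names introduced for this example, and the return value of `handle_action` is an assumption; `EchoPlugin` is only a stand-in so the sketch runs without a browser.

```python
import asyncio
from typing import Protocol

# Action names taken from the table above.
BROWSER_ACTIONS = frozenset({
    "BROWSER_NAVIGATE", "BROWSER_BACK", "BROWSER_FORWARD", "BROWSER_REFRESH",
    "BROWSER_CLICK", "BROWSER_TYPE", "BROWSER_SELECT", "BROWSER_EXTRACT",
    "BROWSER_SCREENSHOT", "BROWSER_SOLVE_CAPTCHA",
})


class ActionHandler(Protocol):
    """Anything with the handle_action coroutine (hypothetical interface)."""

    async def handle_action(self, action: str, message: str) -> object: ...


async def run_steps(plugin: ActionHandler, steps: list[tuple[str, str]]) -> list[object]:
    """Validate every action name first, then run the steps in order."""
    for action, _ in steps:
        if action not in BROWSER_ACTIONS:
            raise ValueError(f"Unknown browser action: {action}")
    return [await plugin.handle_action(action, message) for action, message in steps]


class EchoPlugin:
    """Stand-in plugin that echoes its input instead of driving a browser."""

    async def handle_action(self, action: str, message: str) -> object:
        return f"{action}: {message}"


if __name__ == "__main__":
    steps = [
        ("BROWSER_NAVIGATE", "Go to google.com"),
        ("BROWSER_CLICK", "Click the search button"),
    ]
    print(asyncio.run(run_steps(EchoPlugin(), steps)))
```

Validating all names before running anything means a typo in a later step does not leave the browser in a half-completed state.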

| Provider | Description |
|---|---|
| `BROWSER_STATE` | Current browser session state (URL, title, session info) |

All implementations use the same environment variables:
```bash
# Browser settings
BROWSER_HEADLESS=true
BROWSER_ENABLED=true
BROWSER_SERVER_PORT=3456

# Cloud browser (optional)
BROWSERBASE_API_KEY=your_api_key
BROWSERBASE_PROJECT_ID=your_project_id

# AI providers for intelligent interactions (optional)
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2-vision

# CAPTCHA solving (optional)
CAPSOLVER_API_KEY=your_capsolver_key
```

The `protocol/` directory contains:

- `schema.json`: JSON Schema defining all types, actions, and messages
- `README.md`: Protocol documentation

All implementations communicate with the browser server using WebSocket and follow the same message protocol, ensuring interoperability.
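
For the environment variables listed earlier, the following is a minimal sketch of reading them into a typed settings object. `BrowserSettings` and `load_settings` are hypothetical names, and the fallback values simply mirror the example values shown in this README; they are assumptions, not confirmed defaults of the plugin.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BrowserSettings:
    headless: bool
    enabled: bool
    server_port: int
    browserbase_api_key: Optional[str]
    capsolver_api_key: Optional[str]


def _parse_bool(value: str) -> bool:
    """Treat common truthy spellings as True; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> BrowserSettings:
    """Build settings from an environment mapping (fallbacks are assumed, not documented)."""
    return BrowserSettings(
        headless=_parse_bool(env.get("BROWSER_HEADLESS", "true")),
        enabled=_parse_bool(env.get("BROWSER_ENABLED", "true")),
        server_port=int(env.get("BROWSER_SERVER_PORT", "3456")),
        # Optional keys: an unset or empty variable becomes None.
        browserbase_api_key=env.get("BROWSERBASE_API_KEY") or None,
        capsolver_api_key=env.get("CAPSOLVER_API_KEY") or None,
    )
```

Passing the environment in as a mapping keeps the loader easy to test, for example `load_settings({"BROWSER_HEADLESS": "false"})`.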
All implementations use consistent error codes:

| Code | Description |
|---|---|
| `SERVICE_NOT_AVAILABLE` | Browser service not running |
| `SESSION_ERROR` | Session management error |
| `NAVIGATION_ERROR` | Page navigation failed |
| `ACTION_ERROR` | Browser action failed |
| `SECURITY_ERROR` | Security validation failed |
| `CAPTCHA_ERROR` | CAPTCHA solving failed |
| `TIMEOUT_ERROR` | Operation timed out |

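
One way a caller might model the codes above is an enum plus an exception that carries the code. This is a sketch only: `BrowserErrorCode`, `BrowserError`, and `is_retryable` are hypothetical names, and the choice of which codes count as retryable is an assumption made for illustration.

```python
from enum import Enum


class BrowserErrorCode(str, Enum):
    """Error codes from the table above."""

    SERVICE_NOT_AVAILABLE = "SERVICE_NOT_AVAILABLE"
    SESSION_ERROR = "SESSION_ERROR"
    NAVIGATION_ERROR = "NAVIGATION_ERROR"
    ACTION_ERROR = "ACTION_ERROR"
    SECURITY_ERROR = "SECURITY_ERROR"
    CAPTCHA_ERROR = "CAPTCHA_ERROR"
    TIMEOUT_ERROR = "TIMEOUT_ERROR"


class BrowserError(Exception):
    """Exception carrying one of the shared error codes."""

    def __init__(self, code: BrowserErrorCode, message: str) -> None:
        super().__init__(f"{code.value}: {message}")
        self.code = code


# Assumption: transient failures are worth retrying, policy failures are not.
RETRYABLE_CODES = frozenset({
    BrowserErrorCode.SERVICE_NOT_AVAILABLE,
    BrowserErrorCode.NAVIGATION_ERROR,
    BrowserErrorCode.TIMEOUT_ERROR,
})


def is_retryable(error: BrowserError) -> bool:
    """Return True when the error's code is in the (assumed) retryable set."""
    return error.code in RETRYABLE_CODES
```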
- URL validation with domain allowlists/blocklists
- Input sanitization to prevent XSS/injection
- Rate limiting for actions and sessions
- Protocol restrictions (HTTP/HTTPS only by default)
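
The URL checks listed above can be sketched as follows. This is an illustrative example using only the Python standard library, not the plugin's actual validator; `is_url_allowed` and its parameters are hypothetical names. It assumes that a blocklist match wins over an allowlist match and that a listed domain also covers its subdomains.

```python
from typing import Iterable, Optional
from urllib.parse import urlsplit

ALLOWED_SCHEMES = frozenset({"http", "https"})


def _matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def is_url_allowed(
    url: str,
    allowlist: Optional[Iterable[str]] = None,
    blocklist: Iterable[str] = (),
) -> bool:
    """Accept only HTTP/HTTPS URLs whose host passes the domain lists."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL
    host = (parts.hostname or "").lower()
    if parts.scheme not in ALLOWED_SCHEMES or not host:
        return False
    if any(_matches(host, d.lower()) for d in blocklist):
        return False
    if allowlist is not None:
        return any(_matches(host, d.lower()) for d in allowlist)
    return True
```

Note that a bare host such as `google.com` has no scheme and is rejected by this sketch, so a caller would need to prepend `https://` before validating.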
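
TypeScript (build, test, typecheck):

```bash
cd typescript
npm install
npm run build
npm test
npm run typecheck
```

Python (test, typecheck, lint):

```bash
cd python
pip install -e ".[dev]"
pytest
mypy elizaos_browser
ruff check elizaos_browser
```

Rust (build, test, lint, format):

```bash
cd rust
cargo build
cargo test
cargo clippy
cargo fmt
```

License: MIT
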
Built with:
- Stagehand - AI-first browser automation framework
- Playwright - Cross-browser automation
- CapSolver - CAPTCHA solving service