A curated list of tools, frameworks, and resources for building AI agents that can browse and interact with the web.
Steel is an open-source browser API built specifically for AI agents. We make it easy to build AI applications that can effectively interact with the web.
✨ Get started for free here.
AI agents that autonomously navigate and interact with the web through a user-friendly interface (a.k.a. browser agents).
- Surf.new - An open-source playground for chatting with different web agents.
- OpenAI Operator - OpenAI's AI agents that can browse the web for you.
- Browser-Use - SOTA agent and framework that makes the web LLM-friendly.
- Skyvern-AI - Framework to automate browser-based workflows.
- Google Project Mariner - A research prototype exploring the future of human-agent interaction, starting with your browser.
- Sentience API - A tool for building more deterministic and explainable web agents using semantic geometry on web content.
- Runner H - State-of-the-art AI agent that helps automate complex, cumbersome, multi-step tasks without repetitive manual input.
- WebVoyager (Agent) - Vision-enabled web agent.
- AgentGPT - Deploy autonomous AI agents in your browser.
- Agent-E - Agent & framework with HTML DOM distillation.
- Manus - A general AI agent that can execute long-running tasks across tools like browsers, terminals, and text editors.
- doBrowser - An AI-powered Chrome extension that understands natural language and takes actions in your browser on your behalf.
- WebSurfer (Autogen) - MultimodalWebSurfer is a multimodal agent that can search the web and visit web pages.
- Magentic-One - A generalist multi-agent system for solving complex tasks including surfing the web via Autogen's MultimodalWebSurfer.
- Harpa.ai - An AI-powered Chrome extension & browser agent that understands natural language and takes actions on your behalf.
- Yutori - A multi-agent system that executes browser-based tasks in parallel given a natural language prompt.
- Automina - AI browser automation tool with natural language control.
- rtrvr.ai - AI web agent Chrome extension that autonomously does tasks, scrapes to Sheets, and calls APIs with prompts in your own browser.
- Nanobrowser - An open-source & local-first AI web agent Chrome extension with flexible LLM options and a multi-agent system.
- Browserable - An open-source & self-hostable browser automation library for AI agents.
- Tongyi WebAgent - WebAgent for information seeking built by Tongyi Lab, Alibaba Group.
- Openwork - An MIT-licensed, open alternative to Anthropic's Cowork built with Opencode and dev-browser. Supports multiple LLM providers for launching computer-use agents to automate browser workflows.
- Dassi - An AI coworking agent in your browser that automates tasks, navigates pages, and works with files and 2000+ apps from a side panel.
- Anthropic Computer Use - Computer use agent that can control your browser.
- Self-Operating Computer Framework - A framework to enable multimodal models to operate a computer.
- Highlight - Desktop activity layer that helps models understand your workflow and complete tasks faster.
- OpenInterpreter - An open-source CLI-based agent that can write & execute code as well as control your browser.
- UI-TARS - A GUI agent model designed to interact seamlessly with GUIs using human-like perception, reasoning, and action capabilities.
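Several entries above (Agent-E, for example) mention distilling a page's HTML DOM before handing it to a model. As a rough illustration only, and not any listed project's actual implementation, the sketch below uses Python's standard `html.parser` to keep just the interactive elements and a handful of attributes. The `distill` function, the `Distiller` class, and both whitelists are assumptions made for this example.

```python
from html.parser import HTMLParser

# Hypothetical whitelists: which tags and attributes survive distillation.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
KEPT_ATTRS = ("id", "name", "type", "href", "placeholder", "aria-label")


class Distiller(HTMLParser):
    """Collects interactive elements and drops everything else."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open = None  # element currently collecting inner text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        element = {"tag": tag, "text": ""}
        element.update({k: attr_map[k] for k in KEPT_ATTRS if attr_map.get(k)})
        self.elements.append(element)
        # <input> is a void element, so it never has inner text to collect.
        self._open = None if tag == "input" else element

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data.strip()


def distill(html):
    """Return a compact list of dicts describing the page's interactive elements."""
    parser = Distiller()
    parser.feed(html)
    parser.close()
    return parser.elements


page = '<div><p>Hello</p><a href="/x">Go</a><input name="q" type="text"><button id="b">Search</button></div>'
for element in distill(page):
    print(element)
```

The resulting list is much smaller than the raw markup, which is the general appeal of distillation; real frameworks typically also track visibility, nesting, and element positions.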
Tools, frameworks and libraries that translate natural language instructions into web interactions.
- Asteroid.ai - Hosted browser agents for SMEs to automate complex workflows.
- PulsarRPA - AI-powered browser automation for data extraction.
- VimGPT - Experimental project using GPT-4 Vision to browse the web via the Vimium extension.
- Cekura.io - An AI browser agent that helps companies maintain up-to-date documentation.
- Dex by Dexterity - An AI coworker that embeds into and controls your browser.
- Autobrowser - A free, experimental Chrome extension that leverages Claude Computer Use to automate tasks in your browser.
- Bytebot - AI-powered scraping automations that evolve with your target sites.
- Runcopycat - A no-code browser automation platform that turns screen recordings into reusable automated workflows.
- Bardeen.ai - A Chrome extension that enables AI-powered browser automations, allowing users to automate tasks and workflows directly within the browser.
- Starizon.ai - Browser assistant for web task automation.
- BrowserGPT - Browser extension for page summaries and Q&A.
- Browse.ai - Chrome extension for web scraping that can leverage AI for structured data extraction.
- Strawberry Browser - A personal assistant that sits in your browser, automates repetitive web actions, and learns your workflows.
- Deta.surf - An integrated platform that combines a browser, file manager, and AI assistant with browser-level context.
- Comet by Perplexity - An AI-powered browser by Perplexity. Few details are available yet.
- Dia Browser - AI-first web browser envisioned by The Browser Company (Arc).
- Reworkd - No-code web data extraction solution using agentic AI.
- Onpiste - Chrome extension that uses AI to control and read webpages, including auto summaries, web automation, scraping, and MCP support.
- Steel.dev - Open-source headless browser API built specifically for AI agents and apps.
- Omniparser - Tool for parsing GUIs for vision-based agents.
- LaVague - Framework for natural language web automation.
- LangChain Playwright Toolkit - Toolkit integration with AI agents.
- Browserbase - A headless browser API for AI workflows.
- Stagehand - AI web browsing framework.
- Tarsier - Vision utilities library for web interaction agents.
- AutoGPT - Experimental agent for task completion and web browsing.
- TinyFish - Remote web agents that execute tasks on any website and return structured JSON via a single API call.
- Bytebot - Containerized computer use agent framework with a virtual desktop environment.
- Lumen - Vision-first browser agent with self-healing deterministic replay. Screenshot → model → action loop over CDP, multi-provider (Anthropic, Google, OpenAI), action caching for zero-token reruns.
- BabelWrap - HTTP API and MCP server that lets AI agents interact with websites through natural language instead of CSS selectors.
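Some of the tools above describe a screenshot, model, action loop with cached actions for cheap reruns. The following is a minimal sketch of that general pattern, assuming a cache keyed by the task and a hash of the screenshot. It is not taken from any listed project: `run_task`, `FakeBrowser`, `FakeModel`, and the string-based action encoding are all hypothetical stand-ins for a real browser driver and a real model client.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Action = str  # hypothetical encoding, e.g. "click:next" or "done"


def run_task(
    task: str,
    observe: Callable[[], bytes],
    act: Callable[[Action], None],
    choose_action: Callable[[str, bytes], Action],
    cache: Dict[Tuple[str, str], Action],
    max_steps: int = 10,
) -> List[Action]:
    """Observe, pick an action, act; stop when the chosen action is "done"."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = observe()
        key = (task, hashlib.sha256(screenshot).hexdigest())
        action = cache.get(key)
        if action is None:
            # Cache miss: ask the model, then remember its answer.
            action = choose_action(task, screenshot)
            cache[key] = action
        if action == "done":
            break
        act(action)
        history.append(action)
    return history


class FakeBrowser:
    """Stand-in for a real browser: every action advances to the next page."""

    def __init__(self) -> None:
        self.page = 0

    def observe(self) -> bytes:
        return f"page-{self.page}".encode()

    def act(self, action: Action) -> None:
        self.page += 1


class FakeModel:
    """Stand-in for a vision model; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def choose(self, task: str, screenshot: bytes) -> Action:
        self.calls += 1
        return "done" if screenshot == b"page-2" else "click:next"


cache: Dict[Tuple[str, str], Action] = {}
model = FakeModel()

first_browser = FakeBrowser()
first_run = run_task("demo", first_browser.observe, first_browser.act, model.choose, cache)
calls_after_first_run = model.calls

second_browser = FakeBrowser()
second_run = run_task("demo", second_browser.observe, second_browser.act, model.choose, cache)
calls_after_second_run = model.calls
```

In this toy setup the second run replays the same actions without any further model calls, because every screenshot hash is already in the cache. Exact-hash matching is brittle on real pages, where a single changed pixel produces a new key, so a production system would need a more tolerant notion of "same state".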
Web crawlers & scrapers that leverage AI to navigate websites and extract content.
- FireCrawl - APIs for turning websites into LLM-friendly markdown.
- Crawl4AI - Open-source, LLM-friendly web crawler & scraper.
- ScrapeGraphAI - Python scraper based on AI.
- WebAgent (OpenAgents) - The web-browsing agent module of the OpenAgents platform (HKU). Enables autonomous navigation of websites via natural language, as part of a larger multi-modal agent framework.
- Expand.ai - Turns any website into a type-safe API you can rely on.
- LLM Scraper - Uses LLMs for intelligent scraping and content understanding.
- Plasmate - Open-source headless browser engine for AI agents. Compiles HTML to Semantic Object Model (SOM) with 17.5x token compression. 13 MCP tools. First browser tool on the MCP Registry. Rust, Apache-2.0.
- SpiderCreator - Create complex Playwright spiders with natural language prompts.
Utilities that help agents search the web or query web data via natural language.
- AgentQL - A query language and toolkit that makes the web AI-ready.
- SerpAPI - Search API that provides Google Search results for your agents.
- Serper.dev - Performant and cost-effective search API that provides Google Search results for your agents.
- Jina.ai - Neural search platform for web data.
- Exa.ai - The fastest and most accurate web search API for AI agents.
- Not Human Search - Search engine that indexes 1,750+ agent-first tools ranked by agentic readiness. Available as an MCP server with tools for searching, scoring, and monitoring agent infrastructure.
Datasets, benchmarks, and notable research efforts for evaluating and advancing web-capable AI agents.
- Web Agent Leaderboard - Leaderboard compiling AI agent products and their performance on widely used WebVoyager benchmarks.
- Web Games by Convergence - A collection of challenges designed for testing general-purpose web-browsing AI agents.
- Bananalyzer - An open-source evaluation framework for web-based AI agents.
- Mind2Web - A large-scale dataset for generalist web agents.
- World of Bits: An Open-Domain Platform for Web-Based Agents - OpenAI's research paper that introduces World of Bits: a platform where agents complete tasks on the internet by performing low-level keyboard and mouse actions.
- MiniWoB++ - A classic suite of 104 mini web browser tasks in a synthetic environment. It is an extension of the OpenAI MiniWoB benchmark.
- WebTaskBench - 51-URL benchmark comparing HTML vs Markdown vs SOM representations for AI agents. Measures token efficiency, latency, and accuracy across GPT-4o and Claude Sonnet 4.
- WebArena - A realistic, self-hostable web environment for autonomous agents. Includes official leaderboard tracking agent performance.
- WebCanvas - An online evaluation framework for dynamic web environments. Tests agents on live websites.
- WebGPT - OpenAI's browser-assisted question-answering research project.
- WebShop - A simulated e-commerce shopping environment with 1.18M real Amazon products.
- WebVoyager (Benchmark) - Vision-enabled benchmark for real-world website interaction with large multimodal models.
- WorkArena - A suite of 33 browser-based tasks for enterprise "knowledge worker" scenarios.
- BrowserGym by ServiceNow - A gym environment for web task automation.
- TimeWarp - A benchmark on historical versions of web UI.
- ClawBench - 153 everyday tasks on 144 live production websites across 15 categories, with a submission-interception layer that blocks only the final write request to preserve real-site behavior without side effects.
Resources for learning how to build, deploy, or utilize AI web agents.
- LangGraph WebVoyager Tutorial - Tutorial demonstrating how to build a web navigation agent using LangGraph agents, vision models, and WebVoyager.
- Build an AI Browser Agent - Step-by-step guide to create an AI that browses the web using Playwright and the Browser-Use library.
- Install & Run Browser-Use Locally - Instructions on installing the open-source Browser-Use agent with a local LLM.
- Build a Browser Agent with DeepSeek - Walks through deploying a Browser-Use web UI agent powered by the DeepSeek model on a cloud VM.
Historical or inactive projects are tracked in ARCHIVE.md.
Feel free to reach out at [email protected] or on Discord.
- Follow @steeldotdev on X.
- Join the Discord community.
Contributions of any kind are welcome; just follow the guidelines!