RAW HTTP: NO HEADLESS BROWSER OVERHEAD
MARKDOWN · JSON · HTML · LLM-READY FORMATS
MCP SERVER FOR AI AGENTS
TLS FINGERPRINT IMPERSONATION
EXTRACT · SUMMARIZE · DIFF · BRAND
SITEMAP DISCOVERY & DEEP CRAWLING
SELF-HOST OR USE OUR CLOUD API
BUILT IN RUST: FAST BY DEFAULT
DEEP RESEARCH: AI SYNTHESIZES REPORTS FROM 50+ SOURCES
WEB SEARCH: QUERY AND SCRAPE SEARCH RESULTS IN ONE CALL
AGENT SCRAPE: GIVE A GOAL, AI EXTRACTS WHAT YOU NEED
URL MONITORING: WATCH PAGES FOR CHANGES WITH WEBHOOKS
BONUS CREDITS: EARN FREE CREDITS BY STARRING AND REFERRING

The extraction engine

Every page.
Every format.
One claw.

Turn any website into LLM-ready markdown, JSON, or structured data. No browser. No Selenium. Pure HTTP speed.

One command setup

MCP + CLI

Give your AI agents web data with a single command. Auto-detects your tools and configures everything.

Terminal
$ npx create-webclaw

Works with Claude Code, Cursor, Windsurf, Codex, OpenCode, and more.

Every page.
Every defense.

01

Fast by default. Smart when needed.

118ms average for static pages. Multi-layer rendering pipeline for JS-heavy sites. You don't configure anything — the engine picks the fastest path automatically.

02

Best-in-class bot protection.

Challenge pages, CAPTCHAs, browser fingerprinting — handled transparently. No manual cookies, no config. Your requests just work, even on the hardest sites.

03

Agentic scraping.

Give a goal, get structured data. The AI agent reasons about page content, clicks buttons, navigates, and extracts exactly what you asked for. Powered by the best available models.

04

Every format, every extraction.

Markdown, JSON, plain text, LLM-optimized. Schema-based extraction, prompt-based extraction, summarization, brand identity, content diffing. 14 endpoints, one API key.

05

Built for AI agents.

MCP server with 12 tools for Claude, Cursor, Windsurf, OpenCode, Codex, Antigravity, and any MCP client. REST API for everything else. Web search, batch processing, crawling, sitemap discovery.

06

Documents, screenshots, mobile.

Auto-detects PDFs, DOCX, XLSX, CSV. Take full-page screenshots. Mobile emulation for cleaner layouts. Browser actions — click, type, scroll, wait — before extraction.

07

Firecrawl compatible.

Drop-in /v2 endpoints. Change your base URL, keep your existing SDK code. Same API shape, better extraction quality, faster response times.
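
As a rough illustration of the "change your base URL" idea, the sketch below builds (but does not send) a Firecrawl-style scrape request. The base URL `https://api.webclaw.example`, the `/v2/scrape` path, the `formats` field, and the bearer-token header are assumptions based on the Firecrawl-style API shape, not details confirmed by this page.

```python
import json
import urllib.request

# Hypothetical base URL: replace with the real endpoint of your deployment.
WEBCLAW_BASE_URL = "https://api.webclaw.example"


def build_scrape_request(base_url: str, api_key: str, target_url: str) -> urllib.request.Request:
    """Build (without sending) a Firecrawl-style POST to <base_url>/v2/scrape."""
    # Assumed request body shape: target URL plus the desired output formats.
    payload = {"url": target_url, "formats": ["markdown"]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v2/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Only the base URL changes when switching providers; the payload stays the same.
req = build_scrape_request(WEBCLAW_BASE_URL, "YOUR_API_KEY", "https://example.com")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out because the endpoint here is a placeholder.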

08

Deep content recovery.

Embedded JSON, structured data, server-rendered payloads — extracted even when the visible DOM is empty. Multiple fallback strategies ensure nothing gets missed. If the content exists, webclaw finds it.
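
To make the idea of recovering content from an empty DOM concrete, here is a minimal sketch that pulls JSON payloads out of `<script>` tags using only the Python standard library. It illustrates the general technique, not webclaw's actual implementation; the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class EmbeddedJsonParser(HTMLParser):
    """Collect JSON payloads from JSON-typed <script> tags."""

    JSON_TYPES = {"application/ld+json", "application/json"}

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._buffer = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        # Start buffering when a script tag declares a JSON content type.
        if tag == "script" and dict(attrs).get("type") in self.JSON_TYPES:
            self._in_json_script = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            try:
                self.payloads.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed payloads instead of failing the page.


def extract_embedded_json(html: str) -> list:
    """Return every parseable JSON payload embedded in the page's script tags."""
    parser = EmbeddedJsonParser()
    parser.feed(html)
    parser.close()
    return parser.payloads


# The visible DOM is an empty <div>, yet the structured data is still recoverable.
sample = """<html><body><div id="root"></div>
<script type="application/ld+json">{"@type": "Article", "headline": "Hello"}</script>
</body></html>"""
print(extract_embedded_json(sample))
```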

One credit.
One page.

No hidden multipliers. No per-feature charges. Pick a plan, start extracting.

               FREE          STARTER        PRO (POPULAR)   SCALE
PRICE          $0/mo         $49/mo         $99/mo          $399/mo
PAGES          500/mo        10,000/mo      100,000/mo      500,000/mo
CONCURRENCY    2             10             50              100
ANTIBOT                                     500/mo          5,000/mo
JS RENDER                                   2,000/mo        10,000/mo
LLM CALLS                                   1,000/mo        10,000/mo
RESEARCH                     5/mo           25/mo           100/mo
PROXY                                       2 GB            10 GB
SUPPORT        COMMUNITY     EMAIL          PRIORITY        PRIORITY + SLACK
               GET STARTED   JOIN WAITLIST  JOIN WAITLIST   JOIN WAITLIST
DEDICATED

Unlimited pages. Unlimited research. 200 concurrent. Single-tenant on your cloud, your proxies, your rules. Dedicated Slack channel + SLA.

CONTACT US
OPEN SOURCE

Self-host forever. AGPL-3.0 license. CLI + server + MCP server. No limits on your hardware.

VIEW ON GITHUB

1 CREDIT = 1 PAGE, ALWAYS · NO HIDDEN MULTIPLIERS · OPEN SOURCE
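
Because one credit always equals one page, the effective price per page follows directly from each plan's monthly price and page allowance. The short calculation below uses the prices and page counts from the plan list above; the function name is illustrative, and the figures assume the full monthly allowance is used.

```python
# Monthly price (USD) and included pages, copied from the paid plans listed above.
PLANS = {
    "Starter": (49, 10_000),
    "Pro": (99, 100_000),
    "Scale": (399, 500_000),
}


def cost_per_1k_pages(price_usd: float, pages: int) -> float:
    """Effective USD cost per 1,000 pages when the whole allowance is used."""
    return price_usd * 1000 / pages


for name, (price, pages) in PLANS.items():
    print(f"{name}: ${cost_per_1k_pages(price, pages):.2f} per 1,000 pages")
```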

Common questions

FAQ

Webclaw is a web extraction toolkit that converts any website into clean, structured data. It supports multiple output formats — Markdown, JSON, HTML, plain text, and an LLM-optimized format that strips noise and reduces token count by up to 67%.

Webclaw uses raw HTTP requests with TLS fingerprint impersonation instead of spinning up a headless browser. This means sub-200ms response times, zero browser overhead, and no Selenium or Playwright dependency. It achieves the same results as a browser-based scraper through intelligent content extraction and readability scoring.

Yes, there is a free tier. The Free plan costs $0/mo and includes 500 pages per month, 5 output formats, sitemap discovery, and full API access. No credit card required. You can upgrade anytime if you need higher limits or advanced features like LLM extraction.

Yes, you can self-host. Webclaw is open source under the AGPL-3.0 license. You can run the CLI, REST API server, or MCP server on your own infrastructure. Docker images and one-line deploy scripts are available for quick setup.

Six formats: Markdown (clean readable text), JSON (structured with metadata), HTML (sanitized), plain text, LLM-optimized (stripped of noise for AI consumption), and raw HTML. The LLM format runs a 9-step optimization pipeline to minimize token usage.

Webclaw ships a dedicated MCP (Model Context Protocol) server binary that exposes 8 tools — scrape, crawl, map, batch, extract, summarize, diff, and brand. It works with any MCP-compatible client like Claude Desktop, Claude Code, Cursor, Windsurf, OpenCode, Codex, or Antigravity over stdio transport.

Your extracted content is never stored or logged on our servers. Requests are processed in real-time and the response is returned directly to you. If you use LLM features, content is sent to the AI provider for processing but is not retained. For full control, self-host the entire stack.

Webclaw can use language models to extract structured JSON from pages using a schema you define, answer questions about page content with prompt-based extraction, or generate summaries. It chains through local Ollama first, then falls back to cloud providers.
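
The "local first, then cloud" chaining can be pictured as an ordered fallback loop. The sketch below is a generic illustration of that pattern under stated assumptions: the provider functions are stand-ins that make no real network calls, and none of the names reflect webclaw's internals.

```python
from typing import Callable, Sequence, Tuple

Provider = Callable[[str], str]


def run_with_fallback(prompt: str, providers: Sequence[Tuple[str, Provider]]) -> Tuple[str, str]:
    """Try providers in order; return (name, result) from the first that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # Any failure moves on to the next provider.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers (hypothetical): the local one simulates an unreachable server.
def local_ollama(prompt: str) -> str:
    raise ConnectionError("Ollama is not running")


def cloud_provider(prompt: str) -> str:
    return f"summary of: {prompt}"


name, result = run_with_fallback(
    "page text",
    [("ollama", local_ollama), ("cloud", cloud_provider)],
)
print(name, result)
```

Ordering the list with the local provider first keeps content on your machine whenever possible and only reaches a cloud provider when the local call fails.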

Ready to build?

Start extracting.

Free tier. No credit card. Deploy in under a minute — or self-host forever. Open source.

Stay in the loop

Get notified when the webclaw API launches. Early subscribers get extended free tier access.