WebPeel Documentation
Fast web fetching for AI agents. Smart escalation from HTTP to headless browser with stealth mode.
Quick Start
Get up and running with WebPeel in 30 seconds.
Installation
npm install -g webpeel
pnpm add -g webpeel
yarn global add webpeel
pip install webpeel
Note: v0.1.0 is available on PyPI, but the Python SDK is significantly behind the Node.js SDK (v0.21.63). For the latest features, use the Node.js SDK or REST API.
First Fetch
# Fetch any URL
npx webpeel https://example.com
# Output as JSON
npx webpeel https://example.com --json
# Use browser rendering for JS-heavy sites
npx webpeel https://example.com --render
import { peel } from 'webpeel';
const result = await peel('https://example.com');
console.log(result.title);
console.log(result.content); // Markdown
Core Concepts
Smart Escalation
WebPeel starts with the fastest method and escalates to a slower one only when a page needs it:
- HTTP Fetch — Fast HTTP request with smart headers (~118ms)
- Browser Mode — Headless browser for JavaScript-heavy sites (~2s)
- Stealth Mode — Full anti-detection for protected sites (~5s)
// Simple fetch (HTTP)
const result1 = await peel('https://example.com');
// Force browser rendering
const result2 = await peel('https://example.com', { render: true });
// Use stealth mode for Cloudflare-protected sites
const result3 = await peel('https://example.com', { stealth: true });
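The escalation order above can be pictured as a fallback chain. The sketch below is illustrative only and is not WebPeel's internal code: `escalate`, the tier objects, and the `looksBlocked` heuristic are hypothetical names, and the three tiers are stubbed so the example runs without network access.

```javascript
// Hypothetical sketch of tiered escalation; not WebPeel's implementation.
// Each tier is { name, run(url) }, where run resolves to { status, body }.
async function escalate(url, tiers, looksBlocked) {
  const attempts = [];
  for (const tier of tiers) {
    attempts.push(tier.name);
    try {
      const page = await tier.run(url);
      // Accept the first tier whose result does not look blocked or empty.
      if (!looksBlocked(page)) {
        return { method: tier.name, attempts, page };
      }
    } catch {
      // A tier that throws falls through to the next, slower tier.
    }
  }
  throw new Error(`All tiers failed for ${url}: ${attempts.join(' -> ')}`);
}

// Assumed heuristic: non-200 responses or empty bodies count as blocked.
const looksBlocked = (page) => page.status !== 200 || page.body.trim() === '';

// Stub tiers standing in for HTTP fetch, browser mode, and stealth mode.
const stubTiers = [
  { name: 'http', run: async () => ({ status: 403, body: 'Forbidden' }) },
  { name: 'browser', run: async () => ({ status: 200, body: '<h1>Hello</h1>' }) },
  { name: 'stealth', run: async () => ({ status: 200, body: '<h1>Hello</h1>' }) },
];

escalate('https://example.com', stubTiers, looksBlocked).then((result) => {
  console.log(result.method, result.attempts);
});
```

With these stubs the HTTP tier is rejected and the browser tier is accepted, so the stealth tier never runs. Passing `render: true` or `stealth: true` to `peel` presumably skips this ladder and goes straight to the requested mode.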
Output Formats
Choose the format that works best for your use case:
// Markdown (default): clean, structured content suited to LLMs
const markdownResult = await peel('https://example.com', {
  format: 'markdown'
});
// Plain text: no formatting, just content
const textResult = await peel('https://example.com', {
  format: 'text'
});
// Raw HTML: full page source
const htmlResult = await peel('https://example.com', {
  format: 'html'
});
Content Filtering
Extract only what you need:
// Extract specific content with CSS selector
const result = await peel('https://example.com', {
  selector: 'article',
  exclude: ['.sidebar', '.ads', 'nav']
});
// Only include main content tags
const result2 = await peel('https://example.com', {
  includeTags: ['article', 'main'],
  excludeTags: ['nav', 'footer', 'aside']
});
Advanced Features
Page Actions
Interact with pages before scraping:
const result = await peel('https://example.com', {
  actions: [
    { type: 'click', selector: '.load-more' },
    { type: 'wait', ms: 2000 },
    { type: 'scroll', to: 'bottom' },
    { type: 'type', selector: '#search', value: 'query' }
  ]
});
Structured Extraction
Extract data with CSS selectors or AI:
// CSS-based extraction
const result = await peel('https://example.com/product', {
  extract: {
    selectors: {
      title: 'h1',
      price: '.price',
      rating: '.rating'
    }
  }
});
// AI-powered extraction (BYOK)
const result2 = await peel('https://example.com', {
  extract: {
    prompt: 'Extract the main features as a list',
    llmApiKey: process.env.OPENAI_API_KEY
  }
});
Crawling
Crawl entire websites:
import { crawl } from 'webpeel';
const results = await crawl('https://example.com', {
  maxPages: 50,
  maxDepth: 2,
  excludePatterns: ['/admin/', '/login']
});
results.forEach(page => {
  console.log(page.title, page.url);
});
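To see how `maxPages`, `maxDepth`, and `excludePatterns` might interact, here is a minimal breadth-first sketch over an in-memory link graph. It is not WebPeel's crawler: `crawlSketch` and `linkGraph` are hypothetical, the start URL is assumed to be depth 0, and exclude patterns are assumed to be plain substring matches.

```javascript
// Hypothetical breadth-first crawl over an in-memory link graph.
// Assumptions: start URL is depth 0; exclude patterns are substring matches.
function crawlSketch(startUrl, linkGraph, { maxPages, maxDepth, excludePatterns = [] }) {
  const isExcluded = (url) => excludePatterns.some((pattern) => url.includes(pattern));
  const visited = new Set([startUrl]);
  const queue = [{ url: startUrl, depth: 0 }];
  const pages = [];

  while (queue.length > 0 && pages.length < maxPages) {
    const { url, depth } = queue.shift();
    pages.push({ url, depth });

    if (depth >= maxDepth) continue; // do not follow links past the depth limit
    for (const link of linkGraph[url] ?? []) {
      if (visited.has(link) || isExcluded(link)) continue;
      visited.add(link);
      queue.push({ url: link, depth: depth + 1 });
    }
  }
  return pages;
}

// Toy site structure used in place of real HTTP requests.
const linkGraph = {
  'https://example.com/': ['https://example.com/docs', 'https://example.com/login'],
  'https://example.com/docs': ['https://example.com/docs/api', 'https://example.com/admin/users'],
  'https://example.com/docs/api': ['https://example.com/docs/api/v1'],
};

const crawled = crawlSketch('https://example.com/', linkGraph, {
  maxPages: 50,
  maxDepth: 2,
  excludePatterns: ['/admin/', '/login'],
});
console.log(crawled.map((page) => `${page.depth} ${page.url}`));
```

In this toy graph the login and admin URLs are skipped by the exclude patterns, and `/docs/api/v1` is never reached because it sits at depth 3.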
Change Tracking
Monitor content changes over time:
import { trackChange } from 'webpeel';
const change = await trackChange('https://example.com');
if (change.changed) {
  console.log('Content changed!');
  console.log('Added:', change.added);
  console.log('Removed:', change.removed);
}
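One way to picture the `changed`, `added`, and `removed` fields is a line-level comparison between two snapshots of a page. The sketch below is an assumption for illustration: `diffSnapshots` is a hypothetical helper, and this document does not specify how WebPeel computes changes or what shape `change.added` and `change.removed` take.

```javascript
// Hypothetical line-level diff between two content snapshots.
function diffSnapshots(previous, current) {
  // Normalise each snapshot into trimmed, non-empty lines.
  const toLines = (text) => text.split('\n').map((line) => line.trim()).filter(Boolean);
  const prevLines = toLines(previous);
  const currLines = toLines(current);
  const prevSet = new Set(prevLines);
  const currSet = new Set(currLines);

  const added = currLines.filter((line) => !prevSet.has(line));
  const removed = prevLines.filter((line) => !currSet.has(line));
  return { changed: added.length > 0 || removed.length > 0, added, removed };
}

const before = '# Pricing\nStarter: $9/month\nPro: $29/month';
const after = '# Pricing\nStarter: $12/month\nPro: $29/month\nTeam: $99/month';
const diff = diffSnapshots(before, after);
console.log(diff);
```

Because the comparison is set-based, it ignores reordered lines and repeated identical lines; a production diff would likely use a sequence-based algorithm instead.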
Web Search (DuckDuckGo + Brave BYOK)
WebPeel includes web search out of the box:
- DuckDuckGo — default, free, no key required
- Brave Search — bring your own API key for higher-quality results
# DuckDuckGo (default)
npx webpeel search "latest AI news"
# Brave Search (BYOK)
npx webpeel config set braveApiKey $BRAVE_KEY
npx webpeel search "latest AI news" --provider brave
Cited Answers (/v1/answer)
Need a single, cited answer instead of raw pages? Use the Answer endpoint (search + fetch + LLM → response with [1], [2] citations). BYOK — use your own LLM key.
npx webpeel answer "What is MCP?" \
  --llm openai \
  --llm-api-key $OPENAI_API_KEY \
  --max-sources 5
API Endpoints
WebPeel exposes 55+ endpoints across 8 feature areas:
Batch Scraping
Scrape up to 100 URLs concurrently with SSE streaming or async polling
Structured Extraction
Extract typed JSON from any page using LLM + JSON Schema (BYOK)
Research Agents
Autonomous multi-step web research with LLM synthesis
Page Monitoring
Watch URLs for content changes and receive webhook notifications
Screenshots
Capture pages at any viewport, run visual diffs, and audit designs
Search
Web, news, and image search with optional result scraping
Crawl & Map
Recursively crawl domains or map URL structure. Firecrawl compatible.
Deep Fetch
Search + fetch + synthesize multiple sources in one API call
Next Steps
API Reference
Complete REST API documentation with all endpoints and parameters
SDKs
Node.js and Python client libraries with full code examples
CLI
Command-line interface with 18+ commands for web scraping
MCP Server
Set up WebPeel for Claude, Cursor, and VS Code
New in v0.21
Features introduced in v0.21. Most need no API keys or extra dependencies; LLM-powered structured extraction is BYOK.
Structured Extraction
LLM-powered JSON extraction from any page using JSON Schema or natural language. BYOK. Also includes zero-LLM heuristic auto-extraction with confidence scoring.
YouTube Transcripts + Export
Extract transcripts from any YouTube video and export as SRT, TXT, Markdown, or JSON. No API key needed.
LLM-Free Q&A
Ask questions about any page using BM25 scoring. Free, instant, no LLM API key required.
Domain Extractors
Structured data from 32 domains including Twitter/X, Reddit, GitHub, YouTube, Amazon, and more — automatic, zero config.
Reader Mode
Strip all page noise and get clean Markdown with title, author, date, and reading time.
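The LLM-Free Q&A feature above is described as using BM25 scoring. As a rough illustration of that technique, the sketch below ranks passages of a page against a question with the standard BM25 formula. It is not WebPeel's implementation: `bm25Rank`, the tokenizer, and the parameter values `k1 = 1.2` and `b = 0.75` (common defaults) are assumptions.

```javascript
// Minimal BM25 ranking over passages; an illustration, not WebPeel's code.
const tokenize = (text) => text.toLowerCase().match(/[a-z0-9]+/g) ?? [];

function bm25Rank(question, passages, { k1 = 1.2, b = 0.75 } = {}) {
  const docs = passages.map(tokenize);
  const avgLength = docs.reduce((sum, doc) => sum + doc.length, 0) / docs.length;

  // Document frequency: number of passages that contain each term.
  const docFreq = new Map();
  for (const doc of docs) {
    for (const term of new Set(doc)) {
      docFreq.set(term, (docFreq.get(term) ?? 0) + 1);
    }
  }

  const queryTerms = [...new Set(tokenize(question))];
  return docs
    .map((doc, index) => {
      let score = 0;
      for (const term of queryTerms) {
        const tf = doc.filter((token) => token === term).length;
        if (tf === 0) continue;
        const n = docFreq.get(term);
        // IDF variant that stays non-negative even for very common terms.
        const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
        // Term frequency saturates via k1 and is normalised by passage length via b.
        const norm = tf + k1 * (1 - b + (b * doc.length) / avgLength);
        score += idf * ((tf * (k1 + 1)) / norm);
      }
      return { index, score, passage: passages[index] };
    })
    .sort((x, y) => y.score - x.score);
}

const passages = [
  'WebPeel fetches pages over HTTP and escalates to a headless browser when needed.',
  'Install the CLI globally with npm, pnpm, or yarn.',
  'Stealth mode adds anti-detection for protected sites.',
];
const ranked = bm25Rank('How do I install the CLI?', passages);
console.log(ranked[0].passage);
```

Only the second passage shares any terms with the question, so it ranks first and the other two score zero. Ranking is purely lexical, which is why no LLM key is needed, but it also means paraphrased questions with no shared words will not match.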