Documentation / Overview

WebPeel Documentation

Fast web fetching for AI agents. Smart escalation from HTTP to headless browser with stealth mode.

Quick Start

Get up and running with WebPeel in 30 seconds.

Installation

npm install -g webpeel

pnpm add -g webpeel

yarn global add webpeel

🐍 Python SDK — Alpha
pip install webpeel — v0.1.0 available on PyPI. Note: the Python SDK is significantly behind the Node.js SDK (v0.21.63). For the latest features, use the Node.js SDK or REST API.

First Fetch

# Fetch any URL
npx webpeel https://example.com

# Output as JSON
npx webpeel https://example.com --json

# Use browser rendering for JS-heavy sites
npx webpeel https://example.com --render

import { peel } from 'webpeel';

const result = await peel('https://example.com');
console.log(result.title);
console.log(result.content); // Markdown

Core Concepts

Smart Escalation

WebPeel automatically chooses the fastest method:

HTTP Fetch — Fast HTTP request with smart headers (~118ms)
Browser Mode — Headless browser for JavaScript-heavy sites (~2s)
Stealth Mode — Full anti-detection for protected sites (~5s)

// Simple fetch (HTTP)
const result1 = await peel('https://example.com');

// Force browser rendering
const result2 = await peel('https://example.com', { render: true });

// Use stealth mode for Cloudflare-protected sites
const result3 = await peel('https://example.com', { stealth: true });

Output Formats

Choose the format that works best for your use case:

const result = await peel('https://example.com', {
  format: 'markdown' // Default
});
// Perfect for LLMs — clean, structured content

const result = await peel('https://example.com', {
  format: 'text'
});
// Plain text — no formatting, just content

const result = await peel('https://example.com', {
  format: 'html'
});
// Raw HTML — full page source

Content Filtering

Extract only what you need:

// Extract specific content with CSS selector
const result = await peel('https://example.com', {
  selector: 'article',
  exclude: ['.sidebar', '.ads', 'nav']
});

// Only include main content tags
const result2 = await peel('https://example.com', {
  includeTags: ['article', 'main'],
  excludeTags: ['nav', 'footer', 'aside']
});

Advanced Features

Page Actions

Interact with pages before scraping:

const result = await peel('https://example.com', {
  actions: [
    { type: 'click', selector: '.load-more' },
    { type: 'wait', ms: 2000 },
    { type: 'scroll', to: 'bottom' },
    { type: 'type', selector: '#search', value: 'query' }
  ]
});

Structured Extraction

Extract data with CSS selectors or AI:

// CSS-based extraction
const result = await peel('https://example.com/product', {
  extract: {
    selectors: {
      title: 'h1',
      price: '.price',
      rating: '.rating'
    }
  }
});

// AI-powered extraction (BYOK)
const result2 = await peel('https://example.com', {
  extract: {
    prompt: 'Extract the main features as a list',
    llmApiKey: process.env.OPENAI_API_KEY
  }
});

Crawling

Crawl entire websites:

import { crawl } from 'webpeel';

const results = await crawl('https://example.com', {
  maxPages: 50,
  maxDepth: 2,
  excludePatterns: ['/admin/', '/login']
});

results.forEach(page => {
  console.log(page.title, page.url);
});

Change Tracking

Monitor content changes over time:

import { trackChange } from 'webpeel';

const change = await trackChange('https://example.com');
if (change.changed) {
  console.log('Content changed!');
  console.log('Added:', change.added);
  console.log('Removed:', change.removed);
}

Web Search (DuckDuckGo + Brave BYOK)

WebPeel includes web search out of the box:

DuckDuckGo — default, free, no key required
Brave Search — bring your own API key for higher-quality results

# DuckDuckGo (default)
npx webpeel search "latest AI news"

# Brave Search (BYOK)
npx webpeel config set braveApiKey $BRAVE_KEY
npx webpeel search "latest AI news" --provider brave

Cited Answers (`/v1/answer`)

Need a single, cited answer instead of raw pages? Use the Answer endpoint (search + fetch + LLM → response with [1], [2] citations). BYOK — use your own LLM key.

npx webpeel answer "What is MCP?" \
  --llm openai \
  --llm-api-key $OPENAI_API_KEY \
  --max-sources 5

API Endpoints

WebPeel exposes 55+ endpoints across 8 feature areas:

📦

Batch Scraping

Scrape up to 100 URLs concurrently with SSE streaming or async polling

🔍

Structured Extraction

Extract typed JSON from any page using LLM + JSON Schema (BYOK)

🤖

Research Agents

Autonomous multi-step web research with LLM synthesis

👀

Page Monitoring

Watch URLs for content changes and receive webhook notifications

📸

Screenshots

Capture pages at any viewport, run visual diffs, and audit designs

🔎

Search

Web, news, and image search with optional result scraping

🕷️

Crawl & Map

Recursively crawl domains or map URL structure. Firecrawl compatible.

⚡

Deep Fetch

Search + fetch + synthesize multiple sources in one API call

Next Steps

📖

API Reference

Complete REST API documentation with all endpoints and parameters

⚡

SDKs

Node.js and Python client libraries with full code examples

💻

CLI

Command-line interface with 18+ commands for web scraping

🤖

MCP Server

Set up WebPeel for Claude, Cursor, and VS Code

New in v0.21

Unique features that no other web understanding engine has. No API keys, no extra dependencies.

🧩

Structured Extraction

LLM-powered JSON extraction from any page using JSON Schema or natural language. BYOK. Also includes zero-LLM heuristic auto-extraction with confidence scoring.

▶️