CLI Reference
Command-line interface for WebPeel. 20+ commands for web scraping, searching, crawling, and hotel search.
Installation
npm install -g webpeel
Quick Start
# Fetch a URL
npx webpeel https://example.com
# Output as JSON
npx webpeel https://example.com --json
# Use browser rendering
npx webpeel https://example.com --render
# Search the web (DuckDuckGo default)
npx webpeel search "AI news"
# Search with Brave (BYOK)
npx webpeel search "AI news" --provider brave --search-api-key $BRAVE_KEY
# Get a cited answer (BYOK)
npx webpeel answer "What is MCP?" --llm openai --llm-api-key $OPENAI_API_KEY
# v0.10.0: CSS schema extraction (auto-detected by domain)
npx webpeel https://www.amazon.com/s?k=keyboard --json
npx webpeel https://news.ycombinator.com --schema hn --json
npx webpeel --list-schemas
# v0.10.0: LLM extraction (BYOK)
npx webpeel https://example.com/product --llm-extract "title, price, rating" --json
# v0.10.0: Hotel search
npx webpeel hotels "Paris" --checkin 2026-03-01 --checkout 2026-03-05 --sort price
# v0.10.0: Browser profiles
npx webpeel profile create myprofile
npx webpeel https://example.com --profile myprofile --stealth
Global Options
| Option | Description |
|---|---|
| -s, --silent | Silent mode (no spinner, no progress) |
| --json | Output as JSON |
| -h, --help | Show help |
| -V, --version | Show version number |
Commands
webpeel <url>
Fetch and extract content from a URL.
# Basic usage
npx webpeel https://example.com
# Output formats
npx webpeel https://example.com --html
npx webpeel https://example.com --text
npx webpeel https://example.com --json
# Browser rendering
npx webpeel https://example.com --render
npx webpeel https://example.com --stealth
npx webpeel https://example.com --wait 5000
# Content filtering
npx webpeel https://example.com --selector "article"
npx webpeel https://example.com --exclude ".sidebar" ".ads"
npx webpeel https://example.com --only-main-content
# Advanced options
npx webpeel https://example.com --screenshot
npx webpeel https://example.com --screenshot screenshot.png
npx webpeel https://example.com --full-page
npx webpeel https://example.com --max-tokens 5000
npx webpeel https://example.com --cache 1h
# v0.9.0: Agent mode (JSON + budget + extraction + silent)
npx webpeel https://example.com --agent
# v0.9.0: Auto-extract listings (product cards, search results)
npx webpeel https://example.com --extract-all
npx webpeel https://example.com --extract-all --table
npx webpeel https://example.com --extract-all --csv
# v0.9.0: Token budget
npx webpeel https://example.com --budget 4000
# v0.10.0: CSS schema extraction
npx webpeel https://www.amazon.com/s?k=laptop --schema amazon --json
npx webpeel --list-schemas
# v0.10.0: LLM extraction (BYOK — uses OPENAI_API_KEY / WEBPEEL_LLM_BASE_URL)
npx webpeel https://example.com/product --llm-extract "title, price, in-stock" --json
npx webpeel https://example.com --llm-extract --json
# v0.10.0: Browser profile
npx webpeel https://example.com --profile myprofile
Options
| Option | Type | Description |
|---|---|---|
| --html | boolean | Output raw HTML |
| --text | boolean | Output plain text |
| -r, --render | boolean | Use headless browser |
| --stealth | boolean | Use stealth mode |
| -w, --wait <ms> | number | Wait time after page load (ms) |
| --selector <css> | string | CSS selector to extract |
| --exclude <selectors...> | string[] | CSS selectors to exclude |
| --include-tags <tags> | string | Comma-separated tags to include |
| --exclude-tags <tags> | string | Comma-separated tags to exclude |
| --only-main-content | boolean | Extract only main/article content |
| --screenshot [path] | string | Capture screenshot (optional path) |
| --full-page | boolean | Full-page screenshot |
| --links | boolean | Output only links |
| --images | boolean | Output image URLs |
| --meta | boolean | Output only metadata |
| --max-tokens <n> | number | Max token count (truncate) |
| --cache <ttl> | string | Cache TTL (e.g., "5m", "1h", "1d") |
| --location <country> | string | ISO country code |
| --language <lang> | string | Language preference |
| --agent | boolean | Agent mode: sets --json, --silent, --extract-all, --budget 4000 (v0.9.0) |
| --extract-all | boolean | Auto-detect and extract repeated listings (product cards, search results) (v0.9.0) |
| --budget <n> | number | Smart token budget — distill content to fit within N tokens (v0.9.0) |
| --table | boolean | Output extracted listings as a formatted table (v0.9.0) |
| --csv | boolean | Output extracted listings as CSV (v0.9.0) |
| --pages <n> | number | Multi-page pagination (1-10) (v0.9.0) |
| --scroll-extract [n] | number | Infinite scroll extraction (v0.9.0) |
| --schema <name> | string | Apply a CSS extraction schema (e.g. amazon, booking, ebay, yelp, walmart, hn). Auto-detected by domain if omitted. (v0.10.0) |
| --list-schemas | boolean | List all bundled CSS extraction schemas and exit (v0.10.0) |
| --llm-extract [instruction] | string | Extract structured data using an LLM (BYOK). Optional natural-language instruction, e.g. "title, price, rating". Uses WEBPEEL_LLM_BASE_URL / OPENAI_API_KEY. (v0.10.0) |
| --profile <name> | string | Load a saved browser profile (cookies, localStorage, auth). Create profiles with webpeel profile create. (v0.10.0) |
webpeel search <query>
Search the web using DuckDuckGo (default, free) or Brave Search (BYOK). New in v0.9.0: use --site to search specific sites with auto-extraction.
# Basic search (DuckDuckGo)
npx webpeel search "AI news"
# Use Brave Search (BYOK)
npx webpeel search "AI news" --provider brave --search-api-key $BRAVE_KEY
# Or set once in config
npx webpeel config set braveApiKey $BRAVE_KEY
npx webpeel search "AI news" --provider brave
# Limit results
npx webpeel search "python tutorials" -n 10
# JSON output
npx webpeel search "web scraping" --json
# v0.9.0: Site-specific search (no URL knowledge needed!)
npx webpeel search --site ebay "charizard card"
npx webpeel search --site amazon "mechanical keyboard" --json
npx webpeel search --site walmart "ps5" --table
npx webpeel search --site github "web scraper typescript"
# List all supported sites
npx webpeel sites
Options
| Option | Description |
|---|---|
| -n, --count <n> | Number of results (1-10, default: 5) |
| --provider <provider> | Search provider: duckduckgo (default) or brave |
| --search-api-key <key> | Brave Search API key (or env WEBPEEL_BRAVE_API_KEY) |
| --site <site> | Search a specific site (e.g. ebay, amazon, github). Run webpeel sites for the full list. (v0.9.0) |
| --json | Output as JSON |
| --table | Output site-search results as a formatted table (v0.9.0) |
| --csv | Output site-search results as CSV (v0.9.0) |
| --profile <name> | Load a saved browser profile for site-aware searches (v0.10.0) |
| -s, --silent | Silent mode |
webpeel sites (v0.9.0)
List all supported sites for site-aware search. 27 sites across 7 categories: shopping, general, social, tech, jobs, real-estate, food.
# List all sites
npx webpeel sites
# Filter by category
npx webpeel sites --category shopping
# JSON output
npx webpeel sites --json
webpeel answer <question>
Ask a question, search the web, fetch sources, and get an AI-generated answer with citations. BYOK — bring your own LLM API key.
# Answer with citations (DuckDuckGo search by default)
npx webpeel answer "What is MCP?" --llm openai --llm-api-key $OPENAI_API_KEY
# Use Brave Search for higher-quality results (BYOK)
npx webpeel answer "What is MCP?" \
--provider brave \
--search-api-key $BRAVE_KEY \
--llm anthropic \
--llm-api-key $ANTHROPIC_API_KEY \
--max-sources 5
# JSON output
npx webpeel answer "Compare DuckDuckGo vs Brave Search" \
--llm openai --llm-api-key $OPENAI_API_KEY \
--json
Options
| Option | Description |
|---|---|
| --provider <provider> | Search provider: duckduckgo (default) or brave |
| --search-api-key <key> | Brave Search API key (or env WEBPEEL_BRAVE_API_KEY) |
| --llm <provider> | LLM provider: openai, anthropic, or google |
| --llm-api-key <key> | LLM API key (or env OPENAI_API_KEY / ANTHROPIC_API_KEY / GOOGLE_API_KEY) |
| --llm-model <model> | Optional LLM model name (provider-specific) |
| --max-sources <n> | Maximum sources to fetch (1-10, default: 5) |
| --json | Output as JSON |
| -s, --silent | Silent mode |
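For a backlog of questions, answer composes naturally with a shell loop. A small sketch (the answer invocation itself is left commented so the snippet runs without a network connection or API key):

```shell
# Answer each question in a file, one per line.
printf '%s\n' "What is MCP?" "What does BYOK mean?" > questions.txt
while IFS= read -r q; do
  echo "Q: $q"
  # npx webpeel answer "$q" --llm openai --llm-api-key "$OPENAI_API_KEY" --json
done < questions.txt
```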
webpeel batch [file]
Fetch multiple URLs from file or stdin.
# From file
npx webpeel batch urls.txt
# From stdin
cat urls.txt | npx webpeel batch
# With concurrency
npx webpeel batch urls.txt -c 5
# Save to directory
npx webpeel batch urls.txt -o output/
# JSON output
npx webpeel batch urls.txt --json
Options
| Option | Description |
|---|---|
| -c, --concurrency <n> | Max concurrent fetches (default: 3) |
| -o, --output <dir> | Output directory (one file per URL) |
| --json | Output as JSON array |
| -r, --render | Use browser rendering for all URLs |
| --selector <css> | CSS selector to extract |
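When the URL list is generated rather than hand-written, a short shell sketch can feed batch (the URLs here are placeholders, and the final npx call is commented so the sketch runs offline):

```shell
# Build a list of paginated URLs, one per line, for a single batch run.
: > urls.txt
for page in 1 2 3; do
  echo "https://example.com/blog?page=${page}" >> urls.txt
done
wc -l < urls.txt
# Then hand the list to batch (uncomment to actually fetch):
# npx webpeel batch urls.txt -c 3 -o output/ --json
```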
webpeel crawl <url>
Crawl a website recursively.
# Basic crawl
npx webpeel crawl https://example.com
# Limit pages and depth
npx webpeel crawl https://example.com --max-pages 100 --max-depth 3
# Exclude patterns
npx webpeel crawl https://example.com --exclude "/admin/" "/login"
# Domain restrictions
npx webpeel crawl https://example.com --allowed-domains "example.com" "docs.example.com"
# Ignore robots.txt
npx webpeel crawl https://example.com --ignore-robots
# With browser rendering
npx webpeel crawl https://example.com --render --stealth
Options
| Option | Description |
|---|---|
| --max-pages <n> | Max pages to crawl (default: 10, max: 100) |
| --max-depth <n> | Max depth (default: 2, max: 5) |
| --allowed-domains <domains...> | Only crawl these domains |
| --exclude <patterns...> | Exclude URL patterns (regex) |
| --ignore-robots | Ignore robots.txt |
| --rate-limit <ms> | Rate limit (default: 1000ms) |
| -r, --render | Use browser rendering |
| --stealth | Use stealth mode |
webpeel map <url>
Discover all URLs on a domain.
# Map a domain
npx webpeel map https://example.com
# Skip sitemap
npx webpeel map https://example.com --no-sitemap
# Skip crawl
npx webpeel map https://example.com --no-crawl
# Limit results
npx webpeel map https://example.com --max 1000
# Filter patterns
npx webpeel map https://example.com --include "/docs/" --exclude "/api/"
Options
| Option | Description |
|---|---|
| --no-sitemap | Skip sitemap.xml discovery |
| --no-crawl | Skip homepage crawl |
| --max <n> | Max URLs (default: 5000) |
| --include <patterns...> | Include URL patterns (regex) |
| --exclude <patterns...> | Exclude URL patterns (regex) |
webpeel login
Authenticate CLI with API key.
npx webpeel login
# Prompts for API key, saves to ~/.webpeel/config.json
webpeel logout
Clear saved credentials.
npx webpeel logout
webpeel whoami
Show authentication status.
npx webpeel whoami
# Logged in with API key: sk-1234...abcd
# Plan: Pro
# Config: ~/.webpeel/config.json
webpeel usage
Show usage and quota.
npx webpeel usage
# Free: 45/125 fetches this week (36%)
# Resets: Mon Feb 17 2026 00:00:00 EST
webpeel config [action] [key] [value]
View or update CLI configuration (stored in ~/.webpeel/config.json).
# Show overview
npx webpeel config
# Get specific key
npx webpeel config get braveApiKey
# Set Brave Search API key (BYOK)
npx webpeel config set braveApiKey $BRAVE_KEY
webpeel cache <action>
Manage local response cache.
# Show cache stats
npx webpeel cache stats
# Clear expired entries
npx webpeel cache clear
# Purge all entries
npx webpeel cache purge
webpeel serve
Start API server.
# Default port 3000
npx webpeel serve
# Custom port
npx webpeel serve -p 8080
webpeel mcp
Start MCP server for Claude/Cursor.
npx webpeel mcp
# Runs on stdio for Claude Desktop / Cursor integration
webpeel research <query>
Multi-source deep research with BM25 relevance ranking. No LLM key is needed for sources mode (--format sources).
# Get ranked sources with relevance scores
npx webpeel research "best web scraping tools 2025" --max-sources 5
# Full synthesis with LLM
npx webpeel research "compare Firecrawl vs Crawl4AI" --llm-key $OPENAI_API_KEY
# JSON output
npx webpeel research "AI trends" --format sources --json
| Flag | Description | Default |
|---|---|---|
| --max-sources | Max sources to fetch | 5 |
| --max-depth | Link-following depth (1 = no follow) | 2 |
| --format | report (LLM synthesis) or sources (raw) | report |
| --timeout | Total timeout in ms | 60000 |
| --llm-key | OpenAI API key for synthesis | — |
| --llm-model | Model for synthesis | gpt-4o-mini |
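With --format sources --json, the ranked source list can be post-processed with jq. The field names below (sources, url, score) are assumptions about the output shape, so this sketch runs against a stand-in file rather than a live call:

```shell
# Stand-in for `webpeel research ... --format sources --json` output
# (field names here are assumed, not guaranteed).
cat > research.json <<'EOF'
{"sources": [
  {"url": "https://a.example/post",  "score": 7.2},
  {"url": "https://b.example/guide", "score": 9.8}
]}
EOF
# Pick the single most relevant source by score:
jq -r '.sources | max_by(.score) | .url' research.json
```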
Token Efficiency Flags
Save 15-77% on AI tokens. These flags work with any fetch command.
# BM25 query-focused filtering — keep only relevant content
npx webpeel https://en.wikipedia.org/wiki/AI --focus "machine learning"
# Smart chunking — split into LLM-friendly pieces
npx webpeel https://example.com --chunk 2000 --chunk-strategy semantic
# Disable content pruning (on by default)
npx webpeel https://example.com --full-content
# Combined: prune → focus → budget = max savings
npx webpeel https://example.com --focus "pricing" --budget 3000
| Flag | Description | Default |
|---|---|---|
| --focus <query> | BM25 query filter — keeps relevant blocks only | — |
| --chunk <tokens> | Split output into chunks of N tokens | — |
| --chunk-strategy | semantic, fixed, or paragraph | semantic |
| --chunk-overlap <n> | Overlap tokens between chunks | 0 |
| --full-content | Disable content pruning | pruning ON |
| --budget <tokens> | Hard token cap on output | — |
Infinite Scroll
Automatically scroll to load all lazy content before extracting.
# Smart auto-scroll (detects when page stops growing)
npx webpeel https://example.com --scroll-extract
# Fixed number of scrolls
npx webpeel https://example.com --scroll-extract 10
# With timeout
npx webpeel https://example.com --scroll-extract --scroll-extract-timeout 15000
webpeel agent <prompt>
Run autonomous web research agent (BYOK LLM).
# Basic agent
npx webpeel agent "Find the top 5 AI coding tools" --llm-key $OPENAI_API_KEY
# With schema
npx webpeel agent "Compare AI tools" \
--llm-key $OPENAI_API_KEY \
--schema '{"type":"array","items":{"properties":{"name":{"type":"string"}}}}'
# With starting URLs
npx webpeel agent "Extract pricing" \
--urls https://example.com,https://example.org \
--llm-key $OPENAI_API_KEY
webpeel track <url>
Track content changes.
npx webpeel track https://example.com
# Returns fingerprint for change detection
webpeel summarize <url>
Generate AI summary.
npx webpeel summarize https://example.com --llm-key $OPENAI_API_KEY
# Custom prompt
npx webpeel summarize https://example.com \
--llm-key $OPENAI_API_KEY \
--prompt "Summarize in 3 bullet points"
webpeel brand <url>
Extract branding and design system.
npx webpeel brand https://example.com --json
webpeel jobs
List active async jobs.
npx webpeel jobs --json
webpeel job <id>
Get job status.
npx webpeel job crawl_abc123 --json
webpeel profile <action> [name] (v0.10.0)
Manage persistent browser profiles. Profiles save cookies, localStorage, and auth state so you can reuse logged-in sessions across requests.
# Create a new profile
npx webpeel profile create myprofile
# List all saved profiles
npx webpeel profile list
# Show details for a specific profile
npx webpeel profile show myprofile
# Delete a profile
npx webpeel profile delete myprofile
Use a profile when fetching:
# Fetch using a saved profile (cookies + auth preserved)
npx webpeel https://example.com --profile myprofile --stealth
# Search using a profile
npx webpeel search --site amazon "laptop stand" --profile myprofile --json
Subcommands
| Subcommand | Description |
|---|---|
| create <name> | Create and initialize a new browser profile |
| list | List all saved profiles |
| show <name> | Show profile metadata (created date, domain list, cookie count) |
| delete <name> | Delete a saved profile |
webpeel hotels <destination> (v0.10.0)
Search for hotels across multiple sources in parallel. Results are merged, deduplicated, and sorted. Expedia is supported thanks to Stealth v2.
# Basic hotel search
npx webpeel hotels "Paris"
# With check-in and check-out dates
npx webpeel hotels "New York" --checkin 2026-04-10 --checkout 2026-04-14
# Sort by price (cheapest first)
npx webpeel hotels "Tokyo" --checkin 2026-05-01 --checkout 2026-05-05 --sort price
# Sort by rating
npx webpeel hotels "London" --checkin 2026-06-01 --checkout 2026-06-03 --sort rating
# JSON output for AI agents
npx webpeel hotels "Barcelona" --checkin 2026-07-01 --checkout 2026-07-07 --json
# Limit results
npx webpeel hotels "Berlin" --checkin 2026-08-01 -n 10 --json
Options
| Option | Description |
|---|---|
| --checkin <date> | Check-in date in YYYY-MM-DD format |
| --checkout <date> | Check-out date in YYYY-MM-DD format |
| --sort <field> | Sort results: price, rating, or relevance (default) |
| -n, --count <n> | Max results to return (default: 10) |
| --json | Output as JSON array |
| --table | Output as formatted table |
| -s, --silent | Silent mode (no spinner) |
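--sort covers price, rating, and relevance server-side; for anything else, the --json output can be re-sorted locally with jq. The field names below (name, price, rating) are assumptions about the JSON shape, so the sketch uses a stand-in file:

```shell
# Stand-in for `webpeel hotels ... --json` output (field names are assumed).
cat > hotels.json <<'EOF'
[
  {"name": "Hotel A", "price": 180, "rating": 4.1},
  {"name": "Hotel B", "price": 95,  "rating": 4.6},
  {"name": "Hotel C", "price": 140, "rating": 3.9}
]
EOF
# Keep well-rated hotels only, cheapest first:
jq -r 'map(select(.rating >= 4.0)) | sort_by(.price)
       | .[] | "\(.name)\t$\(.price)\t\(.rating)"' hotels.json
```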
Environment Variables
| Variable | Description |
|---|---|
| WEBPEEL_API_KEY | API key for authentication |
| OPENAI_API_KEY | OpenAI API key for LLM features |
| ANTHROPIC_API_KEY | Anthropic API key for LLM features |
| GOOGLE_API_KEY | Google API key for Gemini / LLM features |
| WEBPEEL_BRAVE_API_KEY | Brave Search API key for webpeel search / webpeel answer |
| WEBPEEL_LLM_MODEL | LLM model (default: gpt-4o-mini) |
| WEBPEEL_LLM_BASE_URL | LLM API base URL |
| WEBPEEL_API_URL | WebPeel API URL (self-hosted) |
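A typical BYOK shell setup might look like this (all key values are placeholders; the commented base URL is only an example of pointing at an OpenAI-compatible endpoint):

```shell
# BYOK environment setup for a shell session.
export WEBPEEL_BRAVE_API_KEY="BSA-your-key"   # enables --provider brave
export OPENAI_API_KEY="sk-your-key"           # enables answer, summarize, --llm-extract
export WEBPEEL_LLM_MODEL="gpt-4o-mini"        # optional; this is the default
# Optionally route LLM calls elsewhere:
# export WEBPEEL_LLM_BASE_URL="http://localhost:8000/v1"
```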
Examples
Extract Product Data
npx webpeel https://example.com/product \
--extract '{"title": "h1", "price": ".price", "rating": ".stars"}' \
--json
Monitor Price Changes
#!/bin/bash
URL="https://example.com/product"
# First run: record the content fingerprint
FINGERPRINT=$(npx webpeel "$URL" --json | jq -r '.fingerprint')
echo "$FINGERPRINT" > fingerprint.txt
# Later runs: compare against the stored fingerprint
NEW_FP=$(npx webpeel "$URL" --json | jq -r '.fingerprint')
OLD_FP=$(cat fingerprint.txt)
if [ "$NEW_FP" != "$OLD_FP" ]; then
echo "Price changed!"
fi
Batch Download Documentation
# Discover all docs URLs
npx webpeel map https://docs.example.com \
--include "/docs/" \
--json > urls.json
# Extract URLs
cat urls.json | jq -r '.urls[]' > urls.txt
# Batch download
npx webpeel batch urls.txt -o docs/ --selector "article"
MCP Server
WebPeel exposes an MCP (Model Context Protocol) server so AI assistants like Claude, Cursor, and Windsurf can fetch, search, and crawl the web on your behalf.
Remote URL — No Install Needed
The fastest way to connect. Paste this into your AI client's MCP config — no npm install required:
{
"mcpServers": {
"webpeel": {
"url": "https://api.webpeel.dev/v2/mcp",
"headers": {
"Authorization": "Bearer YOUR-API-KEY"
}
}
}
}
Or use the Firecrawl-style key-in-URL format (works with clients that don't support custom headers):
{
"mcpServers": {
"webpeel": {
"url": "https://api.webpeel.dev/YOUR-API-KEY/v2/mcp"
}
}
}
One-Click Install for Cursor
Cursor supports one-click MCP install links. After installing, replace YOUR-API-KEY in Cursor's MCP settings with your actual key from app.webpeel.dev.
Claude Desktop Config
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"webpeel": {
"command": "npx",
"args": ["-y", "webpeel", "--mcp"],
"env": {
"WEBPEEL_API_KEY": "YOUR-API-KEY"
}
}
}
}
Windsurf Config
{
"mcpServers": {
"webpeel": {
"command": "npx",
"args": ["-y", "webpeel", "--mcp"]
}
}
}
Set your API key via npx webpeel login or the WEBPEEL_API_KEY environment variable.
Available MCP Tools
- webpeel_fetch — Fetch any URL → clean Markdown
- webpeel_search — Search the web, get titles/URLs/snippets
- webpeel_crawl — Crawl a site following links
- webpeel_map — Discover all URLs on a domain
- webpeel_extract — Extract structured data via CSS selectors or AI
- webpeel_batch — Fetch multiple URLs concurrently
- webpeel_agent — Autonomous web research agent (BYOK)