CLI Reference

Command-line interface for WebPeel. 20+ commands for web scraping, searching, crawling, and hotel search.

Installation

npm install -g webpeel

Quick Start

# Fetch a URL
npx webpeel https://example.com

# Output as JSON
npx webpeel https://example.com --json

# Use browser rendering
npx webpeel https://example.com --render

# Search the web (DuckDuckGo default)
npx webpeel search "AI news"

# Search with Brave (BYOK)
npx webpeel search "AI news" --provider brave --search-api-key $BRAVE_KEY

# Get a cited answer (BYOK)
npx webpeel answer "What is MCP?" --llm openai --llm-api-key $OPENAI_API_KEY

# v0.10.0: CSS schema extraction (auto-detected by domain)
npx webpeel "https://www.amazon.com/s?k=keyboard" --json
npx webpeel https://news.ycombinator.com --schema hn --json
npx webpeel --list-schemas

# v0.10.0: LLM extraction (BYOK)
npx webpeel https://example.com/product --llm-extract "title, price, rating" --json

# v0.10.0: Hotel search
npx webpeel hotels "Paris" --checkin 2026-03-01 --checkout 2026-03-05 --sort price

# v0.10.0: Browser profiles
npx webpeel profile create myprofile
npx webpeel https://example.com --profile myprofile --stealth

Global Options

Option Description
-s, --silent Silent mode (no spinner, no progress)
--json Output as JSON
-h, --help Show help
-V, --version Show version number

Commands

webpeel <url>

Fetch and extract content from a URL.

# Basic usage
npx webpeel https://example.com

# Output formats
npx webpeel https://example.com --html
npx webpeel https://example.com --text
npx webpeel https://example.com --json

# Browser rendering
npx webpeel https://example.com --render
npx webpeel https://example.com --stealth
npx webpeel https://example.com --wait 5000

# Content filtering
npx webpeel https://example.com --selector "article"
npx webpeel https://example.com --exclude ".sidebar" ".ads"
npx webpeel https://example.com --only-main-content

# Advanced options
npx webpeel https://example.com --screenshot
npx webpeel https://example.com --screenshot screenshot.png
npx webpeel https://example.com --full-page
npx webpeel https://example.com --max-tokens 5000
npx webpeel https://example.com --cache 1h

# v0.9.0: Agent mode (JSON + budget + extraction + silent)
npx webpeel https://example.com --agent

# v0.9.0: Auto-extract listings (product cards, search results)
npx webpeel https://example.com --extract-all
npx webpeel https://example.com --extract-all --table
npx webpeel https://example.com --extract-all --csv

# v0.9.0: Token budget
npx webpeel https://example.com --budget 4000

# v0.10.0: CSS schema extraction
npx webpeel "https://www.amazon.com/s?k=laptop" --schema amazon --json
npx webpeel --list-schemas

# v0.10.0: LLM extraction (BYOK — uses OPENAI_API_KEY / WEBPEEL_LLM_BASE_URL)
npx webpeel https://example.com/product --llm-extract "title, price, in-stock" --json
npx webpeel https://example.com --llm-extract --json

# v0.10.0: Browser profile
npx webpeel https://example.com --profile myprofile

Options

Option Type Description
--html boolean Output raw HTML
--text boolean Output plain text
-r, --render boolean Use headless browser
--stealth boolean Use stealth mode
-w, --wait <ms> number Wait time after page load (ms)
--selector <css> string CSS selector to extract
--exclude <selectors...> string[] CSS selectors to exclude
--include-tags <tags> string Comma-separated tags to include
--exclude-tags <tags> string Comma-separated tags to exclude
--only-main-content boolean Extract only main/article content
--screenshot [path] string Capture screenshot (optional path)
--full-page boolean Full-page screenshot
--links boolean Output only links
--images boolean Output image URLs
--meta boolean Output only metadata
--max-tokens <n> number Max token count (truncate)
--cache <ttl> string Cache TTL (e.g., "5m", "1h", "1d")
--location <country> string ISO country code
--language <lang> string Language preference
--agent boolean Agent mode: sets --json, --silent, --extract-all, --budget 4000 (v0.9.0)
--extract-all boolean Auto-detect and extract repeated listings (product cards, search results) (v0.9.0)
--budget <n> number Smart token budget: distill content to fit within N tokens (v0.9.0)
--table boolean Output extracted listings as a formatted table (v0.9.0)
--csv boolean Output extracted listings as CSV (v0.9.0)
--pages <n> number Multi-page pagination (1-10) (v0.9.0)
--scroll-extract [n] number Infinite scroll extraction (v0.9.0)
--schema <name> string Apply a CSS extraction schema (e.g. amazon, booking, ebay, yelp, walmart, hn). Auto-detected by domain if omitted. (v0.10.0)
--list-schemas boolean List all bundled CSS extraction schemas and exit (v0.10.0)
--llm-extract [instruction] string Extract structured data using an LLM (BYOK). Optional natural-language instruction, e.g. "title, price, rating". Uses WEBPEEL_LLM_BASE_URL / OPENAI_API_KEY. (v0.10.0)
--profile <name> string Load a saved browser profile (cookies, localStorage, auth). Create profiles with webpeel profile create. (v0.10.0)

webpeel search <query>

Search the web using DuckDuckGo (default, free) or Brave Search (BYOK). New in v0.9.0: use --site to search specific sites with auto-extraction.

# Basic search (DuckDuckGo)
npx webpeel search "AI news"

# Use Brave Search (BYOK)
npx webpeel search "AI news" --provider brave --search-api-key $BRAVE_KEY

# Or set once in config
npx webpeel config set braveApiKey $BRAVE_KEY
npx webpeel search "AI news" --provider brave

# Limit results
npx webpeel search "python tutorials" -n 10

# JSON output
npx webpeel search "web scraping" --json

# v0.9.0: Site-specific search (no URL knowledge needed!)
npx webpeel search --site ebay "charizard card"
npx webpeel search --site amazon "mechanical keyboard" --json
npx webpeel search --site walmart "ps5" --table
npx webpeel search --site github "web scraper typescript"

# List all supported sites
npx webpeel sites

Options

Option Description
-n, --count <n> Number of results (1-10, default: 5)
--provider <provider> Search provider: duckduckgo (default) or brave
--search-api-key <key> Brave Search API key (or env WEBPEEL_BRAVE_API_KEY)
--site <site> Search a specific site (e.g. ebay, amazon, github). Run webpeel sites for full list. (v0.9.0)
--json Output as JSON
--table Output site-search results as a formatted table (v0.9.0)
--csv Output site-search results as CSV (v0.9.0)
--profile <name> Load a saved browser profile for site-aware searches (v0.10.0)
-s, --silent Silent mode

webpeel sites (v0.9.0)

List all supported sites for site-aware search. 27 sites across 7 categories: shopping, general, social, tech, jobs, real-estate, food.

# List all sites
npx webpeel sites

# Filter by category
npx webpeel sites --category shopping

# JSON output
npx webpeel sites --json

webpeel answer <question>

Ask a question, search the web, fetch sources, and get an AI-generated answer with citations. BYOK — bring your own LLM API key.

# Answer with citations (DuckDuckGo search by default)
npx webpeel answer "What is MCP?" --llm openai --llm-api-key $OPENAI_API_KEY

# Use Brave Search for higher-quality results (BYOK)
npx webpeel answer "What is MCP?" \
  --provider brave \
  --search-api-key $BRAVE_KEY \
  --llm anthropic \
  --llm-api-key $ANTHROPIC_API_KEY \
  --max-sources 5

# JSON output
npx webpeel answer "Compare DuckDuckGo vs Brave Search" \
  --llm openai --llm-api-key $OPENAI_API_KEY \
  --json

Options

Option Description
--provider <provider> Search provider: duckduckgo (default) or brave
--search-api-key <key> Brave Search API key (or env WEBPEEL_BRAVE_API_KEY)
--llm <provider> LLM provider: openai, anthropic, or google
--llm-api-key <key> LLM API key (or env OPENAI_API_KEY / ANTHROPIC_API_KEY / GOOGLE_API_KEY)
--llm-model <model> Optional LLM model name (provider-specific)
--max-sources <n> Maximum sources to fetch (1-10, default: 5)
--json Output as JSON
-s, --silent Silent mode

webpeel batch [file]

Fetch multiple URLs from file or stdin.

# From file
npx webpeel batch urls.txt

# From stdin
cat urls.txt | npx webpeel batch

# With concurrency
npx webpeel batch urls.txt -c 5

# Save to directory
npx webpeel batch urls.txt -o output/

# JSON output
npx webpeel batch urls.txt --json

Options

Option Description
-c, --concurrency <n> Max concurrent fetches (default: 3)
-o, --output <dir> Output directory (one file per URL)
--json Output as JSON array
-r, --render Use browser rendering for all
--selector <css> CSS selector to extract
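
Because batch reads its URL list from a file or stdin, it can help to tidy the list first. The sketch below is a hypothetical helper (clean_urls is not part of WebPeel) that trims whitespace, keeps only http(s) lines, and removes duplicates. It assumes batch expects one URL per line; whether batch itself tolerates blank or comment lines is not stated here, so the helper strips them as a precaution.

```shell
#!/bin/sh
# clean_urls: hypothetical helper, not part of WebPeel.
# Reads lines on stdin, trims surrounding whitespace, keeps only
# http(s) URLs, and drops duplicates (output is sorted).
clean_urls() {
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -E '^https?://' \
    | sort -u
}

# Example (assumes batch accepts one URL per line on stdin):
#   clean_urls < raw-urls.txt | npx webpeel batch -c 5 -o output/
```

Sorting changes the fetch order, which should not matter when each URL is written to its own output file.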

webpeel crawl <url>

Crawl a website recursively.

# Basic crawl
npx webpeel crawl https://example.com

# Limit pages and depth
npx webpeel crawl https://example.com --max-pages 100 --max-depth 3

# Exclude patterns
npx webpeel crawl https://example.com --exclude "/admin/" "/login"

# Domain restrictions
npx webpeel crawl https://example.com --allowed-domains "example.com" "docs.example.com"

# Ignore robots.txt
npx webpeel crawl https://example.com --ignore-robots

# With browser rendering
npx webpeel crawl https://example.com --render --stealth

Options

Option Description
--max-pages <n> Max pages to crawl (default: 10, max: 100)
--max-depth <n> Max depth (default: 2, max: 5)
--allowed-domains <domains...> Only crawl these domains
--exclude <patterns...> Exclude URL patterns (regex)
--ignore-robots Ignore robots.txt
--rate-limit <ms> Rate limit (default: 1000ms)
-r, --render Use browser rendering
--stealth Use stealth mode
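
The page cap and rate limit together bound how long a crawl takes. As a rough worked example, the helper below (hypothetical, not part of WebPeel) computes a lower bound in seconds. It assumes pages are requested one at a time with the --rate-limit delay between consecutive requests, and it ignores fetch and render time; the actual scheduling is not described in this reference.

```shell
#!/bin/sh
# min_crawl_seconds PAGES RATE_LIMIT_MS
# Hypothetical helper: lower bound on crawl time in whole seconds,
# assuming sequential requests separated by the rate-limit delay.
min_crawl_seconds() {
  pages=$1
  delay_ms=$2
  echo $(( (pages - 1) * delay_ms / 1000 ))
}

min_crawl_seconds 10 1000    # defaults: 10 pages, 1000 ms
min_crawl_seconds 100 1000   # maximum page count at the default rate limit
```

Under these assumptions the two calls print 9 and 99, so a maximum-size crawl at the default rate limit takes at least about a minute and a half.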

webpeel map <url>

Discover all URLs on a domain.

# Map a domain
npx webpeel map https://example.com

# Skip sitemap
npx webpeel map https://example.com --no-sitemap

# Skip crawl
npx webpeel map https://example.com --no-crawl

# Limit results
npx webpeel map https://example.com --max 1000

# Filter patterns
npx webpeel map https://example.com --include "/docs/" --exclude "/api/"

Options

Option Description
--no-sitemap Skip sitemap.xml discovery
--no-crawl Skip homepage crawl
--max <n> Max URLs (default: 5000)
--include <patterns...> Include URL patterns (regex)
--exclude <patterns...> Exclude URL patterns (regex)

webpeel login

Authenticate CLI with API key.

npx webpeel login
# Prompts for API key, saves to ~/.webpeel/config.json

webpeel logout

Clear saved credentials.

npx webpeel logout

webpeel whoami

Show authentication status.

npx webpeel whoami
# Logged in with API key: sk-1234...abcd
# Plan: Pro
# Config: ~/.webpeel/config.json

webpeel usage

Show usage and quota.

npx webpeel usage
# Free: 45/125 fetches this week (36%)
# Resets: Mon Feb 16 2026 00:00:00 EST
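
For scripts that should stop before the quota runs out, the used percentage can be derived from the counts. This is a minimal sketch that assumes the output contains a line of the form "USED/LIMIT fetches", as in the sample above; the real output format may differ, and quota_percent is a hypothetical helper.

```shell
#!/bin/sh
# quota_percent: hypothetical helper, not part of WebPeel.
# Reads usage output on stdin and prints the used percentage (integer),
# assuming a line containing "USED/LIMIT fetches" as in the sample output.
quota_percent() {
  pair=$(sed -n 's|.* \([0-9][0-9]*\)/\([0-9][0-9]*\) fetches.*|\1 \2|p' | head -n 1)
  [ -n "$pair" ] || return 1          # no matching line found
  used=${pair% *}
  limit=${pair#* }
  [ "$limit" -gt 0 ] || return 1      # avoid division by zero
  echo $(( used * 100 / limit ))
}

# Example:
#   pct=$(npx webpeel usage | quota_percent) && [ "$pct" -lt 90 ] || echo "quota nearly used up"
```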

webpeel config [action] [key] [value]

View or update CLI configuration (stored in ~/.webpeel/config.json).

# Show overview
npx webpeel config

# Get specific key
npx webpeel config get braveApiKey

# Set Brave Search API key (BYOK)
npx webpeel config set braveApiKey $BRAVE_KEY

webpeel cache <action>

Manage local response cache.

# Show cache stats
npx webpeel cache stats

# Clear expired entries
npx webpeel cache clear

# Purge all entries
npx webpeel cache purge

webpeel serve

Start API server.

# Default port 3000
npx webpeel serve

# Custom port
npx webpeel serve -p 8080

webpeel mcp

Start MCP server for Claude/Cursor.

npx webpeel mcp
# Runs on stdio for Claude Desktop / Cursor integration

webpeel research <query>

Multi-source deep research with BM25 relevance ranking. No LLM key needed for sources mode.

# Get ranked sources with relevance scores
npx webpeel research "best web scraping tools 2025" --max-sources 5

# Full synthesis with LLM
npx webpeel research "compare Firecrawl vs Crawl4AI" --llm-key $OPENAI_API_KEY

# JSON output
npx webpeel research "AI trends" --format sources --json

Flag Description
--max-sources Max sources to fetch (default: 5)
--max-depth Link-following depth, 1 = no follow (default: 2)
--format report (LLM synthesis) or sources (raw) (default: report)
--timeout Total timeout in ms (default: 60000)
--llm-key OpenAI API key for synthesis
--llm-model Model for synthesis (default: gpt-4o-mini)

Token Efficiency Flags

Save 15-77% on AI tokens. These flags work with any fetch command.

# BM25 query-focused filtering — keep only relevant content
npx webpeel https://en.wikipedia.org/wiki/AI --focus "machine learning"

# Smart chunking — split into LLM-friendly pieces
npx webpeel https://example.com --chunk 2000 --chunk-strategy semantic

# Disable content pruning (on by default)
npx webpeel https://example.com --full-content

# Combined: prune → focus → budget = max savings
npx webpeel https://example.com --focus "pricing" --budget 3000

Flag Description
--focus <query> BM25 query filter: keeps relevant blocks only
--chunk <tokens> Split output into chunks of N tokens
--chunk-strategy semantic, fixed, or paragraph (default: semantic)
--chunk-overlap <n> Overlap tokens between chunks (default: 0)
--full-content Disable content pruning (pruning is on by default)
--budget <tokens> Hard token cap on output
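
To get a feel for how --chunk and --chunk-overlap interact, here is a worked estimate of the number of chunks. It assumes a fixed-size sliding window in which each chunk after the first advances by (chunk size minus overlap) tokens; this is an assumption for illustration, not a description of WebPeel's chunker, and the semantic and paragraph strategies will generally produce different counts. chunk_count is a hypothetical helper.

```shell
#!/bin/sh
# chunk_count TOTAL SIZE OVERLAP
# Hypothetical helper: number of chunks for TOTAL tokens under an assumed
# fixed sliding window (requires OVERLAP < SIZE).
chunk_count() {
  total=$1; size=$2; overlap=$3
  if [ "$total" -le "$size" ]; then
    echo 1
    return
  fi
  step=$(( size - overlap ))
  # One full chunk, plus ceil((total - size) / step) further windows.
  echo $(( 1 + (total - size + step - 1) / step ))
}

chunk_count 10000 2000 0     # no overlap
chunk_count 10000 2000 200   # 200-token overlap
```

Under this assumption a 10,000-token page yields 5 chunks without overlap and 6 chunks with a 200-token overlap.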

Infinite Scroll

Automatically scroll to load all lazy content before extracting.

# Smart auto-scroll (detects when page stops growing)
npx webpeel https://example.com --scroll-extract

# Fixed number of scrolls
npx webpeel https://example.com --scroll-extract 10

# With timeout
npx webpeel https://example.com --scroll-extract --scroll-extract-timeout 15000

webpeel agent <prompt>

Run autonomous web research agent (BYOK LLM).

# Basic agent
npx webpeel agent "Find the top 5 AI coding tools" --llm-key $OPENAI_API_KEY

# With schema
npx webpeel agent "Compare AI tools" \
  --llm-key $OPENAI_API_KEY \
  --schema '{"type":"array","items":{"properties":{"name":{"type":"string"}}}}'

# With starting URLs
npx webpeel agent "Extract pricing" \
  --urls https://example.com,https://example.org \
  --llm-key $OPENAI_API_KEY

webpeel track <url>

Track content changes.

npx webpeel track https://example.com
# Returns fingerprint for change detection

webpeel summarize <url>

Generate AI summary.

npx webpeel summarize https://example.com --llm-key $OPENAI_API_KEY

# Custom prompt
npx webpeel summarize https://example.com \
  --llm-key $OPENAI_API_KEY \
  --prompt "Summarize in 3 bullet points"

webpeel brand <url>

Extract branding and design system.

npx webpeel brand https://example.com --json

webpeel jobs

List active async jobs.

npx webpeel jobs --json

webpeel job <id>

Get job status.

npx webpeel job crawl_abc123 --json

webpeel profile <action> [name] (v0.10.0)

Manage persistent browser profiles. Profiles save cookies, localStorage, and auth state so you can reuse logged-in sessions across requests.

# Create a new profile
npx webpeel profile create myprofile

# List all saved profiles
npx webpeel profile list

# Show details for a specific profile
npx webpeel profile show myprofile

# Delete a profile
npx webpeel profile delete myprofile

Use a profile when fetching:

# Fetch using a saved profile (cookies + auth preserved)
npx webpeel https://example.com --profile myprofile --stealth

# Search using a profile
npx webpeel search --site amazon "laptop stand" --profile myprofile --json

Subcommands

Subcommand Description
create <name> Create and initialize a new browser profile
list List all saved profiles
show <name> Show profile metadata (created date, domain list, cookie count)
delete <name> Delete a saved profile

webpeel hotels <destination> (v0.10.0)

Search for hotels across multiple sources in parallel. Results are merged, deduplicated, and sorted. Expedia is supported thanks to Stealth v2.

# Basic hotel search
npx webpeel hotels "Paris"

# With check-in and check-out dates
npx webpeel hotels "New York" --checkin 2026-04-10 --checkout 2026-04-14

# Sort by price (cheapest first)
npx webpeel hotels "Tokyo" --checkin 2026-05-01 --checkout 2026-05-05 --sort price

# Sort by rating
npx webpeel hotels "London" --checkin 2026-06-01 --checkout 2026-06-03 --sort rating

# JSON output for AI agents
npx webpeel hotels "Barcelona" --checkin 2026-07-01 --checkout 2026-07-07 --json

# Limit results
npx webpeel hotels "Berlin" --checkin 2026-08-01 -n 10 --json

Options

Option Description
--checkin <date> Check-in date in YYYY-MM-DD format
--checkout <date> Check-out date in YYYY-MM-DD format
--sort <field> Sort results: price, rating, or relevance (default)
-n, --count <n> Max results to return (default: 10)
--json Output as JSON array
--table Output as formatted table
-s, --silent Silent mode (no spinner)
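
Since both dates must be in YYYY-MM-DD format, a script can check them before calling hotels. The sketch below is a hypothetical pre-flight check (valid_stay is not part of WebPeel); it verifies the shape of each date and that checkout falls after check-in, but does not validate month or day ranges. How the CLI itself reacts to malformed dates is not covered here.

```shell
#!/bin/sh
# valid_stay CHECKIN CHECKOUT
# Hypothetical pre-flight check, not part of WebPeel. Succeeds when both
# dates look like YYYY-MM-DD and checkout is after check-in.
valid_stay() {
  for d in "$1" "$2"; do
    case $d in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) return 1 ;;
    esac
  done
  # With dashes removed, ISO dates compare correctly as integers.
  ci=$(printf '%s' "$1" | tr -d '-')
  co=$(printf '%s' "$2" | tr -d '-')
  [ "$co" -gt "$ci" ]
}

# Example:
#   valid_stay 2026-03-01 2026-03-05 \
#     && npx webpeel hotels "Paris" --checkin 2026-03-01 --checkout 2026-03-05 --sort price
```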

Environment Variables

Variable Description
WEBPEEL_API_KEY API key for authentication
OPENAI_API_KEY OpenAI API key for LLM features
ANTHROPIC_API_KEY Anthropic API key for LLM features
GOOGLE_API_KEY Google API key for Gemini / LLM features
WEBPEEL_BRAVE_API_KEY Brave Search API key for webpeel search / webpeel answer
WEBPEEL_LLM_MODEL LLM model (default: gpt-4o-mini)
WEBPEEL_LLM_BASE_URL LLM API base URL
WEBPEEL_API_URL WebPeel API URL (self-hosted)
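
Because the LLM key can come from the environment, a wrapper script can choose the --llm value from whichever key is set. This is a minimal sketch; pick_llm_provider is a hypothetical helper, and the order of preference (OpenAI, then Anthropic, then Google) is an arbitrary choice made for the example.

```shell
#!/bin/sh
# pick_llm_provider: hypothetical helper, not part of WebPeel.
# Prints the first --llm value whose API key variable is set,
# checking OpenAI, then Anthropic, then Google. Fails if none is set.
pick_llm_provider() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    echo openai
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo anthropic
  elif [ -n "${GOOGLE_API_KEY:-}" ]; then
    echo google
  else
    echo "no LLM API key set" >&2
    return 1
  fi
}

# Example:
#   provider=$(pick_llm_provider) && npx webpeel answer "What is MCP?" --llm "$provider"
```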

Examples

Extract Product Data

npx webpeel https://example.com/product \
  --extract '{"title": "h1", "price": ".price", "rating": ".stars"}' \
  --json

Monitor Price Changes

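#!/bin/bash
URL="https://example.com/product"
FP_FILE="fingerprint.txt"

# Fetch the current content fingerprint
NEW_FP=$(npx webpeel "$URL" --json | jq -r '.fingerprint')

if [ ! -f "$FP_FILE" ]; then
  # First run: nothing to compare against yet, so just store it
  echo "$NEW_FP" > "$FP_FILE"
  exit 0
fi

# Later runs: compare against the stored fingerprint
OLD_FP=$(cat "$FP_FILE")

if [ "$NEW_FP" != "$OLD_FP" ]; then
  echo "Price changed!"
  # Remember the new fingerprint for the next run
  echo "$NEW_FP" > "$FP_FILE"
fi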

Batch Download Documentation

# Discover all docs URLs
npx webpeel map https://docs.example.com \
  --include "/docs/" \
  --json > urls.json

# Extract URLs
jq -r '.urls[]' urls.json > urls.txt

# Batch download
npx webpeel batch urls.txt -o docs/ --selector "article"

MCP Server

WebPeel exposes an MCP (Model Context Protocol) server so AI assistants like Claude, Cursor, and Windsurf can fetch, search, and crawl the web on your behalf.

Remote URL — No Install Needed

The fastest way to connect. Paste this into your AI client's MCP config — no npm install required:

{
  "mcpServers": {
    "webpeel": {
      "url": "https://api.webpeel.dev/v2/mcp",
      "headers": {
        "Authorization": "Bearer YOUR-API-KEY"
      }
    }
  }
}

Or use the Firecrawl-style key-in-URL format (works with clients that don't support custom headers):

{
  "mcpServers": {
    "webpeel": {
      "url": "https://api.webpeel.dev/YOUR-API-KEY/v2/mcp"
    }
  }
}
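
To avoid pasting the key by hand, the header-based config can be generated from a variable. The sketch below is a hypothetical helper (mcp_remote_config is not part of WebPeel) that prints the same JSON with the key filled in. It assumes the key contains no characters that need JSON escaping, and it writes to stdout rather than touching any client's config file.

```shell
#!/bin/sh
# mcp_remote_config KEY
# Hypothetical helper: prints the remote-URL MCP config with the given
# API key in the Authorization header. Assumes KEY needs no JSON escaping.
mcp_remote_config() {
  cat <<EOF
{
  "mcpServers": {
    "webpeel": {
      "url": "https://api.webpeel.dev/v2/mcp",
      "headers": {
        "Authorization": "Bearer $1"
      }
    }
  }
}
EOF
}

# Example:
#   mcp_remote_config "$WEBPEEL_API_KEY" > webpeel-mcp.json
```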

One-Click Install for Cursor

➕ Add WebPeel to Cursor

After clicking, replace YOUR-API-KEY in Cursor's MCP settings with your actual key from app.webpeel.dev.

Claude Desktop Config

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "webpeel": {
      "command": "npx",
      "args": ["-y", "webpeel", "--mcp"],
      "env": {
        "WEBPEEL_API_KEY": "YOUR-API-KEY"
      }
    }
  }
}

Windsurf Config

{
  "mcpServers": {
    "webpeel": {
      "command": "npx",
      "args": ["-y", "webpeel", "--mcp"]
    }
  }
}

Set your API key via npx webpeel login or the WEBPEEL_API_KEY environment variable.

Available MCP Tools