CLI Reference
Command-line interface for WebPeel. 20+ commands for web scraping, searching, crawling, and hotel search.
Installation
npm install -g webpeel
Quick Start
# Fetch a URL
npx webpeel https://example.com
# Output as JSON
npx webpeel https://example.com --json
# Use browser rendering
npx webpeel https://example.com --render
# Search the web (DuckDuckGo default)
npx webpeel search "AI news"
# Search with Brave (BYOK)
npx webpeel search "AI news" --provider brave --search-api-key $BRAVE_KEY
# Get a cited answer (BYOK)
npx webpeel answer "What is MCP?" --llm openai --llm-api-key $OPENAI_API_KEY
# v0.10.0: CSS schema extraction (auto-detected by domain)
npx webpeel https://www.amazon.com/s?k=keyboard --json
npx webpeel https://news.ycombinator.com --schema hn --json
npx webpeel --list-schemas
# v0.10.0: LLM extraction (BYOK)
npx webpeel https://example.com/product --llm-extract "title, price, rating" --json
# v0.10.0: Hotel search
npx webpeel hotels "Paris" --checkin 2026-03-01 --checkout 2026-03-05 --sort price
# v0.10.0: Browser profiles
npx webpeel profile create myprofile
npx webpeel https://example.com --profile myprofile --stealth
Global Options
| Option | Description |
|---|---|
| -s, --silent | Silent mode (no spinner, no progress) |
| --json | Output as JSON |
| -h, --help | Show help |
| -V, --version | Show version number |
Commands
webpeel <url>
Fetch and extract content from a URL.
# Basic usage
npx webpeel https://example.com
# Output formats
npx webpeel https://example.com --html
npx webpeel https://example.com --text
npx webpeel https://example.com --json
# Browser rendering
npx webpeel https://example.com --render
npx webpeel https://example.com --stealth
npx webpeel https://example.com --wait 5000
# Content filtering
npx webpeel https://example.com --selector "article"
npx webpeel https://example.com --exclude ".sidebar" ".ads"
npx webpeel https://example.com --only-main-content
# Advanced options
npx webpeel https://example.com --screenshot
npx webpeel https://example.com --screenshot screenshot.png
npx webpeel https://example.com --full-page
npx webpeel https://example.com --max-tokens 5000
npx webpeel https://example.com --cache 1h
# v0.9.0: Agent mode (JSON + budget + extraction + silent)
npx webpeel https://example.com --agent
# v0.9.0: Auto-extract listings (product cards, search results)
npx webpeel https://example.com --extract-all
npx webpeel https://example.com --extract-all --table
npx webpeel https://example.com --extract-all --csv
# v0.9.0: Token budget
npx webpeel https://example.com --budget 4000
# v0.10.0: CSS schema extraction
npx webpeel https://www.amazon.com/s?k=laptop --schema amazon --json
npx webpeel --list-schemas
# v0.10.0: LLM extraction (BYOK — uses OPENAI_API_KEY / WEBPEEL_LLM_BASE_URL)
npx webpeel https://example.com/product --llm-extract "title, price, in-stock" --json
npx webpeel https://example.com --llm-extract --json
# v0.10.0: Browser profile
npx webpeel https://example.com --profile myprofile
Options
| Option | Type | Description |
|---|---|---|
| --html | boolean | Output raw HTML |
| --text | boolean | Output plain text |
| -r, --render | boolean | Use headless browser |
| --stealth | boolean | Use stealth mode |
| -w, --wait <ms> | number | Wait time after page load (ms) |
| --selector <css> | string | CSS selector to extract |
| --exclude <selectors...> | string[] | CSS selectors to exclude |
| --include-tags <tags> | string | Comma-separated tags to include |
| --exclude-tags <tags> | string | Comma-separated tags to exclude |
| --only-main-content | boolean | Extract only main/article content |
| --screenshot [path] | string | Capture screenshot (optional path) |
| --full-page | boolean | Full-page screenshot |
| --links | boolean | Output only links |
| --images | boolean | Output image URLs |
| --meta | boolean | Output only metadata |
| --max-tokens <n> | number | Max token count (truncate) |
| --cache <ttl> | string | Cache TTL (e.g., "5m", "1h", "1d") |
| --location <country> | string | ISO country code |
| --language <lang> | string | Language preference |
| --agent | boolean | Agent mode: sets --json, --silent, --extract-all, --budget 4000 (v0.9.0) |
| --extract-all | boolean | Auto-detect and extract repeated listings (product cards, search results) (v0.9.0) |
| --budget <n> | number | Smart token budget — distill content to fit within N tokens (v0.9.0) |
| --table | boolean | Output extracted listings as a formatted table (v0.9.0) |
| --csv | boolean | Output extracted listings as CSV (v0.9.0) |
| --pages <n> | number | Multi-page pagination (1-10) (v0.9.0) |
| --scroll-extract [n] | number | Infinite scroll extraction (v0.9.0) |
| --schema <name> | string | Apply a CSS extraction schema (e.g. amazon, booking, ebay, yelp, walmart, hn). Auto-detected by domain if omitted. (v0.10.0) |
| --list-schemas | boolean | List all bundled CSS extraction schemas and exit (v0.10.0) |
| --llm-extract [instruction] | string | Extract structured data using an LLM (BYOK). Optional natural-language instruction, e.g. "title, price, rating". Uses WEBPEEL_LLM_BASE_URL / OPENAI_API_KEY. (v0.10.0) |
| --profile <name> | string | Load a saved browser profile (cookies, localStorage, auth). Create profiles with webpeel profile create. (v0.10.0) |
webpeel search <query>
Search the web using DuckDuckGo (default, free) or Brave Search (BYOK). New in v0.9.0: use --site to search specific sites with auto-extraction.
# Basic search (DuckDuckGo)
npx webpeel search "AI news"
# Use Brave Search (BYOK)
npx webpeel search "AI news" --provider brave --search-api-key $BRAVE_KEY
# Or set once in config
npx webpeel config set braveApiKey $BRAVE_KEY
npx webpeel search "AI news" --provider brave
# Limit results
npx webpeel search "python tutorials" -n 10
# JSON output
npx webpeel search "web scraping" --json
# v0.9.0: Site-specific search (no URL knowledge needed!)
npx webpeel search --site ebay "charizard card"
npx webpeel search --site amazon "mechanical keyboard" --json
npx webpeel search --site walmart "ps5" --table
npx webpeel search --site github "web scraper typescript"
# List all supported sites
npx webpeel sites
Options
| Option | Description |
|---|---|
| -n, --count <n> | Number of results (1-10, default: 5) |
| --provider <provider> | Search provider: duckduckgo (default) or brave |
| --search-api-key <key> | Brave Search API key (or env WEBPEEL_BRAVE_API_KEY) |
| --site <site> | Search a specific site (e.g. ebay, amazon, github). Run webpeel sites for the full list. (v0.9.0) |
| --json | Output as JSON |
| --table | Output site-search results as a formatted table (v0.9.0) |
| --csv | Output site-search results as CSV (v0.9.0) |
| --profile <name> | Load a saved browser profile for site-aware searches (v0.10.0) |
| -s, --silent | Silent mode |
webpeel sites (v0.9.0)
List all supported sites for site-aware search. 27 sites across 7 categories: shopping, general, social, tech, jobs, real-estate, food.
# List all sites
npx webpeel sites
# Filter by category
npx webpeel sites --category shopping
# JSON output
npx webpeel sites --json
webpeel answer <question>
Ask a question, search the web, fetch sources, and get an AI-generated answer with citations. BYOK — bring your own LLM API key.
# Answer with citations (DuckDuckGo search by default)
npx webpeel answer "What is MCP?" --llm openai --llm-api-key $OPENAI_API_KEY
# Use Brave Search for higher-quality results (BYOK)
npx webpeel answer "What is MCP?" \
--provider brave \
--search-api-key $BRAVE_KEY \
--llm anthropic \
--llm-api-key $ANTHROPIC_API_KEY \
--max-sources 5
# JSON output
npx webpeel answer "Compare DuckDuckGo vs Brave Search" \
--llm openai --llm-api-key $OPENAI_API_KEY \
--json
Options
| Option | Description |
|---|---|
| --provider <provider> | Search provider: duckduckgo (default) or brave |
| --search-api-key <key> | Brave Search API key (or env WEBPEEL_BRAVE_API_KEY) |
| --llm <provider> | LLM provider: openai, anthropic, or google |
| --llm-api-key <key> | LLM API key (or env OPENAI_API_KEY / ANTHROPIC_API_KEY / GOOGLE_API_KEY) |
| --llm-model <model> | Optional LLM model name (provider-specific) |
| --max-sources <n> | Maximum sources to fetch (1-10, default: 5) |
| --json | Output as JSON |
| -s, --silent | Silent mode |
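For a backlog of questions, answer composes naturally with a shell loop. A small sketch (the answer invocation itself is left commented so the snippet runs without a network connection or API key):

```shell
# Answer each question in a file, one per line.
printf '%s\n' "What is MCP?" "What does BYOK mean?" > questions.txt
while IFS= read -r q; do
  echo "Q: $q"
  # npx webpeel answer "$q" --llm openai --llm-api-key "$OPENAI_API_KEY" --json
done < questions.txt
```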
webpeel batch [file]
Fetch multiple URLs from file or stdin.
# From file
npx webpeel batch urls.txt
# From stdin
cat urls.txt | npx webpeel batch
# With concurrency
npx webpeel batch urls.txt -c 5
# Save to directory
npx webpeel batch urls.txt -o output/
# JSON output
npx webpeel batch urls.txt --json
Options
| Option | Description |
|---|---|
| -c, --concurrency <n> | Max concurrent fetches (default: 3) |
| -o, --output <dir> | Output directory (one file per URL) |
| --json | Output as JSON array |
| -r, --render | Use browser rendering for all URLs |
| --selector <css> | CSS selector to extract |
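When the URL list is generated rather than hand-written, a short shell sketch can feed batch (the URLs here are placeholders, and the final npx call is commented so the sketch runs offline):

```shell
# Build a list of paginated URLs, one per line, for a single batch run.
: > urls.txt
for page in 1 2 3; do
  echo "https://example.com/blog?page=${page}" >> urls.txt
done
wc -l < urls.txt
# Then hand the list to batch (uncomment to actually fetch):
# npx webpeel batch urls.txt -c 3 -o output/ --json
```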
webpeel crawl <url>
Crawl a website recursively.
# Basic crawl
npx webpeel crawl https://example.com
# Limit pages and depth
npx webpeel crawl https://example.com --max-pages 100 --max-depth 3
# Exclude patterns
npx webpeel crawl https://example.com --exclude "/admin/" "/login"
# Domain restrictions
npx webpeel crawl https://example.com --allowed-domains "example.com" "docs.example.com"
# Ignore robots.txt
npx webpeel crawl https://example.com --ignore-robots
# With browser rendering
npx webpeel crawl https://example.com --render --stealth
Options
| Option | Description |
|---|---|
| --max-pages <n> | Max pages to crawl (default: 10, max: 100) |
| --max-depth <n> | Max depth (default: 2, max: 5) |
| --allowed-domains <domains...> | Only crawl these domains |
| --exclude <patterns...> | Exclude URL patterns (regex) |
| --ignore-robots | Ignore robots.txt |
| --rate-limit <ms> | Rate limit (default: 1000ms) |
| -r, --render | Use browser rendering |
| --stealth | Use stealth mode |
webpeel map <url>
Discover all URLs on a domain.
# Map a domain
npx webpeel map https://example.com
# Skip sitemap
npx webpeel map https://example.com --no-sitemap
# Skip crawl
npx webpeel map https://example.com --no-crawl
# Limit results
npx webpeel map https://example.com --max 1000
# Filter patterns
npx webpeel map https://example.com --include "/docs/" --exclude "/api/"
Options
| Option | Description |
|---|---|
| --no-sitemap | Skip sitemap.xml discovery |
| --no-crawl | Skip homepage crawl |
| --max <n> | Max URLs (default: 5000) |
| --include <patterns...> | Include URL patterns (regex) |
| --exclude <patterns...> | Exclude URL patterns (regex) |
webpeel login
Authenticate CLI with API key.
npx webpeel login
# Prompts for API key, saves to ~/.webpeel/config.json
webpeel logout
Clear saved credentials.
npx webpeel logout
webpeel whoami
Show authentication status.
npx webpeel whoami
# Logged in with API key: sk-1234...abcd
# Plan: Pro
# Config: ~/.webpeel/config.json
webpeel usage
Show usage and quota.
npx webpeel usage
# Free: 45/125 fetches this week (36%)
# Resets: Mon Feb 17 2026 00:00:00 EST
webpeel config [action] [key] [value]
View or update CLI configuration (stored in ~/.webpeel/config.json).
# Show overview
npx webpeel config
# Get specific key
npx webpeel config get braveApiKey
# Set Brave Search API key (BYOK)
npx webpeel config set braveApiKey $BRAVE_KEY
webpeel cache <action>
Manage local response cache.
# Show cache stats
npx webpeel cache stats
# Clear expired entries
npx webpeel cache clear
# Purge all entries
npx webpeel cache purge
webpeel serve
Start API server.
# Default port 3000
npx webpeel serve
# Custom port
npx webpeel serve -p 8080
webpeel mcp
Start MCP server for Claude/Cursor.
npx webpeel mcp
# Runs on stdio for Claude Desktop / Cursor integration
webpeel research <query>
Multi-source deep research with BM25 relevance ranking. No LLM key is needed for sources mode (--format sources).
# Get ranked sources with relevance scores
npx webpeel research "best web scraping tools 2025" --max-sources 5
# Full synthesis with LLM
npx webpeel research "compare Firecrawl vs Crawl4AI" --llm-key $OPENAI_API_KEY
# JSON output
npx webpeel research "AI trends" --format sources --json
| Flag | Description | Default |
|---|---|---|
| --max-sources | Max sources to fetch | 5 |
| --max-depth | Link-following depth (1 = no follow) | 2 |
| --format | report (LLM synthesis) or sources (raw) | report |
| --timeout | Total timeout in ms | 60000 |
| --llm-key | OpenAI API key for synthesis | — |
| --llm-model | Model for synthesis | gpt-4o-mini |
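With --format sources --json, the ranked source list can be post-processed with jq. The field names below (sources, url, score) are assumptions about the output shape, so this sketch runs against a stand-in file rather than a live call:

```shell
# Stand-in for `webpeel research ... --format sources --json` output
# (field names here are assumed, not guaranteed).
cat > research.json <<'EOF'
{"sources": [
  {"url": "https://a.example/post",  "score": 7.2},
  {"url": "https://b.example/guide", "score": 9.8}
]}
EOF
# Pick the single most relevant source by score:
jq -r '.sources | max_by(.score) | .url' research.json
```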
Token Efficiency Flags
Save 15-77% on AI tokens. These flags work with any fetch command.
# BM25 query-focused filtering — keep only relevant content
npx webpeel https://en.wikipedia.org/wiki/AI --focus "machine learning"
# Smart chunking — split into LLM-friendly pieces
npx webpeel https://example.com --chunk 2000 --chunk-strategy semantic
# Disable content pruning (on by default)
npx webpeel https://example.com --full-content
# Combined: prune → focus → budget = max savings
npx webpeel https://example.com --focus "pricing" --budget 3000
| Flag | Description | Default |
|---|---|---|
| --focus <query> | BM25 query filter — keeps relevant blocks only | — |
| --chunk <tokens> | Split output into chunks of N tokens | — |
| --chunk-strategy | semantic, fixed, or paragraph | semantic |
| --chunk-overlap <n> | Overlap tokens between chunks | 0 |
| --full-content | Disable content pruning | pruning ON |
| --budget <tokens> | Hard token cap on output | — |
Infinite Scroll
Automatically scroll to load all lazy content before extracting.
# Smart auto-scroll (detects when page stops growing)
npx webpeel https://example.com --scroll-extract
# Fixed number of scrolls
npx webpeel https://example.com --scroll-extract 10
# With timeout
npx webpeel https://example.com --scroll-extract --scroll-extract-timeout 15000
webpeel agent <prompt>
Run autonomous web research agent (BYOK LLM).
# Basic agent
npx webpeel agent "Find the top 5 AI coding tools" --llm-key $OPENAI_API_KEY
# With schema
npx webpeel agent "Compare AI tools" \
--llm-key $OPENAI_API_KEY \
--schema '{"type":"array","items":{"properties":{"name":{"type":"string"}}}}'
# With starting URLs
npx webpeel agent "Extract pricing" \
--urls https://example.com,https://example.org \
--llm-key $OPENAI_API_KEY
webpeel track <url>
Track content changes.
npx webpeel track https://example.com
# Returns fingerprint for change detection
webpeel summarize <url>
Generate AI summary.
npx webpeel summarize https://example.com --llm-key $OPENAI_API_KEY
# Custom prompt
npx webpeel summarize https://example.com \
--llm-key $OPENAI_API_KEY \
--prompt "Summarize in 3 bullet points"
webpeel brand <url>
Extract branding and design system.
npx webpeel brand https://example.com --json
webpeel jobs
List active async jobs.
npx webpeel jobs --json
webpeel job <id>
Get job status.
npx webpeel job crawl_abc123 --json
webpeel profile <action> [name] (v0.10.0)
Manage persistent browser profiles. Profiles save cookies, localStorage, and auth state so you can reuse logged-in sessions across requests.
# Create a new profile
npx webpeel profile create myprofile
# List all saved profiles
npx webpeel profile list
# Show details for a specific profile
npx webpeel profile show myprofile
# Delete a profile
npx webpeel profile delete myprofile
Use a profile when fetching:
# Fetch using a saved profile (cookies + auth preserved)
npx webpeel https://example.com --profile myprofile --stealth
# Search using a profile
npx webpeel search --site amazon "laptop stand" --profile myprofile --json
Subcommands
| Subcommand | Description |
|---|---|
| create <name> | Create and initialize a new browser profile |
| list | List all saved profiles |
| show <name> | Show profile metadata (created date, domain list, cookie count) |
| delete <name> | Delete a saved profile |
webpeel hotels <destination> (v0.10.0)
Search for hotels across multiple sources in parallel. Results are merged, deduplicated, and sorted. Expedia is supported thanks to Stealth v2.
# Basic hotel search
npx webpeel hotels "Paris"
# With check-in and check-out dates
npx webpeel hotels "New York" --checkin 2026-04-10 --checkout 2026-04-14
# Sort by price (cheapest first)
npx webpeel hotels "Tokyo" --checkin 2026-05-01 --checkout 2026-05-05 --sort price
# Sort by rating
npx webpeel hotels "London" --checkin 2026-06-01 --checkout 2026-06-03 --sort rating
# JSON output for AI agents
npx webpeel hotels "Barcelona" --checkin 2026-07-01 --checkout 2026-07-07 --json
# Limit results
npx webpeel hotels "Berlin" --checkin 2026-08-01 -n 10 --json
Options
| Option | Description |
|---|---|
| --checkin <date> | Check-in date in YYYY-MM-DD format |
| --checkout <date> | Check-out date in YYYY-MM-DD format |
| --sort <field> | Sort results: price, rating, or relevance (default) |
| -n, --count <n> | Max results to return (default: 10) |
| --json | Output as JSON array |
| --table | Output as formatted table |
| -s, --silent | Silent mode (no spinner) |
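--sort covers price, rating, and relevance server-side; for anything else, the --json output can be re-sorted locally with jq. The field names below (name, price, rating) are assumptions about the JSON shape, so the sketch uses a stand-in file:

```shell
# Stand-in for `webpeel hotels ... --json` output (field names are assumed).
cat > hotels.json <<'EOF'
[
  {"name": "Hotel A", "price": 180, "rating": 4.1},
  {"name": "Hotel B", "price": 95,  "rating": 4.6},
  {"name": "Hotel C", "price": 140, "rating": 3.9}
]
EOF
# Keep well-rated hotels only, cheapest first:
jq -r 'map(select(.rating >= 4.0)) | sort_by(.price)
       | .[] | "\(.name)\t$\(.price)\t\(.rating)"' hotels.json
```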
Environment Variables
| Variable | Description |
|---|---|
| WEBPEEL_API_KEY | API key for authentication |
| OPENAI_API_KEY | OpenAI API key for LLM features |
| ANTHROPIC_API_KEY | Anthropic API key for LLM features |
| GOOGLE_API_KEY | Google API key for Gemini / LLM features |
| WEBPEEL_BRAVE_API_KEY | Brave Search API key for webpeel search / webpeel answer |
| WEBPEEL_LLM_MODEL | LLM model (default: gpt-4o-mini) |
| WEBPEEL_LLM_BASE_URL | LLM API base URL |
| WEBPEEL_API_URL | WebPeel API URL (self-hosted) |
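A typical BYOK shell setup might look like this (all key values are placeholders; the commented base URL is only an example of pointing at an OpenAI-compatible endpoint):

```shell
# BYOK environment setup for a shell session.
export WEBPEEL_BRAVE_API_KEY="BSA-your-key"   # enables --provider brave
export OPENAI_API_KEY="sk-your-key"           # enables answer, summarize, --llm-extract
export WEBPEEL_LLM_MODEL="gpt-4o-mini"        # optional; this is the default
# Optionally route LLM calls elsewhere:
# export WEBPEEL_LLM_BASE_URL="http://localhost:8000/v1"
```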
Examples
Extract Product Data
npx webpeel https://example.com/product \
--extract '{"title": "h1", "price": ".price", "rating": ".stars"}' \
--json
Monitor Price Changes
#!/bin/bash
URL="https://example.com/product"
# First run: record the content fingerprint
FINGERPRINT=$(npx webpeel "$URL" --json | jq -r '.fingerprint')
echo "$FINGERPRINT" > fingerprint.txt
# Later runs: compare against the stored fingerprint
NEW_FP=$(npx webpeel "$URL" --json | jq -r '.fingerprint')
OLD_FP=$(cat fingerprint.txt)
if [ "$NEW_FP" != "$OLD_FP" ]; then
echo "Price changed!"
fi
Batch Download Documentation
# Discover all docs URLs
npx webpeel map https://docs.example.com \
--include "/docs/" \
--json > urls.json
# Extract URLs
cat urls.json | jq -r '.urls[]' > urls.txt
# Batch download
npx webpeel batch urls.txt -o docs/ --selector "article"
MCP Server
WebPeel exposes an MCP (Model Context Protocol) server so AI assistants like Claude, Cursor, and Windsurf can fetch, search, and crawl the web on your behalf.
Remote URL — No Install Needed
The fastest way to connect. Paste this into your AI client's MCP config — no npm install required:
{
"mcpServers": {
"webpeel": {
"url": "https://api.webpeel.dev/v2/mcp",
"headers": {
"Authorization": "Bearer YOUR-API-KEY"
}
}
}
}
Or use the Firecrawl-style key-in-URL format (works with clients that don't support custom headers):
{
"mcpServers": {
"webpeel": {
"url": "https://api.webpeel.dev/YOUR-API-KEY/v2/mcp"
}
}
}
One-Click Install for Cursor
Cursor supports one-click MCP install links. After installing, replace YOUR-API-KEY in Cursor's MCP settings with your actual key from app.webpeel.dev.
Claude Desktop Config
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"webpeel": {
"command": "npx",
"args": ["-y", "webpeel", "--mcp"],
"env": {
"WEBPEEL_API_KEY": "YOUR-API-KEY"
}
}
}
}
Windsurf Config
{
"mcpServers": {
"webpeel": {
"command": "npx",
"args": ["-y", "webpeel", "--mcp"]
}
}
}
Set your API key via npx webpeel login or the WEBPEEL_API_KEY environment variable.
Available MCP Tools
- webpeel_fetch — Fetch any URL → clean Markdown
- webpeel_search — Search the web, get titles/URLs/snippets
- webpeel_crawl — Crawl a site following links
- webpeel_map — Discover all URLs on a domain
- webpeel_extract — Extract structured data via CSS selectors or AI
- webpeel_batch — Fetch multiple URLs concurrently
- webpeel_agent — Autonomous web research agent (BYOK)