Symbol indexes, token-estimated context, semantic chunks — structured output that AI assistants can't produce themselves
AI assistants like Claude Code have native tools for reading files, searching, and listing directories. What they don't have is structured analysis output:
# Symbol index — navigate code without loading full content
batless --mode=index src/main.rs | jq '.symbols[] | "\(.line_start): \(.kind) \(.name)"'
# Token estimation — gate context decisions before loading a file
batless --mode=json --profile=claude file.py | jq '.estimated_llm_tokens'
# Compressed context — language-aware comment and blank stripping
batless --mode=json --profile=claude --strip-comments --strip-blank-lines file.py
# Semantic chunks — split large files at declaration boundaries
batless --mode=json --streaming --chunk-strategy=semantic large_file.rs
# Content hash — detect changes without loading content
batless --mode=json --hash file.rs | jq '.file_hash'

These are the outputs batless is built for. For plain file viewing, use cat, bat, or your editor.
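The token estimate lends itself to a simple gate before a file is loaded into context. The following is a minimal Python sketch, not part of batless: it assumes the JSON payload carries `estimated_llm_tokens` as in the command above, while the `should_load` helper, the 8,000-token budget, and the sample payload are hypothetical.

```python
import json


def should_load(payload_json: str, budget: int = 8000) -> bool:
    """Return True when the estimated token count fits within the budget.

    `payload_json` is the text printed by
    `batless --mode=json --profile=claude <file>`.
    A missing or null estimate is treated as unknown, so the file is not loaded.
    """
    payload = json.loads(payload_json)
    estimate = payload.get("estimated_llm_tokens")
    return estimate is not None and estimate <= budget


# Hypothetical payload, trimmed to the fields used here.
sample = json.dumps({"file": "file.py", "estimated_llm_tokens": 1200})

print(should_load(sample))              # True
print(should_load(sample, budget=500))  # False
```

Treating a missing estimate as "do not load" is a conservative design choice for this sketch; a caller could just as reasonably fall back to a byte-size check.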
Core guarantee: batless will NEVER wait for user input or block your pipeline.
# Linux (x86_64)
curl -L https://github.com/docdyhr/batless/releases/latest/download/batless-x86_64-unknown-linux-gnu.tar.gz | tar xz
# macOS (Intel)
curl -L https://github.com/docdyhr/batless/releases/latest/download/batless-x86_64-apple-darwin.tar.gz | tar xz
# macOS (Apple Silicon)
curl -L https://github.com/docdyhr/batless/releases/latest/download/batless-aarch64-apple-darwin.tar.gz | tar xz

# Via Cargo
cargo install batless

# Via Homebrew
brew tap docdyhr/batless
brew install batless

# Symbol index: structure without loading full content
batless --mode=index src/main.rs
# Multi-file symbol index — walk directory, one NDJSON line per file
batless --mode=index src/ | jq -c 'select(.symbol_count > 0) | {file, symbol_count}'
# Raw AST — full tree-sitter parse tree for deep structural analysis
batless --mode=ast src/lib.rs | jq '.root.type'
# Token estimation — check size before loading into AI context
batless --mode=json --profile=claude file.py | jq '.estimated_llm_tokens'
# Compressed AI context
batless --mode=json --profile=claude --strip-comments --strip-blank-lines src/lib.rs
# Semantic streaming chunks for large files
batless --mode=json --streaming --chunk-strategy=semantic large_file.rs
# Plain text (for piping to other tools)
batless --plain file.py
# Get version info as JSON
batless --version-json

| Feature | batless | bat | cat / built-in Read |
|---|---|---|---|
| Never Blocks | ✅ Guaranteed | ❌ Uses pager | ✅ |
| Symbol Index (--mode=index) | ✅ AST-backed | ❌ | ❌ |
| Raw AST (--mode=ast) | ✅ tree-sitter | ❌ | ❌ |
| Multi-file Index (directory) | ✅ NDJSON walk | ❌ | ❌ |
| LLM Token Estimation | ✅ Per-profile | ❌ | ❌ |
| Semantic Chunking | ✅ tree-sitter | ❌ | ❌ |
| Comment/Blank Stripping | ✅ Language-aware | ❌ | ❌ |
| Content Hash | ✅ SHA-256 | ❌ | ❌ |
| JSON Output | ✅ First-class | ❌ | ❌ |
| Syntax Highlighting | ❌ Use bat | ✅ Rich | ❌ |
| Interactive Human Use | ❌ Not the goal | ✅ | ✅ |

- 🚫 NEVER uses a pager - no less, no more, no blocking
- ⚡ NEVER waits for input - always streams output immediately
- 🔄 NEVER hangs in pipes - safe for |, >, and subprocess calls
- 📊 ALWAYS returns quickly - even on huge files (streaming architecture)
- 🔍 Language auto-detection with manual override (--language)
- 🌳 AST-backed analysis for Rust, Python, JavaScript, TypeScript (regex fallback for others)
- 🌐 Universal plain output - works with any text-based file format
- 📊 Multiple output modes: plain, JSON, summary, index, ast
- 📏 Smart limiting by lines (--max-lines) and/or bytes (--max-bytes)
- 💾 Memory efficient - true streaming, never loads full files
- 🎯 Predictable behavior - same output in terminal or pipe
- 🧠 Dual-view summaries - lines always retains the full file while summary_lines carries the condensed view
- 🔢 Token-aware JSON - token_count reflects the full file even when the sampled tokens array is capped (~2K entries), and tokens_truncated tells you when sampling kicked in
- 🤖 AI-optimized JSON output with metadata, tokens, and summaries
- 📋 Summary mode extracts functions, classes, imports only
- 🔤 Token extraction for LLM context processing
- 🚫 Clean defaults - no decorations unless requested
- 📦 Single ~2MB binary with minimal dependencies
batless has a focused design philosophy. It intentionally does NOT provide:
| Feature | Why Not? | Use Instead |
|---|---|---|
| Pattern Search | That's grep's job | grep -rn "pattern" path/ |
| Arbitrary Line Ranges | Beyond our scope | sed -n '10,50p' file |
| File Globbing | Shell handles this | batless *.py (shell expands) |
| Interactive Paging | We're non-blocking | Use bat or less |
| Git Integration | Keep it simple | Use git diff or bat |
| File Management | Not a file browser | ls, find, fd |
| Text Editing | Viewer only | Use your editor |

❌ "batless is a drop-in replacement for bat"
✅ Reality: batless is purpose-built for automation and AI, not interactive use

❌ "batless should add grep-like search"
✅ Reality: Unix philosophy - do one thing well. Use grep for searching

❌ "batless needs more features like bat"
✅ Reality: Less is more. Our constraints are features for automation

- 👤 Interactive code review: Use bat - it has better human-focused features
- 🔍 Searching code: Use grep, rg (ripgrep), or ag (silver searcher)
- 📝 Editing files: Use your favorite editor
- 📊 Complex analysis: Use language-specific tools (pylint, rust-analyzer, etc.)
- 🎨 Pretty printing: Use bat with its full decoration suite

Do ONE thing well: produce structured, machine-readable code analysis that
AI assistants can't generate themselves. For everything else — plain viewing,
searching, interactive use — there's already a better tool.
# Syntax highlighted output
batless main.rs
# Plain text (no colors)
batless --plain main.rs
# With line numbers
batless -n main.rs
# Limit output
batless --max-lines=50 large-file.py
batless --max-bytes=10000 huge-file.log

# JSON output for LLM processing
batless --mode=json --include-tokens --summary src/main.rs | jq
# Extract code structure only
batless --mode=summary src/*.rs
# CI/CD context generation
batless --mode=json --max-lines=100 failing-test.rs > context.json
# Machine-readable metadata
batless --version-json

JSON structure tips: lines always contains the full file content (even when --summary is enabled), while summary_lines carries the condensed view. The payload now exposes total_lines_exact, token_count, and tokens_truncated so downstream tools can distinguish between fully processed files and sampled metadata.

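A downstream tool might read those flags as follows. This is a hedged Python sketch rather than batless code: the field names come from the paragraph above, while the `describe_payload` helper and the sample payload are hypothetical, and every field is read defensively in case it is absent.

```python
def describe_payload(payload: dict) -> dict:
    """Summarise how complete a batless JSON payload is.

    Missing or null fields are treated as empty or False so the helper
    does not assume every field is always emitted.
    """
    lines = payload.get("lines") or []
    summary = payload.get("summary_lines") or []
    return {
        "full_lines": len(lines),
        "summary_lines": len(summary),
        "line_count_exact": bool(payload.get("total_lines_exact", False)),
        "tokens_sampled": bool(payload.get("tokens_truncated", False)),
    }


# Hypothetical payload, trimmed to the fields discussed above.
sample_payload = {
    "lines": ["fn main() {", "    run();", "}"],
    "summary_lines": [{"line": "fn main() {", "line_number": 1, "kind": "function"}],
    "total_lines_exact": True,
    "token_count": 5000,
    "tokens_truncated": True,
}

report = describe_payload(sample_payload)
print(report["full_lines"], report["summary_lines"])  # 3 1
print(report["tokens_sampled"])                       # True
```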
# Use as PAGER replacement
PAGER="batless --plain" gh pr view 42
# Process multiple files
find src -name "*.rs" -exec batless --mode=summary {} \;
# Combine with grep
grep -l "TODO" src/*.py | xargs batless -n
# Stream stdin
cat file.rs | batless --language=rust

# Use AI-optimized profile
batless --profile=claude main.rs
# Interactive configuration wizard
batless --configure
# List available profiles
batless --list-profiles

batless supports multiple color themes for syntax highlighting:
# List available themes
batless --list-themes
# Use specific theme
batless --theme="Solarized (dark)" file.py

batless currently includes 7 curated themes:
- InspiredGitHub - Clean, GitHub-inspired light theme
- Solarized (dark) - Popular dark theme with excellent contrast
- Solarized (light) - Light variant of the Solarized theme
- base16-eighties.dark - Retro 80s-inspired dark theme
- base16-mocha.dark - Warm, chocolate-toned dark theme
- base16-ocean.dark - Cool, oceanic dark theme
- base16-ocean.light - Light variant of the ocean theme
Try different themes to find the one that works best for your workflow:
# Try each theme with your code
batless --theme="InspiredGitHub" examples/theme-showcase.rs
batless --theme="Solarized (dark)" examples/theme-showcase.rs
batless --theme="base16-mocha.dark" examples/theme-showcase.rs

Note: Theme examples are available in docs/themes/ and can be regenerated with ./scripts/generate-theme-showcase.sh

# Auto-detect (default)
batless file.txt
# Force specific language
batless --language=python unknown.file
# List supported languages
batless --list-languages

Create custom profiles in ~/.batless/profiles/:
# ~/.batless/profiles/my-profile.toml
name = "my-profile"
max_lines = 1000
summary_level = "medium"
include_tokens = true

Use with:
batless --custom-profile ~/.batless/profiles/my-profile.toml file.rs

batless includes built-in shell completion support for bash, zsh, fish, and PowerShell.
# Bash: generate and install completions for the current user
batless --generate-completions bash > ~/.local/share/bash-completion/completions/batless
# Or install system-wide (tee runs under sudo; a plain > redirect would run as the unprivileged user)
batless --generate-completions bash | sudo tee /usr/share/bash-completion/completions/batless > /dev/null
# Then reload your shell or source the completion file
source ~/.local/share/bash-completion/completions/batless

# Zsh: generate and install completions
batless --generate-completions zsh > ~/.zsh/completions/_batless
# Add to your ~/.zshrc (if not already present)
fpath=(~/.zsh/completions $fpath)
autoload -Uz compinit && compinit
# Then reload your shell
exec zsh

# Fish: generate and install completions
batless --generate-completions fish > ~/.config/fish/completions/batless.fish
# Completions are automatically loaded in new fish shells

# PowerShell: load completions into the current session
batless --generate-completions powershell | Out-String | Invoke-Expression
# Or save to your profile for persistence
batless --generate-completions powershell >> $PROFILE

- --mode <MODE> - Output mode: plain, json, summary, index, ast
- --plain - Plain text output (equivalent to --mode=plain)
- --mode=json - Structured JSON output for automation
- --mode=summary - Extract only key code structures
- --mode=index - Machine-readable symbol table (kind, name, line ranges, visibility); pass a directory to walk it and emit one NDJSON line per file
- --mode=ast - Raw tree-sitter parse tree as JSON (Rust, Python, JavaScript, TypeScript, TSX; "root": null for other languages)

- --max-lines <N> - Limit output to N lines
- --max-bytes <N> - Limit output to N bytes
- --lines <START:END> - Select specific line range (e.g., 10:50, :100, 50:)

- -n, --number - Show line numbers (cat -n compatibility)
- -b, --number-nonblank - Number non-blank lines only (cat -b compatibility)
- --language <LANG> - Force specific language syntax

- --include-identifiers - Include extracted code identifiers in JSON output (--include-tokens still works as an alias)
- --with-line-numbers - JSON lines array uses {"n": N, "text": "..."} objects instead of plain strings
- --hash - Include SHA-256 content hash in JSON output (for change detection)
- --strip-comments - Strip comment-only lines from output
- --strip-blank-lines - Strip blank lines from output
- --chunk-strategy <STRATEGY> - Streaming chunk strategy: line (default) or semantic (splits at top-level declaration boundaries for Rust/Python/JS/TS)
- --summary - Add code summary to JSON output
- --profile <PROFILE> - Use AI-optimized profile (claude 20K lines, claude-max 150K lines, copilot, chatgpt, gemini, assistant)
- --custom-profile <PATH> - Load custom profile from file

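The `--hash` option pairs naturally with a small cache for change detection. The sketch below is an illustration in Python, not batless functionality: it assumes each JSON record carries `file` and `file_hash` as documented, and the `changed_files` helper, the cache layout, and the placeholder digests are hypothetical.

```python
def changed_files(previous: dict, records: list) -> list:
    """Return the paths whose hash differs from the cached one.

    `previous` maps file path -> hex digest from an earlier run.
    `records` are parsed outputs of `batless --mode=json --hash <file>`.
    A record without a hash is reported as changed, since it cannot be verified.
    """
    changed = []
    for record in records:
        digest = record.get("file_hash")
        if digest is None or previous.get(record["file"]) != digest:
            changed.append(record["file"])
    return changed


# Placeholder digests; real values are 64-character SHA-256 hex strings.
cache = {"src/main.rs": "aaa", "src/lib.rs": "bbb"}
current = [
    {"file": "src/main.rs", "file_hash": "aaa"},  # unchanged
    {"file": "src/lib.rs", "file_hash": "ccc"},   # content changed
    {"file": "src/new.rs", "file_hash": "ddd"},   # not seen before
]

print(changed_files(cache, current))  # ['src/lib.rs', 'src/new.rs']
```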
When using --mode=json, the output includes:
| Field | Type | Description |
|---|---|---|
| file | string | File path |
| language | string or null | Detected language |
| lines | array | File lines (strings, or {"n","text"} objects with --with-line-numbers) |
| total_lines | integer | Line count in original file |
| total_lines_exact | boolean | Whether total_lines covers the full file |
| total_bytes | integer | File size in bytes |
| truncated | boolean | Whether output was truncated |
| encoding | string | Detected encoding |
| summary_lines | array or null | Summary items {line, line_number, end_line, kind} |
| identifiers | array or null | Extracted code identifiers (with --include-identifiers) |
| identifier_total | integer or null | Total identifier count |
| file_hash | string or null | SHA-256 hex digest (with --hash) |
| estimated_llm_tokens | integer or null | Heuristic LLM token estimate (when profile active) |
| token_model | string or null | Model used for token estimation |
| compression_ratio | number or null | original/stripped lines ratio (with --strip-* flags) |

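Because `lines` can hold either plain strings or `{"n", "text"}` objects, a consumer may want to normalise both shapes. The following Python sketch is one way to do that under stated assumptions: the `numbered_lines` helper is hypothetical, and numbering plain strings from 1 is only correct when the output starts at the top of the file (for example, no `--lines` selection).

```python
def numbered_lines(payload: dict) -> list:
    """Return (line_number, text) pairs for either shape of `lines`.

    Objects produced with --with-line-numbers carry their own number in "n".
    Plain strings are numbered by position, starting at 1 (an assumption
    that holds only when the output begins at the first line of the file).
    """
    pairs = []
    for position, entry in enumerate(payload.get("lines") or [], start=1):
        if isinstance(entry, dict):
            pairs.append((entry["n"], entry["text"]))
        else:
            pairs.append((position, entry))
    return pairs


# Hypothetical payloads showing both shapes.
plain = {"lines": ["import os", "print(os.getcwd())"]}
numbered = {"lines": [{"n": 10, "text": "def f():"}, {"n": 11, "text": "    pass"}]}

print(numbered_lines(plain))     # [(1, 'import os'), (2, 'print(os.getcwd())')]
print(numbered_lines(numbered))  # [(10, 'def f():'), (11, '    pass')]
```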
When using --mode=index, the output includes:
| Field | Type | Description |
|---|---|---|
| file | string | File path |
| language | string or null | Detected language |
| symbol_count | integer | Number of symbols found |
| symbols | array | Symbol table entries |
| symbols[].kind | string | function, struct, class, impl, trait, etc. |
| symbols[].name | string | Symbol identifier name |
| symbols[].line_start | integer | 1-based start line |
| symbols[].line_end | integer or null | 1-based end line |
| symbols[].signature | string | First declaration line |
| symbols[].visibility | string or null | pub, private, export, local |

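When a directory is indexed, each output line is one JSON record with the fields listed above. As a rough Python sketch of consuming that stream (the `public_functions` helper and the sample records are hypothetical, and only documented fields are read):

```python
import json


def public_functions(ndjson_text: str) -> list:
    """Collect (file, name, line_start) for every public function.

    `ndjson_text` is the output of `batless --mode=index <dir>`:
    one JSON object per line, one line per file.
    """
    found = []
    for raw in ndjson_text.splitlines():
        if not raw.strip():
            continue  # tolerate blank lines between records
        record = json.loads(raw)
        for symbol in record.get("symbols") or []:
            if symbol.get("kind") == "function" and symbol.get("visibility") == "pub":
                found.append((record["file"], symbol["name"], symbol["line_start"]))
    return found


# Hypothetical two-file index, trimmed to the fields used here.
sample_index = "\n".join([
    json.dumps({"file": "src/lib.rs", "symbol_count": 2, "symbols": [
        {"kind": "function", "name": "parse", "line_start": 12, "visibility": "pub"},
        {"kind": "function", "name": "helper", "line_start": 40, "visibility": "private"},
    ]}),
    json.dumps({"file": "src/empty.rs", "symbol_count": 0, "symbols": []}),
])

print(public_functions(sample_index))  # [('src/lib.rs', 'parse', 12)]
```

In a pipeline, the same function could be fed from standard input, for example by passing `sys.stdin.read()` to it.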
When using --mode=ast, the output includes:
| Field | Type | Description |
|---|---|---|
| file | string | File path |
| language | string or null | Detected language |
| mode | string | "ast" |
| parser | string | "tree-sitter-rust" etc., or "none" for unsupported languages |
| total_lines | integer | Line count |
| total_bytes | integer | File size in bytes |
| root | object or null | Root parse tree node; null when parser is "none" |
| root.type | string | Node kind (e.g., "source_file", "module") |
| root.start | [row, col] | 0-based start position |
| root.end | [row, col] | 0-based end position |
| root.text | string or null | Node text for leaf nodes (≤256 chars) |
| root.children | array or null | Child nodes (same shape, max depth 64) |
| root.is_error | boolean or null | Present and true for error recovery nodes |

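Since every child node has the same shape as `root`, the tree can be traversed with a short iterative walk. The Python sketch below illustrates this under stated assumptions: the helper names and the sample tree are hypothetical, and `root` may be null for unsupported languages, which the walk treats as an empty tree.

```python
from collections import Counter


def walk(root):
    """Yield every node in the tree; yields nothing when root is None."""
    stack = [root] if root else []
    while stack:
        node = stack.pop()
        yield node
        # `children` may be absent or null on leaf nodes.
        stack.extend(node.get("children") or [])


def count_node_types(root) -> Counter:
    """Count how often each node kind appears."""
    return Counter(node["type"] for node in walk(root))


def error_positions(root) -> list:
    """Return the 0-based (row, col) start of each error-recovery node."""
    return [tuple(node["start"]) for node in walk(root) if node.get("is_error")]


# Hypothetical parse tree, shaped like the `root` field described above.
sample_root = {
    "type": "source_file", "start": [0, 0], "end": [3, 0],
    "children": [
        {"type": "function_item", "start": [0, 0], "end": [2, 1], "children": [
            {"type": "identifier", "start": [0, 3], "end": [0, 7], "text": "main"},
        ]},
        {"type": "ERROR", "start": [2, 2], "end": [2, 5], "is_error": True},
    ],
}

print(count_node_types(sample_root)["identifier"])  # 1
print(error_positions(sample_root))                 # [(2, 2)]
```

An explicit stack is used instead of recursion so the walk does not depend on Python's recursion limit, although the documented maximum depth of 64 would fit comfortably either way.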
- --list-languages - Show all supported languages
- --version - Show version information
- --version-json - Machine-readable version metadata
- --help - Show detailed help information

batless is designed to work seamlessly with AI coding assistants:
# Use batless in Claude Code workflows
batless --profile=claude --max-lines=500 src/main.rs

# Generate context for Copilot
batless --mode=json --summary src/ | gh copilot suggest

# Generate structured context
batless --mode=json --include-tokens --max-lines=1000 file.rs > context.json

See docs/AI_INTEGRATION.md for detailed integration guides.

batless is built with:
- Rust - Memory safety and performance
- syntect - Syntax highlighting engine
- Streaming architecture - Memory-efficient processing
- Modular design - Clean separation of concerns
See docs/ARCHITECTURE.md for technical details.
We welcome contributions! Please see:
- CONTRIBUTING.md - Contribution guidelines
- CODE_OF_CONDUCT.md - Community standards
- docs/PHILOSOPHY_AND_SCOPE.md - Project philosophy
# Clone repository
git clone https://github.com/docdyhr/batless.git
cd batless
# Build
cargo build
# Run tests
cargo test
# Run with example
cargo run -- src/main.rs

- Startup time: <5ms typical on modern hardware
- Binary size: ~2MB (minimal dependencies)
- Memory usage: Constant (streaming architecture)
- Throughput: Limited only by syntax highlighting speed
Note: Performance varies by hardware; these figures were measured on a typical developer workstation.
MIT License - see LICENSE for details.
- Documentation: docs/
- Changelog: CHANGELOG.md
- Releases: GitHub Releases
- Issues: GitHub Issues
- Crates.io: crates.io/crates/batless
Built with ❤️ for automation, AI assistants, and modern CLI workflows