Scavenger

Terminal-first AST dependency graph and session memory engine for AI coding agents. Reduces token usage by serving focused "capsules" instead of full files, and persists session memory anchored to code symbols across sessions.

Why Scavenger?

Got tired of watching people sell the same idea wrapped in marketing copy. Built this, made it free, open-sourced it. No pitch, no waitlist, no pricing page.

Fewer tokens, better context: Instead of dumping entire files into the context window, Scavenger serves focused capsules (signatures, docstrings, call graphs, and dependency neighborhoods), only what the agent needs.
Cross-session memory: Annotations (facts, strategies, pitfalls) are anchored to symbols and persist across sessions, so your agent picks up where it left off. This still has a way to go; I am still testing it out.
Simple setup: Install with a single command, run scavenger init, and you're done.
Real dependency graph: Built with tree-sitter across 15 languages. Answers "what calls X?", "what does X call?", and "what breaks if X changes?" in milliseconds, without grep.
Branch-aware index: Each git branch gets its own SQLite database, so context always matches your current branch.
Federated repos (work in progress): Query symbols from linked repositories as if they were local.
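
The three questions in the dependency-graph item map onto simple graph operations. The sketch below is illustrative only and is not Scavenger's implementation (which builds its graph with tree-sitter); the `CallGraph` class and the symbol names are hypothetical.

```python
from collections import defaultdict, deque


class CallGraph:
    """Toy call graph: an edge caller -> callee means `caller` calls `callee`."""

    def __init__(self):
        self.out_edges = defaultdict(set)  # symbol -> symbols it calls
        self.in_edges = defaultdict(set)   # symbol -> symbols that call it

    def add_call(self, caller, callee):
        self.out_edges[caller].add(callee)
        self.in_edges[callee].add(caller)

    def callees(self, symbol):
        """What does `symbol` call?"""
        return sorted(self.out_edges[symbol])

    def callers(self, symbol):
        """What calls `symbol`?"""
        return sorted(self.in_edges[symbol])

    def impacted_by(self, symbol):
        """What breaks if `symbol` changes? All direct and transitive callers."""
        seen, queue = set(), deque([symbol])
        while queue:
            current = queue.popleft()
            for caller in self.in_edges[current]:
                if caller not in seen:
                    seen.add(caller)
                    queue.append(caller)
        return sorted(seen)


graph = CallGraph()
graph.add_call("main", "load_config")
graph.add_call("main", "run_server")
graph.add_call("run_server", "handle_request")
graph.add_call("handle_request", "parse_header")

print(graph.callers("handle_request"))    # ['run_server']
print(graph.callees("main"))              # ['load_config', 'run_server']
print(graph.impacted_by("parse_header"))  # ['handle_request', 'main', 'run_server']
```

Keeping both forward and reverse edges is what makes the "what calls X?" and impact queries cheap: neither needs a scan over the whole codebase.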

Benchmarks

We ran an A/B comparison — same model (Claude 4.6 Opus), same 3-turn prompt sequence, same codebase (~4k lines Rust, 41 source files). One session had Scavenger enabled, one used only native tools (Grep, Read, Glob, Shell). Full methodology and turn-by-turn analysis: benchmark/report.md.

Session totals (3 turns)

Metric	Without Scavenger	With Scavenger	Delta
Total tool calls	37	18	-51%
File reads	13	7	-46%
Input tokens	706.8k	388.6k	-45%
Output tokens	12.4k	3.4k	-72%
Wall time	238s	86s	-64%
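
The Delta column is the relative change against the session without Scavenger. A quick sketch for recomputing it from the table values follows; the published figures are rounded, so a recomputed delta can differ from the reported one by a fraction of a percentage point.

```python
# Session totals copied from the table above:
# (without Scavenger, with Scavenger, reported delta in percent).
session_totals = {
    "Total tool calls": (37, 18, -51),
    "File reads": (13, 7, -46),
    "Input tokens (k)": (706.8, 388.6, -45),
    "Output tokens (k)": (12.4, 3.4, -72),
    "Wall time (s)": (238, 86, -64),
}


def percent_delta(without, with_scavenger):
    """Relative change from the baseline session, in percent."""
    return (with_scavenger - without) / without * 100


for metric, (without, with_scavenger, reported) in session_totals.items():
    computed = percent_delta(without, with_scavenger)
    print(f"{metric}: computed {computed:.1f}%, reported {reported}%")
```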

How the savings compound

The savings are not uniform across turns — they compound:

Turn	Task	Token delta	Tool call delta
T1	Explore a subsystem	+78% (investment)	+150%
T2	Analyze impact of a rename	+38% (narrowing)	0%
T3	Execute the rename	-68% (payoff)	-83%

Turn 1 invests in structural understanding via capsules (signatures, call graphs, dependency neighborhoods). By Turn 3 the agent already knows the dependency structure and needs 88% fewer file reads, 94% fewer output tokens, and finishes in 17s vs 185s.

When is Scavenger worth it?

Good fit:

Multi-turn sessions (explore → analyze → modify → verify) — this is where the graph investment pays off
Medium-to-large codebases where re-navigation is expensive (the bigger the project, the more tokens wasted re-reading files)
Refactoring and impact analysis tasks — "what breaks if I change X?" is answered by the graph in milliseconds
Repeated work across sessions — annotations persist, so the agent doesn't start from zero next time

Not worth it:

Single-shot questions ("what does this function do?") — capsules add overhead with no payback window
Tiny projects where the whole codebase fits in context anyway
Write-only tasks with no exploration phase (e.g. "add this exact function to this exact file")

The break-even point in our benchmark was ~3 turns. If your session is shorter than that, you may not recoup the initial capsule investment.

Note: This is an N=1 benchmark on a single codebase. Results will vary by project size, prompt complexity, and model. We ran it to validate the approach, not to claim universal numbers. Run benchmark/benchmark.py on your own project to see your numbers.

Supported Agents

Agent	Integration	Status
Claude Code	Hooks + MCP bridge	Tested
Cursor	Hooks + MCP bridge	Tested
Other MCP tools	MCP bridge only	Untested — see below

Scavenger has two integration layers:

Hooks — Automatically inject capsules on file reads, trigger re-indexing on writes, and manage daemon lifecycle. This is the full experience but requires the agent to support hooks. Currently only Claude Code and Cursor have hook support.
MCP bridge — Exposes tools (get_capsule, read_annotations, etc.) that the agent calls explicitly. Any MCP-compatible agent can use this, but the agent won't automatically receive capsules or trigger re-indexing — it has to call the tools itself.

Agents that only support MCP (Windsurf, Continue, Amp, etc.) get the MCP bridge but not hooks. The tools work, but the experience is less seamless. If you try Scavenger with another tool, please open a discussion and let us know how it went.

Platform Support

Platform	Status
Linux x86_64	Supported (pre-built binary)
macOS x86_64	Supported (pre-built binary)
macOS Apple Silicon	Supported (pre-built binary)
Windows	Not natively supported — Scavenger relies on Unix domain sockets for daemon communication. Use WSL2 instead.

Installation

Pre-built binary (no Rust required)

Download the binary for your platform from the latest release:

Platform	Asset
Linux x86_64	`scavenger-x86_64-linux.tar.gz`
macOS x86_64	`scavenger-x86_64-macos.tar.gz`
macOS Apple Silicon	`scavenger-aarch64-macos.tar.gz`

# Example: Linux
curl -L https://github.com/Dalot/scavenger/releases/latest/download/scavenger-x86_64-linux.tar.gz | tar xz
sudo mv scavenger /usr/local/bin/

From crates.io (requires Rust 1.85+)

cargo install thescavenger

From source

cargo install --path .

Verify:

scavenger --version

Quick Start

# 1. Go to your project directory
cd your-project

# 2. Initialize Scavenger (indexes files, registers hooks and MCP config)
scavenger init

That's it. scavenger init automatically:

Indexes all source files (15 languages) and markdown docs into a per-branch SQLite graph
Creates the Claude Code plugin at .scavenger/claude-plugin/
Registers the MCP bridge via claude mcp add (if claude CLI is on PATH)
Writes .mcp.json for any other MCP-compatible tool
Writes .cursor/mcp.json and .cursor/hooks.json for Cursor
Adds .scavenger/ to .gitignore

The daemon starts and stops automatically with each agent session — no manual management needed. You can also control it explicitly with scavenger daemon start, scavenger daemon stop, and scavenger daemon status.

Agent Setup

Claude Code

After scavenger init, launch Claude Code with the plugin flag:

claude --plugin-dir .scavenger/claude-plugin/

If the claude CLI wasn't found during init, register the MCP bridge manually:

claude mcp add scavenger -- scavenger mcp-bridge

Cursor

After scavenger init, reload the Cursor window so it picks up the new .cursor/mcp.json and .cursor/hooks.json files. Open the command palette (Ctrl+Shift+P / Cmd+Shift+P) and run Developer: Reload Window. You only need to do this once — after the initial reload, everything works automatically.

Other MCP-compatible tools

scavenger init writes a .mcp.json at the project root. Any tool that reads this file will discover the MCP bridge. If your tool doesn't read .mcp.json automatically, point it at scavenger mcp-bridge as the command for a stdio-based MCP server.
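
For a tool that needs manual configuration, a stdio server entry typically looks like the output of the sketch below. The `mcpServers` layout follows the convention used by Claude Code and Cursor config files; this README does not show the exact contents of the file that scavenger init writes, so treat the structure as an assumption and check your tool's documentation.

```python
import json

# Assumed layout: the `mcpServers` map used by Claude Code and Cursor configs.
# The file that `scavenger init` actually writes may contain other fields.
mcp_config = {
    "mcpServers": {
        "scavenger": {
            "command": "scavenger",
            "args": ["mcp-bridge"],
        }
    }
}

print(json.dumps(mcp_config, indent=2))
```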

How It Works

Hooks (Claude Code & Cursor)

Hooks give agents the full Scavenger experience automatically:

On file reads — Injects a capsule (signatures, docstrings, call graph neighborhood) so the agent sees focused context instead of raw file content.
On file writes — Incrementally re-indexes the changed file within a debounce window.
On session start/end — Starts and stops the daemon automatically.
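
To illustrate what a debounce window does for re-indexing, here is a minimal sketch. It is not Scavenger's code: the `Debouncer` class, the 500 ms window, and the file paths are all hypothetical, and timestamps are passed in explicitly (in milliseconds) to keep the example deterministic.

```python
class Debouncer:
    """Collects file-write events and releases a path only after it has been
    quiet for `window_ms`, so a burst of writes triggers a single re-index."""

    def __init__(self, window_ms):
        self.window_ms = window_ms
        self.last_write = {}  # path -> timestamp (ms) of the most recent write

    def record_write(self, path, now_ms):
        # A new write to the same path restarts that path's quiet period.
        self.last_write[path] = now_ms

    def due(self, now_ms):
        """Return (and forget) paths whose last write is at least `window_ms` old."""
        ready = sorted(
            path for path, stamp in self.last_write.items()
            if now_ms - stamp >= self.window_ms
        )
        for path in ready:
            del self.last_write[path]
        return ready


debouncer = Debouncer(window_ms=500)
debouncer.record_write("src/lib.rs", now_ms=0)
debouncer.record_write("src/main.rs", now_ms=100)
debouncer.record_write("src/lib.rs", now_ms=200)  # second write restarts the window

print(debouncer.due(now_ms=600))  # ['src/main.rs']
print(debouncer.due(now_ms=800))  # ['src/lib.rs']
```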

MCP bridge tools

Available in any MCP-connected agent as callable tools. Agents without hook support use these directly.

Tool	Description
`get_capsule`	Pass a file path for focused context, or a symbol name to get callers, callees, and breakage impact.
`read_annotations`	Retrieve persisted memory: facts, strategies, pitfalls anchored to symbols, files, or scopes.
`write_annotation`	Persist a note anchored to a symbol, file, or scope. Survives across sessions and branches.
`delete_annotation`	Remove an annotation by ID.
`search_docs`	Search over indexed markdown docs and code.
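
Agents invoke these tools through MCP's JSON-RPC `tools/call` method. The sketch below builds such a request by hand to show its shape. The tool name comes from the table above, but the `path` argument name is an assumption, since this README does not list the tools' input schemas; an MCP client can discover the real ones with `tools/list`.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(tool_name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "path" is a hypothetical argument name used for illustration only.
capsule_request = tool_call_request("get_capsule", {"path": "src/main.rs"})
print(json.dumps(capsule_request))
```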

Session memory

Annotations — Agent-written notes (facts, strategies, pitfalls), anchored to symbols.
Behavioral signals — Auto-captured metrics: which files were touched, token savings per session.
Version history — Annotation edits are versioned; annotations from other branches can be merged.
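
As a rough mental model of a symbol-anchored, versioned annotation, consider the sketch below. The `Annotation` class and its fields are hypothetical and do not reflect Scavenger's actual storage schema.

```python
from dataclasses import dataclass, field

KINDS = {"fact", "strategy", "pitfall"}


@dataclass
class Annotation:
    anchor: str  # symbol the note is attached to
    kind: str    # one of KINDS
    text: str
    history: list = field(default_factory=list)  # earlier versions of `text`

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown annotation kind: {self.kind}")

    def edit(self, new_text):
        """Replace the text, keeping the previous version in history."""
        self.history.append(self.text)
        self.text = new_text


note = Annotation("parse_header", "pitfall", "Fails on empty input.")
note.edit("Fails on empty input; callers must check length first.")
print(note.text)
print(len(note.history))  # 1
```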

CLI Reference

Command	Description
`scavenger init`	Initialize on a project (index + register hooks for all agents)
`scavenger daemon start`	Start the daemon in foreground (normally auto-managed by hooks)
`scavenger daemon stop`	Stop a running daemon
`scavenger daemon restart`	Stop and restart the daemon
`scavenger daemon status`	Show daemon status (running, PID, branch, graph size)
`scavenger index [path]`	Manually re-index files
`scavenger capsule <file> [symbol]`	Print a capsule to stdout
`scavenger graph stats`	Show node/edge counts and top centrality
`scavenger graph show <symbol>`	ASCII neighborhood tree
`scavenger annotate <symbol> "<text>"`	Add an annotation
`scavenger memory --query "<text>"`	Search annotations via FTS5
`scavenger merge-annotations <branch>`	Merge annotations from another branch
`scavenger stats [--session] [--branch]`	Token savings report
`scavenger observe [--interval N]`	Live observability dashboard (TUI)
`scavenger doctor [--format=json]`	Health diagnostics
`scavenger db summary`	Node/edge/file/annotation counts, DB sizes
`scavenger db nodes [--limit N]`	List indexed symbols
`scavenger db files [--limit N]`	List indexed files
`scavenger db annotations [--limit N]`	List annotations
`scavenger db tokens [--limit N]`	Show token_log entries
`scavenger db query "SQL" [--meta]`	Run read-only SQL against the DB
`scavenger federate add/remove/list/verify`	Manage federated repos (work in progress)
`scavenger clean [--purge]`	Remove plugin and legacy config (--purge removes all data)

Configuration

Create .scavenger.toml in your project root (optional — sensible defaults apply):

[capsule]
token_budget = 8000        # Max tokens per capsule (default: 8000)

[traversal]
max_hops = 3               # BFS hop limit (default: 3)
node_budget = 100          # Max nodes to traverse (default: 100)
degree_cap = 30            # Skip high-degree utility nodes (default: 30)

[docs]
patterns = ["**/*.md"]     # Markdown patterns to index
exclude = ["node_modules", "target", ".git"]
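
To make the three [traversal] limits concrete, here is a sketch of a breadth-first walk bounded by a hop limit, a node budget, and a degree cap. It shows one plausible reading of those settings, not Scavenger's actual traversal; the `neighborhood` function and the sample graph are hypothetical.

```python
from collections import deque


def neighborhood(adjacency, start, max_hops=3, node_budget=100, degree_cap=30):
    """Breadth-first walk from `start`, bounded by the three limits."""
    visited = [start]
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # hop limit reached: do not expand this node
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor in seen:
                continue
            if len(adjacency.get(neighbor, ())) > degree_cap:
                continue  # skip high-degree utility nodes (e.g. a logger)
            if len(visited) >= node_budget:
                return visited  # node budget exhausted
            seen.add(neighbor)
            visited.append(neighbor)
            queue.append((neighbor, hops + 1))
    return visited


# Hypothetical graph: a chain a-b-c-d-e plus a hub "log" connected to everything.
adjacency = {
    "a": {"b", "log"},
    "b": {"a", "c", "log"},
    "c": {"b", "d", "log"},
    "d": {"c", "e", "log"},
    "e": {"d", "log"},
    "log": {"a", "b", "c", "d", "e"},
}

print(neighborhood(adjacency, "a", max_hops=2, degree_cap=4))  # ['a', 'b', 'c']
```

With the degree cap in place, the hub node never enters the result, which keeps one widely used utility from pulling the whole graph into the neighborhood.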

Supported Languages

Rust, Python, TypeScript, TSX, JavaScript, JSX, Go, Java, C#, C, C++, Ruby, Bash, PHP, Swift

Note: Kotlin is not yet supported. The tree-sitter-kotlin crate requires tree-sitter <0.23, which is incompatible with our tree-sitter 0.25 dependency. Kotlin support will be added when a compatible version of the crate is released.

Architecture

Claude Code / Cursor / MCP tools
        ↓
Hooks (CLI) ←→ UDS Socket ←→ Daemon
                                ├── Graph (petgraph + tree-sitter)
                                ├── SQLite (per-branch index, WAL)
                                ├── Capsule Pipeline (6-stage)
                                ├── Memory (3-layer model)
                                ├── File Watcher (notify)
                                └── Federation (read-only)

Troubleshooting

Run scavenger doctor to check:

Daemon process alive
Socket accessible
DB integrity
Hook registration
Config validity

Set NO_COLOR=1 for plain output. Use --format=json for machine-readable diagnostics.

Contributing

Contributions are welcome. See CONTRIBUTING.md for development setup, coding standards, and the PR process. Check CHANGELOG.md for what has changed between releases.

License

Licensed under either of

Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.
