repo2ctx

Intelligently prepare your codebase as LLM context — not just a file dump.

repo2ctx analyzes your codebase and produces context optimized for LLM consumption. Unlike a naive directory dump, it uses dependency analysis, file importance scoring, and configurable focus areas to emit Markdown, XML, or JSON that fits within a token budget while maximizing information density.

Features

- Intelligent file selection: Scores files by import graph centrality (PageRank), recent git activity, and configurable focus areas
- Token-aware budgeting: Exact token counting via tiktoken (OpenAI) or character-based estimation (Claude), with a configurable budget
- Dependency-aware ordering: Topological sort ensures dependencies appear before dependents for better LLM comprehension
- Focus mode: Zero in on specific files or directories plus their import neighbors
- Smart truncation: When a file exceeds its budget, preserves signatures, class headers, and docstrings while removing method bodies
- Multi-language support: tree-sitter parsing for Python, JavaScript, TypeScript, Go, and Rust
- Multiple output formats: Markdown (default), XML (Claude-optimized), and JSON
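
The dependency-aware ordering described above can be illustrated with Python's standard-library graphlib. This is a minimal sketch, not repo2ctx's actual implementation; the file names and the `dependency_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical import graph: each file maps to the set of files it imports.
imports = {
    "app.py": {"auth.py", "db.py"},
    "auth.py": {"db.py", "utils.py"},
    "db.py": {"utils.py"},
    "utils.py": set(),
}


def dependency_order(graph):
    """Return files ordered so each file comes after the files it imports."""
    # TopologicalSorter takes node -> predecessors, which matches
    # "file -> files it depends on".
    return list(TopologicalSorter(graph).static_order())


order = dependency_order(imports)
print(order)
```

Circular imports make `static_order()` raise `graphlib.CycleError`, so a real tool would need a fallback for cycles; how repo2ctx handles them is not stated here.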

Installation

pip install repo2ctx

Or with uv:

uv tool install repo2ctx

Quick Start

# Analyze entire repo with default 128K token budget
repo2ctx .

# Set explicit token budget
repo2ctx . --max-tokens 100000

# Focus on a specific module
repo2ctx . --focus src/auth/

# Claude-optimized XML output
repo2ctx . --format xml

# Filter files
repo2ctx . --include '*.py' --exclude tests/

# Write to file
repo2ctx . --output context.md

# Use Claude token estimation (character-based)
repo2ctx . --model claude

CLI Reference

repo2ctx PATH [OPTIONS]

Arguments:
  PATH                    Root directory to analyze

Options:
  -t, --max-tokens INT    Token budget (default: 128000)
  -f, --focus PATH        Focus on specific file/directory (repeatable)
  --format FORMAT         Output format: markdown, xml, json (default: markdown)
  -i, --include GLOB      Include only matching files (repeatable)
  -e, --exclude GLOB      Exclude matching files (repeatable)
  -m, --model MODEL       Token model: openai or claude (default: openai)
  -o, --output FILE       Write output to file instead of stdout
  --help                  Show help message
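
To give a feel for what character-based estimation against a token budget might look like, here is a rough sketch. The ratio of 4 characters per token and the helper names `estimate_tokens` and `fits_budget` are assumptions for illustration, not values or functions taken from repo2ctx.

```python
import math

# Assumed average; the real ratio varies by model, language, and content.
CHARS_PER_TOKEN = 4


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Rough token estimate from character count, rounded up."""
    return math.ceil(len(text) / chars_per_token)


def fits_budget(texts, max_tokens):
    """Return (fits, total_estimated_tokens) for a collection of file contents."""
    total = sum(estimate_tokens(t) for t in texts)
    return total <= max_tokens, total


files = ["a" * 400, "b" * 40]
print(fits_budget(files, max_tokens=128_000))
```

An estimate like this trades accuracy for speed and needs no tokenizer dependency, so leaving some headroom in the budget is a reasonable precaution.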

How It Works

1. Discovery: Walks the directory tree, respecting .gitignore and skipping binary files and common non-code directories
2. Import Analysis: Uses tree-sitter to extract import graphs for supported languages
3. Scoring: Combines PageRank centrality (is this file imported by many files, or by important ones?), git recency (recently modified files count as more relevant), and focus proximity
4. Budget Allocation: Distributes the token budget in proportion to file scores
5. Smart Truncation: Truncates files that exceed their allocation, preserving signatures and docstrings and replacing method bodies with ...
6. Output: Formats everything as Markdown, XML, or JSON with a file tree, summary stats, and ordered file contents
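
The scoring and budget allocation steps can be sketched in plain Python. This example covers only the PageRank component (git recency and focus proximity are omitted), and the damping factor, iteration count, file names, and function names are assumptions rather than repo2ctx's actual code.

```python
def pagerank(imports, damping=0.85, iterations=50):
    """Power-iteration PageRank over a file -> imported-files mapping.

    Edges point from the importing file to the imported file, so files
    that are widely imported accumulate rank.
    """
    nodes = sorted(set(imports) | {t for ts in imports.values() for t in ts})
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = imports.get(node, ())
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A file with no imports spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new[other] += share
        rank = new
    return rank


def allocate_budget(scores, max_tokens):
    """Split max_tokens across files in proportion to their scores."""
    total = sum(scores.values())
    if total == 0:
        return {name: 0 for name in scores}
    return {name: int(max_tokens * s / total) for name, s in scores.items()}


# Hypothetical import graph.
imports = {
    "app.py": {"auth.py", "db.py"},
    "auth.py": {"utils.py"},
    "db.py": {"utils.py"},
    "utils.py": set(),
}
scores = pagerank(imports)
budget = allocate_budget(scores, max_tokens=100_000)
print(budget)
```

Rounding down with `int` keeps the total at or below the budget; a fuller version might redistribute the leftover tokens or the unused share of small files.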

Python API

from pathlib import Path
from repo2ctx.pipeline import run

# Get context as a string
context = run(
    root=Path("."),
    max_tokens=100_000,
    focus=["src/auth/"],
    fmt="markdown",
    model="openai",
)

Supported Languages

Language	Import Extraction	Smart Truncation
Python	`import`, `from...import`	Functions, classes
JavaScript	`import`, `require()`	Functions, classes
TypeScript	`import`, `require()`	Functions, classes
Go	`import`	Functions
Rust	`use`, `extern crate`	Functions, impls

Other file types are included but without import analysis or structural truncation.
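
For Python sources, the "Functions, classes" truncation in the table above can be approximated with the standard-library ast module (Python 3.9+). Note that repo2ctx itself is described as using tree-sitter; this sketch swaps in ast purely for illustration, and `truncate_bodies` is a hypothetical name.

```python
import ast


def truncate_bodies(source):
    """Keep signatures, class headers, and docstrings; replace bodies with `...`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kept = []
            if ast.get_docstring(node) is not None:
                kept.append(node.body[0])  # the docstring expression
            # Stand-in for the removed body.
            kept.append(ast.Expr(value=ast.Constant(value=...)))
            node.body = kept
    return ast.unparse(tree)


sample = '''
class Auth:
    """Handles login."""

    def check(self, password):
        """Check a password."""
        digest = hashlib.sha256(password.encode()).hexdigest()
        return digest == self.stored
'''

out = truncate_bodies(sample)
print(out)
```

Because `ast.unparse` regenerates the source from the syntax tree, comments and original formatting are lost; a tree-sitter approach that edits byte ranges can preserve them.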

Development

# Clone and install
git clone https://github.com/your-org/repo2ctx.git
cd repo2ctx
uv sync

# Run tests
uv run pytest

# Run linter
uv run ruff check .

# Format code
uv run ruff format .

Contributing

1. Fork the repository
2. Create a feature branch (git checkout -b feat/my-feature)
3. Write tests for your changes
4. Ensure all tests pass (uv run pytest)
5. Ensure linting passes (uv run ruff check .)
6. Commit with descriptive messages (feat: add my feature)
7. Open a pull request

License

MIT — see LICENSE for details.
