Monitor GitHub fork constellations with AI-powered analysis.
ForkHub watches the forks around your GitHub repositories, uses a Claude AI agent to classify what changed and why, and surfaces interesting divergences through digest notifications. Think of it as a satellite view of all the gardens growing from your code — whether the gardeners sent a letter or not.
- Python 3.12+
- uv package manager
- A GitHub personal access token (for API access)
- An Anthropic API key (for AI-powered analysis)
# Core install (tracking, syncing, digests, clustering)
uv tool install forkhub
# With Claude-powered features (AI analysis + agentic backfill test-fixer)
uv tool install 'forkhub[claude]'
# Or with pip
pip install forkhub # core
pip install 'forkhub[claude]'  # with Claude integration

The `[claude]` extra enables AI-powered features that use Anthropic's
`claude-agent-sdk`: fork change classification during sync, and the
agentic test-fixer for `backfill --auto-fix-tests`. ForkHub's core tracking,
syncing, and digest features work without it — and you can plug in your
own `TestFixer` implementation (OpenAI, local models, rule-based) via the
Python API or drive backfill from external agents via the CLI primitives.
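For example, a rule-based fixer might look like the sketch below. The `fix(...)` signature is a guess for illustration only; consult ForkHub's actual `TestFixer` protocol for the real interface.

```python
class RegexTestFixer:
    """Hypothetical rule-based TestFixer: rewrites a known-bad import
    in failing test source. The method signature is illustrative,
    not ForkHub's actual protocol."""

    def fix(self, test_path: str, test_source: str, failure_output: str) -> str:
        # Example rule: upstream renamed module `oldpkg` to `newpkg`
        if "ModuleNotFoundError" in failure_output:
            return test_source.replace("import oldpkg", "import newpkg")
        return test_source

# Usage: hand the fixer a failing test and its error output
fixer = RegexTestFixer()
fixed = fixer.fix("tests/test_x.py", "import oldpkg\n", "ModuleNotFoundError")
```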
For development:
git clone https://github.com/joshuaoliphant/forkhub.git
cd forkhub
uv sync

The easiest way is to create a `.env` file in your project directory:
cp env.example .env
# Edit .env with your tokens

ForkHub automatically loads `.env` files at startup. You can also export environment variables directly:
export GITHUB_TOKEN="ghp_..."
export ANTHROPIC_API_KEY="sk-ant-..."

Or use a TOML config file:
mkdir -p ~/.config/forkhub
cp forkhub.toml.example ~/.config/forkhub/forkhub.toml

Authentication options for Anthropic:

- `ANTHROPIC_API_KEY` — standard API key
- `CLAUDE_ACCESS_TOKEN` — OAuth token from `claude set-token` (used by Claude Code)

Either works. If both are set, the API key takes precedence.
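That precedence rule can be sketched as a small lookup; the function name and return shape here are illustrative only, not ForkHub's API.

```python
import os

def resolve_anthropic_credential(env=os.environ):
    """Pick a credential as described above: when both are set,
    the API key takes precedence over the OAuth token.
    (Illustrative sketch, not ForkHub's actual implementation.)"""
    api_key = env.get("ANTHROPIC_API_KEY")
    oauth = env.get("CLAUDE_ACCESS_TOKEN")
    if api_key:
        return ("api_key", api_key)
    if oauth:
        return ("oauth", oauth)
    return (None, None)

# Both set: the API key wins
print(resolve_anthropic_credential(
    {"ANTHROPIC_API_KEY": "sk-ant-xxx", "CLAUDE_ACCESS_TOKEN": "tok"}))
```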
# Discover and track your repos
uv run forkhub init --user your-github-username
# See what's tracked
uv run forkhub repos
# Sync fork data from GitHub (and classify changes with the AI analyzer)
uv run forkhub sync
# Skip analysis — discovery only, no Anthropic API cost
uv run forkhub sync --no-analyze
# View forks for a specific repo
uv run forkhub forks owner/repo
# Generate a digest of interesting changes
uv run forkhub digest

| Command | Description |
|---|---|
| `forkhub init --user <username>` | Discover and track your GitHub repos |
| `forkhub track <owner> <repo>` | Track a repository you don't own |
| `forkhub untrack <owner> <repo>` | Stop tracking a repository |
| `forkhub exclude <owner> <repo>` | Exclude a repo from tracking |
| `forkhub include <owner> <repo>` | Re-include an excluded repo |
| `forkhub repos` | List tracked repositories |
| `forkhub forks <owner> <repo>` | List forks of a tracked repo |
| `forkhub inspect <owner> <repo>` | Detailed view of a single fork |
| `forkhub clusters <owner> <repo>` | Show signal clusters (similar changes across forks) |
| `forkhub sync` | Sync fork data from GitHub and run AI analysis on changed forks |
| `forkhub sync --no-analyze` | Sync without running the AI analyzer (no Anthropic API cost) |
| `forkhub digest` | Generate and deliver a change digest |
| `forkhub backfill run` | Run the autonomous backfill loop |
| `forkhub backfill list` | List previous backfill attempts |
| `forkhub backfill candidates` | List signals eligible for backfill |
| `forkhub backfill apply <signal-id>` | Apply a patch to a candidate branch, run tests |
| `forkhub backfill status <attempt-id>` | Inspect a backfill attempt |
| `forkhub backfill record <attempt-id>` | Record the outcome of an attempt |
| `forkhub backfill cleanup <attempt-id>` | Delete candidate branch, return to original |
| `forkhub backfill read-failures` | Run tests, return failing test file contents |
| `forkhub backfill write-test <path>` | Safety-gated test file write (stdin or `--content`) |
| `forkhub backfill run-tests` | Run the configured test command |
| `forkhub config show` | Show current configuration |
Breaking change in v0.3.0:
`forkhub backfill` and `forkhub backfill-list` are now `forkhub backfill run` and `forkhub backfill list`. The backfill subcommands (candidates/apply/status/record/cleanup/read-failures/write-test/run-tests) let any external agent — Claude Code, Cursor, Aider, local models, shell scripts, humans — drive the backfill test-fix loop without requiring the `[claude]` extra.
ForkHub is a library first. The CLI is a thin consumer.
import asyncio

from forkhub import ForkHub

async def main():
    async with ForkHub() as hub:
        # Discover your repos
        repos = await hub.init("your-username")

        # Sync fork data
        result = await hub.sync()
        print(f"Synced {result.repos_synced} repos, {result.total_changed_forks} changed forks")

        # Get forks for a repo
        forks = await hub.get_forks("owner", "repo", active_only=True)

        # Generate and deliver a digest
        digest = await hub.generate_digest()
        await hub.deliver_digest(digest)

        # Backfill valuable fork changes into your local repo
        result = await hub.backfill("owner/repo", dry_run=True)
        print(f"Evaluated {result.total_evaluated}, accepted {result.accepted}")

asyncio.run(main())

All extension points use Python Protocol classes, so you can inject your own implementations:
from forkhub import ForkHub

hub = ForkHub(
    git_provider=my_custom_provider,           # implements GitProvider protocol
    notification_backends=[my_slack_backend],  # implements NotificationBackend protocol
    embedding_provider=my_embeddings,          # implements EmbeddingProvider protocol
)

| Mode | Description |
|---|---|
| owned | Your repos, auto-discovered via `init`. Forks are monitored. |
| watched | Repos you don't own but want to observe via `track`. |
| upstream | Repos you've forked. Tracks upstream changes you might want. |
When ForkHub syncs, a Claude AI agent analyzes what changed in each fork and produces signals — classified changes with a significance score (1-10):
| Category | Description |
|---|---|
| `feature` | New functionality added |
| `fix` | Bug fix not yet in upstream |
| `refactor` | Structural/architectural change |
| `config` | Configuration or deployment change |
| `dependency` | Dependency swap or version change |
| `removal` | Feature or code removed |
| `adaptation` | Platform or environment adaptation |
| `release` | A new tagged release |
When multiple forks independently make similar changes, ForkHub detects these as clusters using vector similarity of signal embeddings. Clusters reveal community-wide trends — if three forks all swap the same dependency, that's a signal worth knowing about.
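The clustering idea can be sketched with a greedy pass over signal embeddings. The similarity threshold and grouping strategy below are assumptions for illustration, not ForkHub's actual algorithm.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster_signals(embeddings, threshold=0.85):
    """Greedy single-pass clustering: each signal joins the first
    cluster whose representative vector is similar enough, otherwise
    it starts a new cluster. (Sketch; threshold is an assumption.)"""
    clusters = []  # list of (representative_vector, member_indices)
    for i, vec in enumerate(embeddings):
        for rep, members in clusters:
            if cosine(vec, rep) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]

# Three forks with near-identical "swap dependency X" changes cluster together
vecs = [[1, 0], [0.99, 0.05], [0, 1], [0.98, 0.1]]
print(cluster_signals(vecs))  # → [[0, 1, 3], [2]]
```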
forkhub sync          -> Discover forks (GitHub API)
                      -> Compare HEAD SHAs (skip unchanged)
                      -> AI agent classifies changes -> store signals
                      -> Update clusters via embedding similarity

forkhub digest        -> Query recent signals
                      -> AI agent composes readable summary
                      -> Deliver via notification backends

forkhub backfill run  -> Rank high-significance signals
                      -> Fetch diffs, apply patches to candidate branches
                      -> Run test suite to score results
                      -> Accept or reject based on test outcome
The `forkhub backfill` sub-app exposes composable primitives so any agent can
drive the test-fix loop manually. Example shell-driven flow:
# 1. Find a candidate
SIG=$(forkhub backfill candidates --json | jq -r '.candidates[0].signal_id')
# Optional: preview without touching git or running tests
forkhub backfill apply "$SIG" --dry-run --json
# 2. Apply — preserves candidate branch on test failure
forkhub backfill apply "$SIG" --json
# Exit codes: 0=passed, 1=tests failed, 2=conflict/patch failed,
# 3=fetch error, 4=signal not found
# 3. Inspect failing tests (runs the test command fresh each call)
forkhub backfill read-failures --json
# Exit codes: 0=tests passed, 1=tests failed, 124=timeout/spawn failure
# 4. Agent (any tool) produces fixed test content and pipes it in
cat fixed_test.py | forkhub backfill write-test tests/test_foo.py
# 5. Re-run tests (returncode propagated; -1 timeout maps to 124)
forkhub backfill run-tests --json
# 6. Record outcome (score is validated against 0.0-1.0)
ATTEMPT=$(forkhub backfill list --json | jq -r '.[0].id')
forkhub backfill record "$ATTEMPT" --status=accepted --score=0.9
# 7. Or discard the attempt — exit 2 if any git op was swallowed
forkhub backfill cleanup "$ATTEMPT"

Every primitive outputs JSON when `--json` is passed. The Python service
layer enforces safety invariants (test-files-only, path traversal defense)
regardless of which agent is driving.
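Those invariants might look roughly like the sketch below. The function name, allowed directories, and filename rule are assumptions for illustration, not ForkHub's actual code.

```python
from pathlib import Path

def validate_test_write(path_str: str, repo_root: str, test_dirs=("tests",)) -> Path:
    """Sketch of the safety gates described above (assumed behavior):
    resolve the target inside the repo and require it to be a test
    file under an allowed directory."""
    root = Path(repo_root).resolve()
    target = (root / path_str).resolve()
    # Path traversal defense: the resolved path must stay inside the repo
    if not target.is_relative_to(root):
        raise ValueError(f"path escapes repository: {path_str}")
    rel = target.relative_to(root)
    # Test-files-only: must live under tests/ and look like a test module
    if rel.parts[0] not in test_dirs or not rel.name.startswith("test_"):
        raise ValueError(f"not a test file: {path_str}")
    return target

# Allowed: validate_test_write("tests/test_foo.py", "/repo")
# Rejected: validate_test_write("../etc/passwd", "/repo")  -> ValueError
```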
ForkHub looks for `forkhub.toml` in `~/.config/forkhub/` or the current directory. Environment variables override TOML values.
| Setting | Env Var | Default |
|---|---|---|
| GitHub token | `GITHUB_TOKEN` | — |
| Anthropic API key | `ANTHROPIC_API_KEY` | — |
| OAuth token | `CLAUDE_ACCESS_TOKEN` | — |
| Analysis budget | — | $0.50 per sync |
| Analysis model | — | sonnet |
| Digest model | — | haiku |
| Sync interval | — | 6h |
| Digest frequency | — | weekly |
| Min significance | — | 5 |
| DB path | — | `~/.local/share/forkhub/forkhub.db` |
ForkHub loads `.env` files automatically. See `env.example` for all supported variables.
See `forkhub.toml.example` for all options.
ForkHub is a library first — the CLI is a thin consumer. The core library (`src/forkhub/`) exposes the `ForkHub` class as its public API.
Extension points (Protocol-based, swappable at runtime):
- `GitProvider` — fetches repo/fork data (default: GitHub via githubkit)
- `NotificationBackend` — delivers digests (default: Rich console output)
- `EmbeddingProvider` — text embeddings for clustering (default: local sentence-transformers)
AI analysis uses the Claude Agent SDK with a coordinator + subagent pattern:
- Coordinator agent gets tools to explore forks (list, summarize, diff)
- diff-analyst subagent deep-dives individual forks
- digest-writer subagent composes human-readable summaries
Storage: SQLite + sqlite-vec for vector similarity search.
# Install with dev dependencies
uv sync
# Run tests (155 tests)
uv run pytest
# Lint and format
uv run ruff check src/ tests/
uv run ruff format src/ tests/
# Type check
uv run ty check

MIT