GitHub - stevengonsalvez/agents-in-a-box: context engineering for agentic coding

   ╔═══════════════════════════════════════════════════════════════╗
   ║                                                               ║
   ║     █████╗  ██████╗ ███████╗███╗   ██╗████████╗███████╗       ║
   ║    ██╔══██╗██╔════╝ ██╔════╝████╗  ██║╚══██╔══╝██╔════╝       ║
   ║    ███████║██║  ███╗█████╗  ██╔██╗ ██║   ██║   ███████╗       ║
   ║    ██╔══██║██║   ██║██╔══╝  ██║╚██╗██║   ██║   ╚════██║       ║
   ║    ██║  ██║╚██████╔╝███████╗██║ ╚████║   ██║   ███████║       ║
   ║    ╚═╝  ╚═╝ ╚═════╝ ╚══════╝╚═╝  ╚═══╝   ╚═╝   ╚══════╝       ║
   ║              ██╗███╗   ██╗    █████╗                              ║
   ║              ██║████╗  ██║   ██╔══██╗                             ║
   ║              ██║██╔██╗ ██║   ███████║                             ║
   ║              ██║██║╚██╗██║   ██╔══██║                             ║
   ║              ██║██║ ╚████║   ██║  ██║                             ║
   ║              ╚═╝╚═╝  ╚═══╝   ╚═╝  ╚═╝                             ║
   ║            ██████╗  ██████╗ ██╗  ██╗                              ║
   ║            ██╔══██╗██╔═══██╗╚██╗██╔╝                              ║
   ║            ██████╔╝██║   ██║ ╚███╔╝                               ║
   ║            ██╔══██╗██║   ██║ ██╔██╗                               ║
   ║            ██████╔╝╚██████╔╝██╔╝ ██╗                              ║
   ║            ╚═════╝  ╚═════╝ ╚═╝  ╚═╝                              ║
   ║                                                               ║
   ╚═══════════════════════════════════════════════════════════════╝

A complete ecosystem for AI-assisted development

115 Rust Modules · 71 Skills · 37 Agents · 9 AI Tools · Knowledge Graph

A terminal-native ecosystem for managing AI coding agents. Built around a Rust TUI that orchestrates Claude Code, Codex, Gemini, and Copilot sessions with git worktree isolation, and a portable toolkit of skills, agents, and workflows that plug into 9 different AI coding tools.

Live dashboard: multi-workspace sidebar, session preview pane, and tmux-backed persistent sessions

Built-in usage analytics: 11.9B tokens tracked across 45 projects and 487 sessions, by provider and by day

What's Inside

Component	What it does	Scale
ainb TUI	Rust terminal app for managing AI coding agent sessions (Claude Code, Codex, Gemini, Copilot)	115 modules
Toolkit	Portable skills, agents, and workflows for AI coding tools	71 skills, 37 agents
Knowledge System	GraphRAG + QMD learning capture and retrieval	Architecture docs

Why agents-in-a-box?

Most AI coding setups are a loose collection of dotfiles. This project treats the problem as an engineering system:

One toolkit, many tools — Write a skill once, deploy it to Claude Code, Codex, Gemini, Cursor, Copilot, Amazon Q, Cline, Roo, or Clawdhub
Session isolation — Each coding session gets its own git worktree and tmux session. No cross-contamination
Agents that compose — 37 specialized agents (backend-developer, security-agent, architecture-reviewer, etc.) that can be orchestrated into swarms
Memory that persists — A two-tier knowledge system (GraphRAG + QMD) that captures learnings and retrieves them across sessions and projects
Production Rust — The TUI isn't a shell script. It's 115 modules of typed, tested, async Rust with clippy pedantic/nursery lints

Quick Start

# Install the TUI
brew tap stevengonsalvez/ainb && brew install ainb

# Install the toolkit for your AI tool
cd toolkit && npm install && node create-rule.js --tool=claude-code-4.5

# Launch
ainb

ainb — Terminal UI + CLI

A Rust-based terminal application for managing AI coding sessions with git worktree isolation, model selection, and persistent tmux sessions. Every operation is available as both an interactive TUI view and a scriptable CLI subcommand with JSON output — so humans drive it from a dashboard and agents drive it from shell scripts.

Feature Highlights

Multi-provider — Run Claude Code, Codex CLI, Gemini CLI, or GitHub Copilot in the same workflow, with Sonnet / Opus / Haiku selection per session
Git worktree isolation — Each session runs in its own branch and working directory. No cross-contamination, no stash dance
tmux persistence — Sessions survive terminal disconnects, SSH drops, and laptop sleep. Reattach any time
Usage analytics — Built-in token + session tracking by day, week, provider, and project. Know where your budget went
Easy onboarding — First-run setup wizard checks dependencies, configures auth, and gets you creating sessions in minutes
Live log streaming — Real-time viewer with level filtering and search across all running sessions
Scriptable CLI — 15 commands with --format json output for every piece of state. 📘 Full CLI reference →
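
The worktree and tmux isolation described above can be approximated by hand with plain git and tmux. The following is a minimal sketch, not ainb's actual implementation: the `session/<name>` branch prefix, the `<repo>-worktrees/` directory, and the `ainb-<name>` tmux session name are all naming assumptions made for this example.

```shell
#!/bin/sh
# Sketch of per-session isolation: one git worktree plus one detached tmux
# session per task. All names below are hypothetical, not ainb's real layout.

# Print the branch name used for a session.
session_branch() {
    printf 'session/%s\n' "$1"
}

# Print the worktree directory for a session, placed beside the repository.
# $1 = absolute repository path, $2 = session name
session_worktree() {
    printf '%s-worktrees/%s\n' "${1%/}" "$2"
}

# Create the worktree on a new branch, then start a detached tmux session
# whose working directory is that worktree.
start_session() {
    repo=$1
    name=$2
    tree=$(session_worktree "$repo" "$name")
    git -C "$repo" worktree add -b "$(session_branch "$name")" "$tree" &&
        tmux new-session -d -s "ainb-$name" -c "$tree"
}

# Example (requires git and tmux):
#   start_session "$HOME/code/my-app" fix-login
#   tmux attach -t ainb-fix-login
```

Because each session has its own working directory and branch, two agents can edit the same repository at once without touching each other's files, and the detached tmux session keeps running after the terminal closes.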

Feature Showcase

📊 Unified dashboard: Sidebar navigation across Agents, Catalog, Sessions, Recovery, Logs, Stats, Changelog, and Setup. Keyboard-driven throughout.
🤖 Pick your agent, pick your model: Choose between Claude Code, Shell Only, SSH, Codex CLI, Gemini CLI, GitHub Copilot, or Kiro. The model toggle (Sonnet · Opus · Haiku) sits right below.
🚀 Start a session any way you want: Local repo, clone from GitHub/GitLab, SSH into a remote box, or pull from your Favorites. One-key shortcuts: L / R / S / F.
🛠️ Guided first-time setup: Re-run the wizard, verify dependencies, configure git paths, set auth, pick your editor, or factory-reset in one click.
📈 Usage analytics, built in: Daily / weekly / by-project views across all providers. Understand your token burn at a glance.
🎯 Per-project attribution: See exactly which repos and worktrees consume your context budget. Input, cache, output, and session counts per project.

CLI — Scriptable Equivalent of Every TUI Feature

For agents, automation, and scripts, ainb ships a full CLI. Every command supports --format json for piping to jq.

ainb --help                             # Top-level overview
ainb run --repo . --worktree --tool claude --model sonnet
ainb list --format json | jq .
ainb logs my-session --follow
ainb recover list                       # Find orphaned sessions
ainb config set authentication.default_model opus
ainb completion zsh > ~/.zsh/completions/_ainb

15 top-level commands — run, list, logs, attach, status, kill, auth, recover, config, git, favorites, init, presets, completion, tui — with nested subcommands for recover / config / git / favorites / presets.

📘 Full CLI reference → ainb-tui/docs/CLI.md
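
As an illustration of driving ainb from a script, the sketch below starts one isolated session per repository using only the `ainb run` flags shown above. The `launch_sessions` function and the `AINB_CMD` override are hypothetical helpers introduced for this example; `AINB_CMD=echo` turns the loop into a dry run that prints the commands instead of executing them.

```shell
#!/bin/sh
# Start one isolated session per repository with the documented `ainb run` flags.
# AINB_CMD is a hypothetical override for this sketch (e.g. AINB_CMD=echo).

# $1 = tool, $2 = model, remaining arguments = repository paths
launch_sessions() {
    tool=$1
    model=$2
    shift 2
    for repo in "$@"; do
        # Stop at the first repository whose session fails to start.
        ${AINB_CMD:-ainb} run --repo "$repo" --worktree --tool "$tool" --model "$model" || return 1
    done
}

# Example:
#   launch_sessions claude sonnet ~/code/api ~/code/web
#   ainb list --format json | jq .
```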

Installation

Homebrew (macOS / Linux)

brew tap stevengonsalvez/ainb
brew install ainb

One-liner install

curl -fsSL https://raw.githubusercontent.com/stevengonsalvez/agents-in-a-box/v2/ainb-tui/install.sh | bash

Cargo (any platform)

cargo install --git https://github.com/stevengonsalvez/agents-in-a-box --branch v2 agents-box
# Optionally alias: alias ainb="agents-box"

Windows (WSL)

# 1. Install WSL2
wsl --install

# 2. Inside Ubuntu/Debian
curl -fsSL https://raw.githubusercontent.com/stevengonsalvez/agents-in-a-box/v2/ainb-tui/install.sh | bash
sudo apt update && sudo apt install -y tmux
ainb

ainb relies on tmux for persistent sessions, and tmux runs only on Unix-like systems. On Windows, run ainb inside WSL2.

Keyboard Shortcuts

Key	Action
`j/k` or `↑/↓`	Navigate sessions
`Enter`	Attach to session
`n`	New session
`d`	Delete session
`r`	Restart Claude in session
`l`	View logs
`q`	Quit

Platform Support

Platform	Status	Method
macOS Apple Silicon	✅	Pre-built binary
macOS Intel	✅	Build from source
Linux x86_64	✅	Pre-built binary
Linux ARM64	✅	Build from source
Windows (WSL2)	✅	Install script
Windows (Native)	❌	Use WSL

Requirements

tmux — persistent session management
git — worktree operations
Claude Code CLI — the claude command
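
Before the first launch, it can help to confirm that these three commands are on the PATH. The following is a small preflight sketch; the function names `missing_deps` and `check_ainb_deps` are made up for this example and are not part of ainb, whose own setup wizard performs its dependency checks separately.

```shell
#!/bin/sh
# Preflight check for the commands listed under Requirements.

# Print each missing command on its own line; print nothing when all exist.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Check the three requirements and report the result.
check_ainb_deps() {
    missing=$(missing_deps tmux git claude)
    if [ -n "$missing" ]; then
        # Unquoted on purpose: one "missing:" line per command.
        printf 'missing: %s\n' $missing >&2
        return 1
    fi
    echo "all dependencies found"
}

# Example:
#   check_ainb_deps && ainb
```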

Toolkit

A portable AI coding agent toolkit: skills, agents, workflows, and configurations that deploy to 9 different AI coding tools from a single source.

Full toolkit documentation →

Supported AI Tools

Tool	Deploy target	Method
Claude Code	`~/.claude/`	Home directory
Codex	`~/.codex/`	Home directory
GitHub Copilot	`~/.copilot/`	Home directory
Gemini CLI	`.gemini/`	Project directory
Amazon Q	`.amazonq/rules/`	Project directory
Cursor	Project root	Project directory
Cline	Project root	Project directory
Roo	Project root	Project directory
Clawdhub	Project root	Project directory
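
The split between home-directory and project-directory targets can be expressed as a small lookup. This sketch covers only the three `--tool` values that appear in this README (`claude-code-4.5`, `codex`, `gemini`); the ids for the remaining tools are not shown here, so the function deliberately rejects anything else rather than guessing. `deploy_target` is a hypothetical helper, not part of the toolkit.

```shell
#!/bin/sh
# Resolve where the toolkit lands for a given --tool value, following the
# table above. Unknown ids produce an error instead of a guessed path.

# $1 = tool id, $2 = project directory (defaults to the current directory)
deploy_target() {
    project=${2:-.}
    case $1 in
        claude-code-4.5) printf '%s/.claude/\n' "$HOME" ;;
        codex)           printf '%s/.codex/\n' "$HOME" ;;
        gemini)          printf '%s/.gemini/\n' "${project%/}" ;;
        *)               echo "unknown tool id: $1" >&2; return 1 ;;
    esac
}

# Example:
#   deploy_target gemini ~/code/my-app
```

Home-directory targets apply to every project on the machine, while project-directory targets travel with the repository, so the same skill may need to be deployed once per project for tools in the second group.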

Skills (71)

Skills are reusable capabilities that any supported AI tool can invoke.

Workflow & Planning

plan · plan-tdd · plan-gh · implement · validate · workflow · brainstorm · critique · discuss · expose · interview

Code Quality & Testing

commit · find-missing-tests · webapp-testing · security-audit · security-scan · simplify

DevOps & Infrastructure

start-local · start-ios · start-android · spawn-agent · tmux-monitor · tmux-status · expose · debug-bridge

Knowledge & Learning

reflect · global-learnings · research · research-cache · instincts · compound-docs · prime

Session Management

health-check · session-info · session-metrics · session-summary · handover · recover-sessions · plugins

Swarm Orchestration

swarm-create · swarm-join · swarm-inbox · swarm-status · swarm-shutdown · swarm-orchestration · swarm-agent-troubleshooting

GitHub & Issues

gh-issue · make-github-issues · do-issues · merge-agent-work · list-agent-worktrees · attach-agent-worktree · cleanup-agent-worktree

Design & Frontend

ui-ux-pro-max · frontend-design · frontend-slides · tui-style-guide · tui-screen · liquid-glass · remotion-best-practices

Research & Analysis

crypto-research · oracle · notebooklm · sentry-cli · ats-resume-matcher · resume-formatter · retro-pdf

Agent Architecture

skill-creator · agent-ops · autonomous-loops · cost-aware-pipeline · media-processing · nano-banana-pro · sync-learnings · claude-developer-platform

Agents (37)

Specialized AI agents organized by domain. Each agent has a defined persona, tool access, and area of expertise.

Category	Agents
Universal	`backend-developer` · `frontend-developer` · `superstar-engineer`
Orchestrators	`tech-lead-orchestrator` · `project-analyst` · `team-configurator`
Engineering	`api-architect` · `architecture-reviewer` · `code-archaeologist` · `code-reviewer` · `dev-cleanup-wizard` · `devops-automator` · `documentation-specialist` · `gatekeeper` · `integration-tests` · `lead-orchestrator` · `migration` · `performance-optimizer` · `planner` · `playwright-test-validator` · `property-mutation` · `release-manager` · `security-agent` · `service-codegen` · `solution-architect` · `tailwind-css-expert` · `test-writer-fixer`
Design	`ui-designer`
Swarm	`swarm-leader` · `swarm-worker`
Meta	`agentmaker` · `reflect`
Root	`distinguished-engineer` · `web-search-researcher`

Knowledge System

A two-tier learning system that captures insights during development and retrieves them across sessions and projects.

Layer	Technology	Purpose
Fast local	QMD (Quick Markdown Documents)	Semantic search over structured learning notes
Deep graph	GraphRAG (nano-graphrag)	Entity-relationship graph with community detection for cross-project knowledge retrieval

The /reflect skill captures learnings. The /research and /prime skills retrieve them. The /global-learnings skill manages the knowledge base directly.

How the knowledge system works →

Architecture

agents-in-a-box/
│
├── ainb-tui/                   # Rust TUI application
│   ├── src/                    # 115 modules
│   │   ├── app/                #   State machine & event handling
│   │   ├── components/         #   TUI screen components
│   │   ├── widgets/            #   Reusable UI widgets
│   │   ├── docker/             #   Container management
│   │   ├── tmux/               #   Session & PTY integration
│   │   ├── git/                #   Worktree operations
│   │   ├── claude/             #   Claude API client
│   │   ├── models/             #   Data models
│   │   └── config/             #   Configuration handling
│   ├── deny.toml               #   License & security policy
│   ├── Formula/                #   Homebrew formula
│   └── install.sh              #   One-liner installer
│
├── toolkit/                    # Portable AI agent toolkit
│   ├── packages/
│   │   ├── skills/             #   71 reusable skills
│   │   ├── agents/             #   37 agent definitions
│   │   │   ├── universal/      #     Cross-stack specialists
│   │   │   ├── engineering/    #     Backend & infra agents
│   │   │   ├── orchestrators/  #     Team coordination
│   │   │   ├── design/         #     UI/UX specialists
│   │   │   ├── swarm/          #     Multi-agent coordination
│   │   │   └── meta/           #     Agent creation & reflection
│   │   ├── workflows/          #   Structured delivery workflows
│   │   └── utilities/          #   Shared utilities
│   ├── bootstrap.js            #   Multi-tool deployment engine
│   └── create-rule.js          #   CLI installer
│
├── docs/                       # Documentation
│   └── how-reflection-works.md #   Knowledge system architecture
│
└── .github/workflows/
    ├── ci.yml                  #   Rust CI (fmt, clippy, test, deny, machete)
    ├── toolkit-validation.yml  #   Toolkit structure & install validation
    └── release.yml             #   Cross-platform binary releases

CI/CD & Quality

Check	Tool	What it catches
Format	`rustfmt`	Style inconsistencies
Lint	`clippy` (pedantic + nursery)	Logic errors, anti-patterns, code smells
Test	`cargo-nextest` (Ubuntu + macOS)	Regressions across platforms
Security	`cargo-deny` (RustSec)	Known vulnerabilities in dependencies
Licenses	`cargo-deny`	Non-compliant dependency licenses
Dead deps	`cargo-machete`	Unused crate declarations
Toolkit structure	Custom validation	Package counts, template substitution, install verification

The Rust codebase enforces unsafe_code = "forbid" and runs clippy with pedantic, nursery, and cargo lint groups enabled.
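
To reproduce these checks before pushing, the commands can be chained so the run stops at the first failure. This is a sketch rather than the project's CI definition: the exact flags used in `ci.yml` may differ, `run_checks` and `CHECK_RUNNER` are hypothetical names, and `cargo nextest`, `cargo deny`, and `cargo machete` each require their cargo subcommand to be installed.

```shell
#!/bin/sh
# Run the checks from the table above locally, stopping at the first failure.
# Set CHECK_RUNNER=echo to print the commands instead of executing them.

# Announce a step on stderr, then run it (optionally through CHECK_RUNNER).
step() {
    echo "==> $*" >&2
    ${CHECK_RUNNER:-} "$@"
}

run_checks() {
    step cargo fmt --check &&
        step cargo clippy --all-targets &&
        step cargo nextest run &&
        step cargo deny check &&
        step cargo machete
}

# Example (from the ainb-tui directory):
#   run_checks && echo "ready to push"
```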

Development

Building from source

cd ainb-tui
cargo build --release
./target/release/agents-box

Running tests

cd ainb-tui
cargo test                              # Unit tests
cargo test --features visual-debug      # With terminal output
cargo test --features vt100-tests       # VT100 screen verification
cargo nextest run                       # With nextest (parallel)

Linting & checks

cd ainb-tui
cargo fmt --check                       # Format check
cargo clippy --all-targets              # Lint
cargo deny check                        # Security + licenses

Installing the toolkit

cd toolkit
npm install
node create-rule.js --tool=claude-code-4.5    # Deploy to ~/.claude/
node create-rule.js --tool=gemini             # Deploy to .gemini/
node create-rule.js --tool=codex              # Deploy to ~/.codex/

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'feat: add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Links

License

MIT — see LICENSE for details.
