Autonomous software engineering pipeline for Claude Code with 135+ specialized subagents.
Claude Code becomes a full SWE pipeline—from task analysis through implementation, review, and PR creation—driven primarily by markdown (policies, phase prompts, agent definitions, templates). The pack also ships an optional Node work engine (scripts/work-engine.cjs) so CI can enforce the same budgets, transitions, and artifact rules as /check without relying on chat alone.
Claude Code (recommended): add this repository as a plugin marketplace, install agentic-swe@agentic-swe-catalog, then open your target project in Claude Code. Pipeline commands, phases, and agents resolve from ${CLAUDE_PLUGIN_ROOT}/ (the plugin root). Per-work state lives under .worklogs/<id>/ in your project; /install can walk you through merging root CLAUDE.md and optional .gitignore for worklogs.
/plugin marketplace add agentic-swe/agentic-swe
/plugin install agentic-swe@agentic-swe-catalog
Local development of the pack: run Claude Code with claude --plugin-dir /path/to/this/repo from your target project (or enable the plugin from that directory).
Then start a task:
/work Add retry logic to the API client
See the installation guide and the Claude Code plugin for details, upgrades, and optional org knowledge files (AGENTS.md, docs/agentic-swe/).
- First success in ~15 minutes: follow the Golden path (install → /work → .worklogs/ → approval gate).
- Socialize / pitch: Who this is for and Host support tiers (OpenCode + Antigravity Tier B vs Claude Code reference).
- Tiny demo repo: examples/golden-path-demo/ (scratch target + DEMO_SCRIPT.md).
What this is: a markdown workflow pack that runs inside Claude Code on your repo (phases, gates, evidence). It is not a hosted async coding agent or cloud sandbox—that is a different class of product (e.g. remote harnesses with triggers and isolated runners).
More on the product and licensing:
Public site: GitHub Pages
| Topic | Docs |
|---|---|
| First run (~15 min) | Golden path |
| Who it is for (short matrix) | Adoption one-pager |
| Multi-IDE scope (Tier A–D) | Host support tiers |
| Who it is for and hero messaging | Product positioning |
| MIT and commercial strategy | Licensing |
| Distribution and hosting | Distribution |
| Troubleshooting | Troubleshooting |
| `/check` quick reference | Check commands |
| Catalog lint / router / CI | Catalog routing |
Marketing site (source): the Vite + React app lives in agentic-swe/agentic-swe-site (sibling repository). Long-form docs are src/content/docs/*.md there (rendered at /docs/* on Pages). Pushes to main in that repo run GitHub Actions (pages.yml) → https://agentic-swe.github.io/agentic-swe-site/.
The pipeline runs a state machine that routes tasks through analysis, design, implementation, review, and PR creation. At each phase, it automatically selects specialized subagents based on the languages, frameworks, and domains detected in your codebase — agents can also call other agents in the background when they need domain-specific expertise.
- Lean track (simple tasks): initialized -> feasibility -> lean-track-check -> lean-track-implementation -> validation -> pr-creation -> completed
- Standard track (medium), branching from lean-track-check: design -> verification -> test -> implementation -> self-review -> validation -> pr-creation -> ...
- Rigorous track (complex), branching from lean-track-check: design -> design-review -> verification -> test-strategy -> implementation -> self-review -> code-review -> permissions-check -> validation -> ...
Standard track skips the design panel, design-review, code-review, and permissions-check (see CLAUDE.md for exact allowed transitions and pipeline.track).
Human gates stop the pipeline at ambiguity-wait, approval-wait, and escalation states.
Example tasks and the routes they follow (see the state machine diagram above).
Lean track (small bug fix):
/work "Fix the off-by-one error in calculateTotal"
→ feasibility → lean-track-check (low risk) → lean-track-implementation → validation → pr-creation → approval-wait → completed
Rigorous track (new feature):
/work "Add user authentication with JWT tokens"
→ feasibility → lean-track-check (high risk) → design → design-review → verification → test-strategy → implementation → self-review → code-review → permissions-check → validation → pr-creation → approval-wait → completed
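
The two routes above can be read as paths through a transition table. The following is a minimal sketch under stated assumptions: it builds the table only from the two routes listed in this README, whereas the authoritative transitions live in CLAUDE.md and state-machine.json. The helper names `buildTransitions`, `isValidRoute`, and `humanGatesIn` are hypothetical and not part of the pack.

```javascript
// Illustrative sketch only: routes copied from the README examples above.
// The authoritative transition list lives in CLAUDE.md / state-machine.json.
const LEAN_ROUTE = [
  "feasibility", "lean-track-check", "lean-track-implementation",
  "validation", "pr-creation", "approval-wait", "completed",
];

const RIGOROUS_ROUTE = [
  "feasibility", "lean-track-check", "design", "design-review",
  "verification", "test-strategy", "implementation", "self-review",
  "code-review", "permissions-check", "validation", "pr-creation",
  "approval-wait", "completed",
];

// Human gates named in the README. Escalation states are omitted here
// because the README does not give their exact names.
const HUMAN_GATES = new Set(["ambiguity-wait", "approval-wait"]);

// Build a map of state -> set of allowed next states from known routes.
function buildTransitions(routes) {
  const table = new Map();
  for (const route of routes) {
    for (let i = 0; i < route.length - 1; i += 1) {
      if (!table.has(route[i])) table.set(route[i], new Set());
      table.get(route[i]).add(route[i + 1]);
    }
  }
  return table;
}

// Return true when every consecutive pair in `route` is an allowed transition.
function isValidRoute(table, route) {
  return route.every((state, i) => {
    if (i === route.length - 1) return true;
    const next = table.get(state);
    return next !== undefined && next.has(route[i + 1]);
  });
}

// States in a route where a human must act before the pipeline continues.
function humanGatesIn(route) {
  return route.filter((state) => HUMAN_GATES.has(state));
}

const table = buildTransitions([LEAN_ROUTE, RIGOROUS_ROUTE]);
console.log(isValidRoute(table, LEAN_ROUTE));                     // true
console.log(isValidRoute(table, ["feasibility", "pr-creation"])); // false
console.log(humanGatesIn(RIGOROUS_ROUTE));                        // [ 'approval-wait' ]
```

Deriving the table from known-good routes keeps the sketch honest about what this README actually states; a real check would load the transitions from state-machine.json instead.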
| Command | What it does |
|---|---|
| `/work <task>` | Start a new task (auto-routes lean, standard, or rigorous track) |
| `/work <id>` | Resume paused work |
| `/plan-only <task>` | Analyze and design without implementing |
| `/brainstorm` | Design-first exploration (design phase + optional visual server) |
| `/write-plan [id]` | Refine implementation.md plan to plan-quality bar (no coding) |
| `/execute-plan [id]` | Run the plan via implementation / lean-track-implementation |
| `/author-pipeline` | Checklist to extend phases, commands, agents safely |
| `/subagent` | Browse 135+ specialized subagents |
| `/subagent search <query>` | Find subagents by keyword |
| `/subagent invoke <name> <task>` | Spawn a specialist for a task |
| `/evaluate-work <id>` | Check work item health and status |
| `/repo-scan` | Structured codebase snapshot |
| `/check budget` | Verify iteration budgets |
See the usage reference for the full commands list.
135+ agents across 10 categories. The pipeline selects them automatically during execution based on detected languages, frameworks, and domain signals, so no manual invocation is needed. Agents can also call other agents to get domain-specific work done.
| Category | Count | Examples |
|---|---|---|
| Core Development | 10 | backend-developer, fullstack-developer, api-designer |
| Language Specialists | 29 | python-pro, typescript-pro, rust-engineer, golang-pro |
| Infrastructure | 16 | kubernetes-specialist, terraform-engineer, docker-expert |
| Quality & Security | 14 | code-reviewer, security-auditor, penetration-tester |
| Data & AI | 13 | llm-architect, ml-engineer, data-engineer |
| Developer Experience | 13 | refactoring-specialist, mcp-developer, cli-developer |
| Specialized Domains | 12 | fintech-engineer, blockchain-developer, iot-engineer |
| Business & Product | 11 | product-manager, technical-writer, ux-researcher |
| Meta & Orchestration | 10 | multi-agent-coordinator, workflow-orchestrator |
| Research & Analysis | 7 | competitive-analyst, trend-analyst, research-analyst |
See the subagent catalog for the full catalog with models and descriptions.
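
As a rough illustration of signal-based selection, the sketch below maps file extensions to language specialists named in the table above. The extension mapping and the `selectAgents` helper are assumptions made for this example; the pack's actual routing is defined by its markdown policies and catalog, not by this code.

```javascript
const path = require("path");

// Hypothetical mapping from file extension to a language specialist.
// Agent names come from the table above; the mapping itself is an assumption.
const AGENT_BY_EXTENSION = {
  ".py": "python-pro",
  ".ts": "typescript-pro",
  ".tsx": "typescript-pro",
  ".rs": "rust-engineer",
  ".go": "golang-pro",
};

// Return the de-duplicated, sorted list of specialists suggested by the files.
function selectAgents(filePaths) {
  const selected = new Set();
  for (const filePath of filePaths) {
    const agent = AGENT_BY_EXTENSION[path.extname(filePath).toLowerCase()];
    if (agent) selected.add(agent);
  }
  return [...selected].sort();
}

console.log(selectAgents(["src/api/list.py", "src/parser/mod.rs", "README.md"]));
// [ 'python-pro', 'rust-engineer' ]
```

File extensions are only one possible signal; framework and domain detection would need richer inputs such as manifest files, which this sketch does not attempt.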
Simple bug fix (lean track, ~3-5 min):
/work Fix the off-by-one error in pagination logic in src/api/list.py
Complex feature (rigorous track with design review, ~10-30 min):
/work Add rate limiting middleware to the Express API with Redis backing
Invoke a specialist subagent:
/subagent invoke rust-engineer Fix the lifetime issues in src/parser/mod.rs
Parallel security audit:
Spawn security-auditor AND penetration-tester subagents in parallel
to audit the payment processing module in src/payments/
Plan without coding:
/plan-only Migrate the monolithic API to microservices with gRPC
Workflow shortcuts (same pipeline, familiar command names):
/brainstorm Design the event-sourcing layer for order history
/write-plan
/execute-plan
See examples for detailed walkthroughs.
Hypervisor (Claude Code + CLAUDE.md policy)
├── Core Pipeline Agents
│   ├── developer-agent.md       -- Implementation specialist
│   ├── git-operations-agent.md  -- Branch management, remote sync
│   ├── pr-manager-agent.md      -- PR creation and management
│   └── panel/                   -- Design review panel (parallel)
│       ├── architect-reviewer.md
│       ├── security-reviewer.md
│       └── adversarial-reviewer.md
│
└── Specialized Subagents (135+ agents, 10 categories)
    ├── core-development/
    ├── language-specialists/
    ├── infrastructure/
    ├── ...
    └── research-analysis/
- Add a subagent: Create a `.md` file in `${CLAUDE_PLUGIN_ROOT}/agents/subagents/<category>/` with frontmatter (`name`, `description`, `tools`, `model`)
- Add a phase: Create `.md` in `${CLAUDE_PLUGIN_ROOT}/phases/`, add the state to `CLAUDE.md` (diagram, Required Artifacts, transitions), and update `${CLAUDE_PLUGIN_ROOT}/state-machine.json` so it matches the fenced transition block (`npm test` includes `state-machine-json`).
- Add a core agent: Create `.md` in `${CLAUDE_PLUGIN_ROOT}/agents/`, reference in `CLAUDE.md`
- Adjust budgets: Edit `CLAUDE.md` Budgets section and `${CLAUDE_PLUGIN_ROOT}/templates/state.json`
- Inspect work folders: From the pack/repo root, `npm run summarize-work` (or `node scripts/summarize-work.js --json`)
- Local dashboard (filters, export, metrics): `npm run swe-dashboard -- --cwd /path/to/your/repo` or `/swe-dashboard` in Claude Code (`commands/swe-dashboard.md`). Sample rows: `npm run seed-dashboard-demo` (then refresh the dashboard).
- Migrate old work state: `node scripts/migrate-work-state.js` then `node scripts/migrate-work-state.js --apply` after major upgrades (see `CHANGELOG.md`)
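
The first item in the list above says a new subagent file needs frontmatter with `name`, `description`, `tools`, and `model`. The sketch below checks a file's text for those four keys. It assumes YAML-style frontmatter delimited by `---` lines with simple `key: value` pairs; the helper `missingFrontmatterKeys` and the sample agent are hypothetical, and the pack's own lint and tests remain the authoritative check.

```javascript
// Keys the README lists as required in subagent frontmatter.
const REQUIRED_KEYS = ["name", "description", "tools", "model"];

// Assumes frontmatter is a leading block between two `---` lines containing
// simple `key: value` pairs on one line each. This is not a full YAML parser.
function missingFrontmatterKeys(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---(\r?\n|$)/.exec(markdown);
  if (!match) return [...REQUIRED_KEYS];
  const present = new Set();
  for (const line of match[1].split(/\r?\n/)) {
    const pair = /^([A-Za-z_][\w-]*):\s*(.*)$/.exec(line);
    // Count a key as present only when it has a non-empty inline value.
    if (pair && pair[2].trim() !== "") present.add(pair[1]);
  }
  return REQUIRED_KEYS.filter((key) => !present.has(key));
}

// Hypothetical agent definition used only to exercise the check.
const sample = [
  "---",
  "name: example-specialist",
  "description: Illustrative agent for this sketch",
  "tools: Read, Grep",
  "model: sonnet",
  "---",
  "Body of the agent prompt goes here.",
].join("\n");

console.log(missingFrontmatterKeys(sample));                  // []
console.log(missingFrontmatterKeys("---\nname: x\n---\n"));   // [ 'description', 'tools', 'model' ]
```

A check like this could run before `npm test` to catch an incomplete agent file early, though it would report multi-line YAML values as missing.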
agentic-swe runs the same markdown pipeline — driven by the Hypervisor session per CLAUDE.md — across multiple AI coding platforms:
| Platform | Install Method | Details |
|---|---|---|
| Claude Code | Plugin marketplace + `/plugin install` (or `claude --plugin-dir` for dev) | Primary platform. See Claude Code plugin. |
| Cursor | Plugin via `.cursor-plugin/` | `curl -fsSL https://raw.githubusercontent.com/agentic-swe/agentic-swe/main/scripts/install-cursor-plugin.sh \| bash` then reload; add `AGENTIC_SWE_TARGET_REPO=/path/to/app` on the same line to auto-merge `CLAUDE.md` (needs `node`). See Cursor plugin. |
| Codex | Clone + symlink or copy | See `.codex/INSTALL.md` and the Codex doc in agentic-swe-site. |
| OpenCode | Plugin via `.opencode/` | ESM plugin injects orchestration policy. See the OpenCode doc in agentic-swe-site. |
| Gemini CLI | Extension via `gemini-extension.json` | Context loaded from `GEMINI.md`. |
All platforms share the same markdown source at this repo’s plugin root (commands/, phases/, agents/, …). Platform-specific tool mappings are in ${CLAUDE_PLUGIN_ROOT}/references/ (codex-tools.md, opencode-tools.md, gemini-tools.md, copilot-tools.md).
Skill-like triggering: agentic-swe does not use a separate Skill-tool registry. The same behavior comes from session hooks (hooks/hooks.json for Claude Code, hooks/hooks-cursor.json for Cursor) that run hooks/session-start (memory prime is appended by default; opt out with AGENTIC_SWE_MEMORY_PRIME=0), plus ${CLAUDE_PLUGIN_ROOT}/references/implicit-routing.md for intent → command/phase hints. The pipeline remains authoritative in root CLAUDE.md. Durable memory (local index, memory-prime, import, sliding summary, optional embeddings): docs site · spec.
.github/workflows/ci.yml in this repo runs on push / pull request to main, on merge queue, and manually (workflow_dispatch). It uses Node 20 and 22, npm ci for the root pack and agents/plugin-runtime/brainstorm-server, then npm run verify, npm run version:check, optional claude plugin validate, and npm test (state machine, references, multi-platform wiring, brainstorm-server, etc.). The docs site has its own CI in agentic-swe-site.
Locally, run npm run ci at this repo root for the same bar as Actions here (minus the Node matrix, and minus claude plugin validate unless claude is on your PATH). See the Release checklist for the full maintainer sequence (includes the separate site repo checks).
Built on research from SWE-agent, Agentless, Ambig-SWE, Reflexion, Self-Refine, AgentCoder, TALE, OpenHands, and more. See the Research Basis section in CLAUDE.md for the full citation table.
MIT. See licensing for how the license applies to the pack and typical use (not legal advice).