orchestrated-codex


Architecture

Quick start

Install deps:

bun install

Build the launcher and run the full squad from anywhere:

bun run build:all

This creates a ./dist folder containing the compiled binaries and the required native bindings:

dist
├── codex_orchestrator
├── launch_codex_orchestrator_session
└── native
    └── turso.darwin-arm64.node

Then, run the orchestrator from a new or existing project folder:

../bin/launch_codex_orchestrator_session --prompt "Ship the MVP"

You can also run the TypeScript entry point during development:

bun run src/launch_codex_orchestrator_session.ts --prompt "Ship the MVP"

The launcher generates the Zellij layout, starts the orchestrator in the background, and wires up every log pane automatically, so nothing needs to be typed inside the session.

Workflow

Each core agent runs in a two-turn sequence: a plan turn first (internal outline/approach), followed by a follow-up turn that produces the final, user-facing response. The follow-up turn is still on the same thread, so it can reference the plan while enforcing formatting or concise-mode rules.
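A minimal sketch of that two-turn loop (the `runAgent`/`sendTurn` names are hypothetical; the real implementation lives in src/index.ts):

```typescript
// Hypothetical sketch of the two-turn sequence; `sendTurn` stands in for
// whatever Codex client call the orchestrator actually makes.
type Turn = { role: "plan" | "follow-up"; prompt: string; response: string };

async function runAgent(
  sendTurn: (threadId: string, prompt: string) => Promise<string>,
  threadId: string,
  planPrompt: string,
  followUpPrompt: string,
): Promise<Turn[]> {
  // Turn 1: internal outline/approach.
  const plan = await sendTurn(threadId, planPrompt);
  // Turn 2: same thread, so the model can reference its own plan while
  // enforcing formatting or concise-mode rules.
  const final = await sendTurn(threadId, followUpPrompt);
  return [
    { role: "plan", prompt: planPrompt, response: plan },
    { role: "follow-up", prompt: followUpPrompt, response: final },
  ];
}
```

Because both turns share one thread, the follow-up prompt never has to restate the plan.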

The researcher reads the planner output, gathers supporting context, and returns a structured research summary. The builder then uses the planner + researcher outputs to emit an implementation blueprint (JSON), which the orchestrator parses into dynamic agents. Those dynamic agents can execute sequentially (when dependsOn creates a chain) or in parallel (when their dependencies are independent). Use --parallel or set group-level parallel = True in configs/agents.lark to encourage the builder to emit four independent specialists for concurrent execution.
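The sequential-vs-parallel scheduling rule can be sketched as follows; the blueprint shape is simplified to the two fields that matter for scheduling (`id`, `dependsOn`):

```typescript
// Hypothetical, simplified blueprint entry: the real JSON the Builder emits
// may carry more fields; only `id` and `dependsOn` matter for scheduling here.
interface BlueprintAgent {
  id: string;
  dependsOn: string[];
}

// Partition agents into waves: every agent whose dependencies are already
// satisfied runs in the same (parallel) wave; chained dependsOn entries
// force sequential waves.
function planWaves(agents: BlueprintAgent[]): string[][] {
  const done = new Set<string>();
  const pending = [...agents];
  const waves: string[][] = [];
  while (pending.length > 0) {
    const ready = pending.filter((a) => a.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("cyclic or unsatisfiable dependsOn");
    waves.push(ready.map((a) => a.id));
    for (const a of ready) {
      done.add(a.id);
      pending.splice(pending.indexOf(a), 1);
    }
  }
  return waves;
}
```

Independent specialists collapse into a single wave; a dependsOn chain produces one wave per agent.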

Agent configuration

Agentlark examples

Example usage:

./dist/launch_codex_orchestrator_session --agents-lark configs/agents.smoke.lark

Core agent definitions now live in configs/agents.lark. The Rust agentlark tool compiles that file into configs/agents.out.json, which is what the orchestrator reads. Each entry defines the agent id, role, plan prompt, follow-up prompt, optional follow-up schema (as JSON-ish Starlark), optional follow-up schema rules, and whether the follow-up turn should run. Set follow_up_enabled = False to skip the second turn for a given agent.

To override the config path, set ORCHESTRATED_CODEX_AGENTS_CONFIG=/path/to/agents.out.json or pass --agents-config /path/to/agents.out.json. If the file is missing or invalid, the orchestrator falls back to the built-in defaults.

Orchestration configuration

configs/agents.lark can also include per-group orchestration settings to define autonomous/parallel/relay behavior and explicit relay handoffs. These settings override the equivalent CLI flags (--autonomous, --relay, --parallel).

groups = [
  group(
    id = "core",
    orchestration = define_orchestration(
      autonomous = True,
      parallel = False,
      max_waves = 3,
      relay = define_relay(
        enabled = True,
        require_handoff_note = True,
        graph = {
          "planner": ["researcher"],
          "researcher": ["builder"],
          "builder": [],
        },
      ),
    ),
    agents = [
      agent(id = "planner", role = "Planner"),
      agent(id = "researcher", role = "Researcher", depends_on = ["planner"]),
      agent(id = "builder", role = "Builder", depends_on = ["planner", "researcher"]),
    ],
  ),
  group(
    id = "delivery",
    orchestration = define_orchestration(
      autonomous = False,
      parallel = True,
      max_waves = 1,
    ),
    agents = [
      agent(id = "qa", role = "QA", depends_on = ["builder"]),
      agent(id = "docs", role = "Docs", depends_on = ["builder"]),
    ],
  ),
]
  • max_waves limits how many autonomous waves the orchestrator will generate for that group.
  • relay.graph is an adjacency list of core agent ids that the relay orchestrator should honor when handing off work.
  • You can define multiple groups; the orchestrator will run each group sequentially in the order they appear in configs/agents.out.json.
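As an illustration, walking the relay.graph adjacency list from a starting agent yields the handoff order (the helper is hypothetical, not the relay orchestrator's actual traversal):

```typescript
// relay.graph maps each core agent id to the agents it hands off to.
type RelayGraph = Record<string, string[]>;

// Breadth-first walk from a starting agent, yielding the handoff order.
function handoffOrder(graph: RelayGraph, start: string): string[] {
  const order: string[] = [];
  const seen = new Set<string>();
  const queue = [start];
  while (queue.length > 0) {
    const id = queue.shift()!;
    if (seen.has(id)) continue;
    seen.add(id);
    order.push(id);
    queue.push(...(graph[id] ?? []));
  }
  return order;
}
```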

Common flags

Need the builders to keep collaborating without you?

Add --autonomous to let a dedicated orchestrator review the most recent builder wave, emit fresh JSON blueprints, and launch additional builder-only agents automatically. That mode creates logs/autonomous-orchestrator.log, mirrors it into the layout, and keeps looping until the orchestrator runs out of work. You can also set this in configs/agents.lark (see below).

Want the agents to work in parallel?

Use --parallel when you want the Builder to emit four independent specialists that can run concurrently; the first dynamic wave is then expected to keep all four slots busy without dependency chains. You can also set this in configs/agents.lark under your group orchestration.

Want even tighter back-and-forth handoffs?

Pass --relay instead; it implies --autonomous but tells the orchestrator to treat every new wave as a relay baton, explicitly describing the message the outgoing agent is handing off and who receives it next. Relay mode keeps generating builder waves (up to five in a cascade) and tags everything with relay so you can trace the ongoing dialogue in the memory/ store. You can also set this in configs/agents.lark under your group orchestration.

Trying to preserve your Codex quota?

Pass --mini to force every agent turn onto codex-mini-latest, which trims model usage while still running the full workflow.

Need visibility into usage?

Append --metrics to print per-agent input/output token totals (including dynamic/autonomous waves) after the run finishes. Pair it with --debug-prompt to log the exact character/byte/line counts of every plan and follow-up prompt before Codex sees them.

Want to use a different AI backend?

Pass --backend minimax to switch to MiniMax M2.1 over the Anthropic-compatible API. You can also set ORCHESTRATED_CODEX_BACKEND=minimax. Provide credentials via ORCHESTRATED_CODEX_MINIMAX_API_KEY (or ANTHROPIC_API_KEY) and optionally override the base URL with ORCHESTRATED_CODEX_MINIMAX_BASE_URL (defaults to https://api.minimax.io/anthropic). The default model is MiniMax-M2.1; override it with --minimax-model or ORCHESTRATED_CODEX_MINIMAX_MODEL.
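The documented precedence can be sketched as follows (the function shape is hypothetical; the flag and env var names come from above):

```typescript
interface MinimaxSettings {
  apiKey?: string;
  baseUrl: string;
  model: string;
}

// Resolve MiniMax backend settings: CLI flag wins over env var,
// env var wins over the documented defaults.
function resolveMinimax(
  env: Record<string, string | undefined>,
  modelFlag?: string,
): MinimaxSettings {
  return {
    apiKey: env.ORCHESTRATED_CODEX_MINIMAX_API_KEY ?? env.ANTHROPIC_API_KEY,
    baseUrl:
      env.ORCHESTRATED_CODEX_MINIMAX_BASE_URL ?? "https://api.minimax.io/anthropic",
    model: modelFlag ?? env.ORCHESTRATED_CODEX_MINIMAX_MODEL ?? "MiniMax-M2.1",
  };
}
```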

Want to sanity-check your agent configuration?

Add --concise when you just want to sanity-check the core agents without burning through daily usage. That flag tells single-turn agents to answer with exactly one sentence and skips spawning follow-on specialists. Use --concise-full if you need dynamic waves to run as well.

Want to run only specific orchestration groups?

Pass --groups core,delivery to limit execution to a subset of group ids (default is all groups). Example: bun run src/index.ts --groups core,delivery

About launching

When launching via bin/launch_codex_orchestrator_session (or bun run src/launch_codex_orchestrator_session.ts), pass the same flags (--concise, --concise-full, --metrics, --debug-prompt, etc.). The launcher injects them into the orchestrator command before Zellij boots, so each pane only needs to tail logs. Portable runs default their memory artifacts and AgentFS logs to the directory where you invoked the binary (override with --memory-dir, --agentfs-path, or the CODEX_MEMORY_DIR/AGENTFS_PATH env vars) so rerunning the launcher from another repo never collides with the original development database.

AgentFS dependency

AgentFS is required to store and inspect the audit/observability trail. To install it:

  1. gh repo clone https://github.com/tursodatabase/agentfs
  2. cd ./path/to/agentfs/cli
  3. CARGO_NET_GIT_FETCH_WITH_CLI=true cargo b
  4. cp -i target/debug/agentfs ~/.local/bin/

Launching the Codex squad layout

The zellij.kdl layout the launcher emits defines a core-logs tab for the first four configured agents (tailing their log files). The spawned-agents tab always includes the timeline pane; if there are more than four configured agents, the next four log panes appear beneath it. Additional agents are grouped in tabs named agent-logs-<n> with up to four panes each. When --autonomous is enabled, an autonomous-orchestrator pane tails its log alongside the agent panes in whatever tab its position lands in. spawnCodexSquadSession() simply ensures those log files exist so the tail -f commands have something to follow. The orchestrator itself runs headless in the background; the timeline pane serves as the overview.

index.ts manages the first three Codex agents sequentially and writes their transcripts to logs/<agent>.log. After the Builder finishes, it emits a JSON blueprint describing up to four new specialists. The orchestrator parses that blueprint and executes the specialists (respecting the dependsOn graph), writing their output to their own log files.

When every dynamic specialist forms a straight dependency chain (agent 2 depends only on agent 1, agent 3 waits on agent 2, and so on), work would otherwise stall behind a single serial pipeline. The orchestrator detects that pattern and asks Codex to synthesize up to four continuation implementers whose prompts explicitly reference the planner/researcher/builder logs plus the logs of the dynamic specialists they are continuing. Those continuation agents log to their own files, so work keeps moving even when the original wave only contained planners or researchers.
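One possible version of that straight-chain check (illustrative only, not the orchestrator's actual detector):

```typescript
// Simplified specialist entry: only `id` and `dependsOn` matter here.
interface Spec {
  id: string;
  dependsOn: string[];
}

// True when every specialist after the first depends on exactly the one
// before it — the "straight dependency chain" pattern described above.
function isStraightChain(agents: Spec[]): boolean {
  if (agents.length < 2) return false;
  for (let i = 1; i < agents.length; i++) {
    const deps = agents[i].dependsOn;
    if (deps.length !== 1 || deps[0] !== agents[i - 1].id) return false;
  }
  return true;
}
```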

Pass --stream to bin/launch_codex_orchestrator_session (or when invoking the TypeScript launcher) to watch Codex events arrive in real time for every agent turn. If you skip --stream, the orchestrator shows a lightweight spinner during each plan/follow-up turn so you still get progress feedback without the extra event chatter.

Autonomous mode layers an additional orchestrator loop on top of that flow. After the latest builder wave completes, the orchestrator agent reads every relevant log (including logs/autonomous-orchestrator.log), emits a fresh JSON blueprint, and spawns another set of builder-only agents so they can hand work back and forth without waiting on a human prompt. Each wave is persisted to memory with autonomous tags so you can audit how the squad self-directed the project later.

Persistent project memory

  • Every agent turn now writes its plan/follow-up pair to the AgentFS database at memory/artifacts.agentfs.db (stored under /memory/artifacts/<timestamp>__slug.json). You still get the same machine-readable payloads, but now they live inside a portable SQLite-backed filesystem alongside the rest of the agent audit trail.
  • Inspect those artifacts with bun run memory:inspect [--agent builder --type follow-up --limit 5]. Add --content when you need the full text or --json to feed another tool.
  • Run the “sleeptime compute” condensation pass with bun run memory:condense [--days 2 --min-group-size 3] to merge older artifacts into lightweight summaries while keeping pointers (relatedArtifacts) back to the detailed records.
  • Dynamic agents and builder-wave specialists automatically tag themselves (dynamic-blueprint, builder-implementer, etc.) so you can slice the memory store per wave or responsibility when inspecting.
  • Both memory:inspect and memory:condense accept --project-root /path/to/orchestrated-codex (or whichever project directory is holding memory/ artifacts), so you can invoke them from any folder — including via the compiled bin/memory_inspector produced by bun run build:memory-inspector.

AgentFS action log

  • Every orchestrator run now mirrors agent turns and streamed actions into an AgentFS database (default .agentfs/agent-db.db). Each session gets a unique slug under /runs/<session>/ so you can diff or mount prior executions for complete reproducibility.
  • Plan and follow-up prompts/responses are written as JSON files per agent turn, and streaming command/file/tool events are captured under /runs/<session>/events/. Use the AgentFS CLI (agentfs fs ls orchestrated-codex) or SDK to inspect the audit trail.
  • Prefer a quick TypeScript reader? Run bun run scripts/agentfs_reader.ts [--path ./custom.db --session run-20241220 --agent builder --events --limit 2] (or bun run agentfs:read …) to dump the most recent agent turns and optional event traces using the AgentFS SDK directly. By default it points at .agentfs/agent-db.db, matching the launcher’s auto-created file. Omit --events to see only the planner/researcher/builder turn summaries under /runs/<session>/agents/**; add --events to include every streamed command_execution, file_change, or MCP entry from /runs/<session>/events/**.
  • Configure the target database or session name with --agentfs-id <id>, --agentfs-path /path/to/db, and --agentfs-session <slug> (env vars AGENTFS_ID, AGENTFS_PATH, AGENTFS_SESSION provide the same overrides). Use --no-agentfs if you need to disable logging for a run.

One-command launcher

src/launch_codex_orchestrator_session.ts bundles the entire workflow into a single executable script. It accepts CLI flags (e.g., --agents planner,researcher,builder, --session nightly, --prompt "Custom plan", --autonomous, --relay, --mini, --stream) to generate a fresh zellij.kdl in the current directory using convertJsonToKdl() and immediately runs zellij --new-session-with-layout zellij.kdl [--session …]. The launcher now detects (or you can override via --project-root /path/to/orchestrated-codex) the repository root so you can invoke it from any directory without the orchestrator tripping over relative paths. It also mirrors all agent logs into the directory where you run the executable so pane tails always have local files to follow.

To build a self-contained executable you can copy into your $PATH, run:

bun run build:launcher

which emits bin/launch_codex_orchestrator_session. That binary embeds the same CLI behavior, so you can run bin/launch_codex_orchestrator_session --prompt "New plan" from any folder and it will still spawn the Codex orchestration from the detected project root.

Bundled mode needs a small patch for the Turso native bindings: Bun compiles the orchestrator into a virtual filesystem (/$bunfs/root/...), so @tursodatabase/database can no longer discover its .node binding via relative paths or optional dependency resolution. The patch makes the loader respect NAPI_RS_NATIVE_LIBRARY_PATH so the launcher can point to a real .node file on disk when running the bundled binary.
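For example, a launcher could point the patched loader at the on-disk binding before the bundled code loads @tursodatabase/database (the helper below is illustrative; the binding filename is platform-specific, e.g. turso.darwin-arm64.node on Apple Silicon):

```typescript
import { join } from "node:path";

// Point the patched napi-rs loader at the real .node file that
// `bun run build:all` placed under dist/native/, so the bundled binary
// (running from Bun's virtual /$bunfs filesystem) can still load it.
function setNativeLibraryPath(distDir: string, bindingFile: string): string {
  const full = join(distDir, "native", bindingFile);
  process.env.NAPI_RS_NATIVE_LIBRARY_PATH = full;
  return full;
}
```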

Start the environment from a real terminal using:

zellij --new-session-with-layout zellij.kdl

Zellij opens with the bootstrap pane, the TypeScript script spawns the squad panes programmatically, and each agent pane then streams its Codex conversation.

Need to bail out quickly? Zellij already ships a single-chord quit shortcut: press Ctrl+Q once (no mode prefixes required) and the entire session exits immediately. This binding comes from the default config (shared_except "locked" { bind "Ctrl q" { Quit; } }), so you get it automatically without editing any config files.

Need to pause the squad without killing the terminal? Press Ctrl+S anywhere inside the Zellij session. The layout injects a custom keybind that jumps to the core-logs tab, sends the raw Ctrl+S byte, and then leaves you there so you can watch the interrupt summary print. The orchestrator finishes the agent that is already mid-turn, immediately logs a summary in logs/interrupt.log, saves the same text as an interrupt artifact inside the AgentFS store (memory/artifacts.agentfs.db), and stops launching any dynamic/builder/autonomous waves. After hitting the shortcut you can hop back to your previous tab with Tab/GoToTab as usual.

Programmatic Zellij automation

The zellij/ folder exposes a typed API that mirrors the capabilities of the old zellij-mcp-server tools without requiring the MCP transport. Import ZellijClient and work with sessions, panes, pipes, plugins, or layouts directly:

import { ZellijClient } from "zellij-toolkit";

const zellij = new ZellijClient();

await zellij.sessions.create("demo");
await zellij.panes.create({ direction: "right", command: "htop" });
await zellij.pipes.pipe("hello from codex", { name: "broadcast" });
await zellij.plugins.launch({ url: "zellij:plugin:tab-bar" });

Each manager validates inputs, wraps the zellij CLI, and returns typed results instead of ToolResponse payloads so you can embed the functionality in regular scripts or services.
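For instance, a manager might translate the typed pane request shown above into zellij CLI arguments like this (the argv builder is an assumption about the wrapper, not its real code):

```typescript
// Mirrors the `panes.create` options from the example above.
interface PaneOptions {
  direction?: "down" | "right";
  command?: string;
}

// Build the argv for `zellij action new-pane`, e.g.
// { direction: "right", command: "htop" }
//   → ["action", "new-pane", "--direction", "right", "--", "htop"]
function newPaneArgs(opts: PaneOptions): string[] {
  const args = ["action", "new-pane"];
  if (opts.direction) args.push("--direction", opts.direction);
  if (opts.command) args.push("--", opts.command);
  return args;
}
```

A manager would then hand this argv to something like Bun.spawn(["zellij", ...newPaneArgs(opts)]) and wrap the result in a typed response.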


This project was created using bun init in bun v1.3.3. Bun is a fast all-in-one JavaScript runtime.

Copyright 2024 Aaron Lifton
