Docker-based testing environment for Spiderweb. This creates a clean, disposable Debian container for testing the install script and Spiderweb functionality without affecting your main server.
# Build and start the test environment
docker-compose up --build
# Or run in detached mode
docker-compose up -d --build
# Enter the running container
docker exec -it spiderweb-test bash
# Inside the container, run the install script
./install.sh
# Build the image
docker build -t spiderweb-test .
# Run interactively
docker run -it --rm --name spiderweb-test spiderweb-test
# Run with API key from environment (for automated testing)
docker run -it --rm \
-e SPIDERWEB_PROVIDER=openai \
-e SPIDERWEB_MODEL=gpt-4o-mini \
-e SPIDERWEB_API_KEY=sk-xxx \
spiderweb-test
# Run with port forwarding (to test from host)
docker run -it --rm \
-p 18790:18790 \
--name spiderweb-test \
spiderweb-test
# Test the full interactive install
curl -fsSL https://raw.githubusercontent.com/DeanoC/Spiderweb/main/install.sh | bash
# Or drive the default non-interactive path explicitly.
# On Linux x86_64, auto now prefers the latest published GitHub release.
SPIDERWEB_NON_INTERACTIVE=1 \
SPIDERWEB_INSTALL_ZSS=0 \
SPIDERWEB_INSTALL_SYSTEMD=0 \
SPIDERWEB_START_AFTER_INSTALL=0 \
bash ./install.sh
# Force a local source build instead of the default release install
SPIDERWEB_NON_INTERACTIVE=1 \
SPIDERWEB_INSTALL_SOURCE=source \
SPIDERWEB_INSTALL_ZSS=0 \
SPIDERWEB_INSTALL_SYSTEMD=0 \
SPIDERWEB_START_AFTER_INSTALL=0 \
bash ./install.sh
# Release-binary path
SPIDERWEB_NON_INTERACTIVE=1 \
SPIDERWEB_INSTALL_SOURCE=release \
SPIDERWEB_RELEASE_ARCHIVE_URL=https://github.com/DeanoC/Spiderweb/releases/download/vX.Y.Z/spiderweb-linux-x86_64.tar.gz \
SPIDERWEB_RELEASE_ARCHIVE_SHA256=<sha256> \
SPIDERWEB_INSTALL_ZSS=0 \
SPIDERWEB_INSTALL_SYSTEMD=0 \
SPIDERWEB_START_AFTER_INSTALL=0 \
bash ./install.sh
This harness documents and exercises the Linux-first external Codex operator path:
- installer-first host flow (./install.sh on the Spiderweb host)
- generic dev-template workspace baseline that can outlive any one agent session
- isolated Spiderweb runtime root plus a clean standalone local workspace node
- standalone spiderweb-fs-node as the remote filesystem node under test
- namespace mount via spiderweb-fs-mount --namespace-url ...
- plain Codex launch in live or manual-handoff mode
- agent-driven in-workspace bootstrap, validation, and report artifact capture
The harness assumes one Spiderweb-owned mount model across macOS, Linux, and
Windows: workers start in the mounted workspace directory, and .spiderweb is a
server-projected part of that same namespace rather than a client overlay or
direct endpoint shortcut.
Mounted namespace paths used by the harness:
- local writable project tree: /nodes/local/fs
- remote shared seed data: /shared_data
- workspace metadata: /projects/<workspace_id>/meta/*
- namespace metadata: /meta/*
- generic workspace services: /services/*
Run the Linux harness directly from the repo root:
bash test-env/test-external-codex-workspace.sh
For a faster smoke iteration that still exercises bootstrap, mount, external Codex launch, and mounted writes, use the light scenario:
bash test-env/test-external-codex-workspace-light.sh
On macOS with OrbStack installed, use the Orb wrapper so the exact same Linux harness runs inside Orb:
bash test-env/test-external-codex-workspace-orb.sh
Light Orb smoke variant:
bash test-env/test-external-codex-workspace-orb-light.sh
Use ORB_MACHINE=<name> or ORB_USER=<user> if you need a non-default Orb target.
For the native macOS mount path, use the dedicated native harness after the
current Spiderweb app build has been installed with
spiderweb-config config install-fs-extension:
bash test-env/test-external-codex-workspace-macos.sh
Light native macOS smoke variant:
bash test-env/test-external-codex-workspace-macos-light.sh
Or through make:
cd test-env && make test-external-codex-workspace
cd test-env && make test-external-codex-workspace-light
Repeatability runner:
cd test-env && make test-external-codex-repeatability
Compatibility matrix runner:
cd test-env && make test-external-codex-cli-matrix
Repro bundle packager:
cd test-env && make package-external-codex-repro
Codex launch controls:
- CODEX_MODE=auto: try a live Codex launch, then fall back to the dedicated handoff package if the launcher is unavailable or the live step cannot proceed
- CODEX_MODE=live: require a real Codex launch; launch failure fails the harness
- CODEX_MODE=manual: skip live launch and prepare the manual handoff package only
- CODEX_BIN: override the detected Codex binary
- CODEX_CLI_VERSION: pinned plain Codex CLI version the harness expects. Default: 0.111.0
- CODEX_AUTH_MODE=auto|api_key|existing_login: choose isolated API-key auth or an existing login. auto prefers API-key auth when OPENAI_API_KEY is set
- CODEX_API_KEY_ENV: environment variable name to read for api_key mode. Default: OPENAI_API_KEY
- CODEX_LAUNCH_CMD: override the detected launcher when the default codex exec template is not correct for the machine
- CODEX_TIMEOUT_SECONDS: maximum seconds to allow the live Codex phase before the harness fails with a diagnostic handoff/report. Default: 900
- CODEX_IDLE_TIMEOUT_SECONDS: optional idle cutoff for the live Codex phase. Default: 0 (disabled), because codex exec --json can spend long periods silently reasoning before the next visible tool or file event
- CODEX_JSON_EVENTS=1: inject --json into common codex exec launch templates and preserve the raw Codex event stream in logs/codex.stdout.log
- CODEX_USE_PTY=1: wrap the live Codex launch in script(1) so the run behaves like a real terminal session and preserves logs/codex.pty.log
- CODEX_DISABLE_COLLABORATION_MODES=1: inject --disable collaboration_modes into common codex exec templates unless disabled
- CODEX_DISABLE_APPS=1: inject --disable apps by default because the current live Spiderweb path is more reliable without the apps surface in non-interactive exec
- CODEX_DISABLE_SHELL_SNAPSHOT=1: inject --disable shell_snapshot by default because the current live Spiderweb path is more reliable without shell snapshotting in non-interactive exec
- CODEX_ALLOW_HOST_CODEX_HOME=1: temporarily allow writes under host ~/.codex for reliability while still reporting them as a codex_home machine-independence gap
- SPIDERWEB_INSTALL_SOURCE=auto|source|release: choose whether the harness compiles Spiderweb locally or installs from a prebuilt archive. Default: auto, which delegates to install.sh defaults and retries with source if the selected release path cannot provide the current harness binary set
- SPIDERWEB_RELEASE_ARCHIVE_URL: release asset URL to use when SPIDERWEB_INSTALL_SOURCE=release. Default: unset
- SPIDERWEB_RELEASE_ARCHIVE_SHA256: optional checksum for the release archive
- SPIDERWEB_RELEASE_VERSION: label recorded in installer output for the chosen release build. Default: unset
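The CODEX_AUTH_MODE rule in the list above (auto prefers API-key auth when the key variable is set) can be sketched roughly as follows. This is an illustration, not the harness's actual code: the function name resolve_auth_mode is hypothetical, and treating an unset CODEX_AUTH_MODE as auto is an assumption. Only the variable names and the OPENAI_API_KEY default come from the list.

```shell
# Hypothetical helper (not part of the harness): print the effective auth mode.
resolve_auth_mode() {
  mode="${CODEX_AUTH_MODE:-auto}"                  # assumption: unset behaves like auto
  key_env="${CODEX_API_KEY_ENV:-OPENAI_API_KEY}"   # documented default
  case "$mode" in
    api_key|existing_login)
      printf '%s\n' "$mode"
      ;;
    auto)
      # Reject names that are not plain identifiers before using eval.
      case "$key_env" in
        ''|*[!A-Za-z0-9_]*)
          printf 'invalid CODEX_API_KEY_ENV: %s\n' "$key_env" >&2
          return 1
          ;;
      esac
      # Read the value of the variable whose name is stored in key_env.
      eval "key_value=\${$key_env:-}"
      if [ -n "$key_value" ]; then
        printf 'api_key\n'
      else
        printf 'existing_login\n'
      fi
      ;;
    *)
      printf 'unknown CODEX_AUTH_MODE: %s\n' "$mode" >&2
      return 1
      ;;
  esac
}
```

Calling resolve_auth_mode with no key exported yields existing_login; exporting the key variable first flips auto to api_key.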
Current note:
- The standalone installer now defaults to the latest published GitHub release on supported Linux machines to avoid unnecessary rebuilds for normal users.
- The external Codex harness now follows the installer defaults and automatically retries with source when a selected release path is missing binaries that the current checkout expects.
Expected output artifacts:
- codex_exec_summary.json
- codex_usage_report.json
- codex_usage_report.md
- bootstrap_provenance.json
- codex_progress_timeline.json
- game_validation.json
- codex_handoff/
Repeatability artifacts:
- repeatability_summary.json
- repeatability_summary.md
- one subdirectory per run, each containing the normal live harness artifacts
Repeatability interruption behavior:
- if you intentionally stop the repeatability runner mid-batch, it now writes partial repeatability_summary.json and repeatability_summary.md files from whatever artifacts already exist
- interrupted runs are marked with interrupted=true plus an interrupt_reason, so you can still see whether the run had already reached bootstrap, validation, or report generation
Matrix runner artifacts:
- matrix_summary.json
- matrix_summary.md
- one subdirectory per case, each containing the normal live harness artifacts
Repro bundle artifacts:
- README.md
- BUG_REPORT.md
- repro_manifest.json
- source_summaries/
- cases/
- optional *.tar.gz bundle
Usage report result semantics:
- reliability_ok: true only when the run stayed inside the mounted workspace plus harness-owned runtime roots, plus any explicit temporary host-write allowlists
- workspace_bootstrap_ok: true only when the attached agent read the bootstrap metadata and performed the required in-workspace bootstrap actions
- machine_independence_ok: true only when no host-runtime gaps were observed
- workspace_bound_services: services bound under /services/* for the mounted workspace
- namespace_visible_services: services visible somewhere in the namespace, even if not workspace-bound under /services/*
- external_prereqs_observed: declared external prerequisites observed during the run, such as the operator-installed Codex runtime
- candidate_venom_gaps: inferred local-runtime gaps such as codex_home, terminal_runtime, git_runtime, and search_code_bridge
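The three boolean results above can be checked from a script after a run. The sketch below is a rough illustration only: it assumes codex_usage_report.json stores each result as a JSON boolean written as "key": true, which this document does not specify, and the helper names report_flag_true and summarize_report are hypothetical.

```shell
# Hypothetical helper: succeed when KEY appears in FILE as the JSON literal true.
# A missing key is treated the same as false.
report_flag_true() {
  grep -Eq "\"$2\"[[:space:]]*:[[:space:]]*true" "$1"
}

# Print one key=true|false line for each documented boolean result.
summarize_report() {
  report_file="$1"
  for result_key in reliability_ok workspace_bootstrap_ok machine_independence_ok; do
    if report_flag_true "$report_file" "$result_key"; then
      printf '%s=true\n' "$result_key"
    else
      printf '%s=false\n' "$result_key"
    fi
  done
}
```

A pattern match like this is only suitable for a quick smoke check; a JSON-aware tool is the safer choice once the real report layout is known.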
Fallback behavior:
- auto does not silently skip the Codex step
- the harness should still preserve the namespace-mounted workspace context
- codex_handoff/ is the dedicated resume package for manual continuation
- codex_exec_summary.json captures the last observed Codex event, last completed item, and inferred stall stage from the live --json event stream
- codex_progress_timeline.json records the observed timing of live-run milestones such as Codex launch, bootstrap completion, first workspace write, and validation start
- validation and usage reports should still be written in fallback/manual mode
Operator notes:
- prefer the installer-first Linux path for this harness; use ./install-fs-mount.sh only when the namespace mount happens on a separate Linux machine
- the harness is about the standalone node + namespace story, not the older routed --workspace-url only flow
- the mounted workspace directory exposed at nodes/local/fs is the canonical external-agent entrypoint
- the clean writable project tree is nodes/local/fs; Spiderweb’s own runtime root is kept separate from that workspace on purpose
- the harness creates only a generic dev-template workspace baseline; after attach, Spiderweb must surface a real workspace-root AGENTS.md, and the external agent is responsible for reading that file first and then following the workspace-local ./.spiderweb/* bootstrap projection from inside the workspace
- AGENTS.md is the human-facing workspace contract; ./.spiderweb/agent_bootstrap.json and ./.spiderweb/agent_bootstrap_quickref.json are the exact machine-readable bootstrap surface for discovery order, preferred ./.spiderweb/services/* usage, self-home provisioning, service verification/repair, and persistence semantics
- the expected interactive user flow is: start codex in the mounted workspace directory, give a short prompt that tells it to read AGENTS.md, and let it work relative to that directory
- shared workspace binds persist across agent detach/reattach, while worker-private loopback state is expected to be ephemeral
- CODEX_AUTH_MODE=api_key is still the strict fresh-install path, but existing_login is temporarily acceptable for reliability because host ~/.codex writes are allowlisted by default while still reported as a codex_home machine-independence gap
- CODEX_LAUNCH_CMD is optional; the harness can build a default launcher around the pinned codex exec flow
- the default live launcher now preserves both logs/codex.stdout.log and logs/codex.pty.log, which makes it much easier to distinguish “still progressing” from “stopped after a tool result”
- test-env/test-external-codex-cli-matrix.sh is the fast way to compare pinned Codex CLI versions and PTY/JSON launch modes against the same Spiderweb scenario
- test-env/test-external-codex-repeatability.sh is the fast way to prove the new workspace_bootstrap_ok milestone stays green across multiple live runs on the same machine
- test-env/package-external-codex-repro.sh collects the matrix outputs into a single upstream-ready repro pack with a generated bug report
- custom launch templates may use {codex_bin}, {workspace_root}, {namespace_root}, {namespace_meta_dir}, {workspace_meta_dir}, {shared_data_dir}, {prompt_file}, and {artifact_dir}
- the default artifact directory is now outside the repo checkout so the harness does not create false host-repo leakage by itself
- the current milestone is workspace_bootstrap_ok; plain Codex still cannot fully clear codex_home, terminal_runtime, and git_runtime under the no-launch-hook rule, so machine_independence_ok remains the follow-on milestone
- if you still want to override the launcher, a working template is:
CODEX_MODE=live \
CODEX_AUTH_MODE=api_key \
OPENAI_API_KEY=... \
CODEX_LAUNCH_CMD='cat {prompt_file} | {codex_bin} exec --skip-git-repo-check --dangerously-bypass-approvals-and-sandbox --ephemeral --add-dir {namespace_meta_dir} --add-dir {workspace_meta_dir} --add-dir {shared_data_dir} --add-dir {artifact_dir} -C {workspace_root} -o {artifact_dir}/codex_last_message.txt -' \
bash test-env/test-external-codex-workspace.sh
Native macOS runs default to TRACE_BACKEND=none because the Linux strace
path is not available there; the bootstrap and usage report still infer required
reads from the Codex event log.
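To make the launch-template placeholders from the operator notes concrete, here is a minimal sketch of how a template such as the CODEX_LAUNCH_CMD value could be expanded. It is not the harness's implementation: the functions replace_all and expand_launch_template are hypothetical, the example paths are made up, and the real harness may quote or escape values differently.

```shell
# Hypothetical helper: replace every literal occurrence of NEEDLE in STRING.
# Usage: replace_all STRING NEEDLE REPLACEMENT (NEEDLE must be non-empty)
replace_all() {
  rest="$1"
  out=""
  while :; do
    case "$rest" in
      *"$2"*)
        # Keep the text before the first match, then append the replacement.
        out="$out${rest%%"$2"*}$3"
        rest="${rest#*"$2"}"
        ;;
      *)
        break
        ;;
    esac
  done
  printf '%s' "$out$rest"
}

# Usage: expand_launch_template TEMPLATE name=value [name=value ...]
# Each {name} in TEMPLATE is replaced by value; unknown placeholders are left as-is.
expand_launch_template() {
  result="$1"
  shift
  for pair in "$@"; do
    name="${pair%%=*}"
    value="${pair#*=}"
    result="$(replace_all "$result" "{$name}" "$value")"
  done
  printf '%s\n' "$result"
}

# Example with made-up paths:
expand_launch_template \
  '{codex_bin} exec -C {workspace_root} -o {artifact_dir}/codex_last_message.txt -' \
  codex_bin=/usr/local/bin/codex \
  workspace_root=/mnt/ws/nodes/local/fs \
  artifact_dir=/tmp/artifacts
```

Literal string replacement is used instead of sed so that slashes and other characters in paths need no escaping.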
This repo also includes a local CI-style integration test for the embeddable filesystem + health services example.
# Run directly
bash test-env/test-embed-multi-service.sh
# Or through make
cd test-env && make test-embed-multi-service
What it validates:
- boots embed-multi-service-node with a temporary export
- probes /fs via spiderweb-fs-mount (readdir + cat)
- probes /v1/health with a raw WebSocket handshake and validates ok: true
Useful env vars:
- PORT (default 21910)
- BIND_ADDR (default 127.0.0.1)
- SKIP_BUILD=1 to skip zig build if binaries are already built
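The documented defaults map naturally onto shell parameter expansion. The sketch below shows one way a script could resolve the listen endpoint from these variables; the function name embed_test_endpoint is hypothetical, and whether test-embed-multi-service.sh reads the variables this way is an assumption.

```shell
# Hypothetical helper: print the address the test node would listen on,
# falling back to the documented defaults when the variables are unset or empty.
embed_test_endpoint() {
  printf '%s:%s\n' "${BIND_ADDR:-127.0.0.1}" "${PORT:-21910}"
}

embed_test_endpoint
```

To override the defaults for a single run, set the variables on the command line, for example PORT=31910 SKIP_BUILD=1 bash test-env/test-embed-multi-service.sh.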
This test exercises the control-plane + mount integration flow end-to-end:
- starts spiderweb
- starts two embed-multi-service-node filesystem nodes
- negotiates control.version (spiderweb-control) then runs control.node_invite_create, control.node_join, control.workspace_create, control.workspace_mount_set, and control.workspace_activate with workspace mutation auth (workspace_token)
- restarts spiderweb and verifies control-plane state is recovered from persisted LTM snapshot
- updates mounts live (/src -> /live) and validates the mount client converges to the new path
- mounts both nodes at the same project mount path (/src) as a failover group
- verifies reads initially come from node A/B, kills the active node, and verifies failover
- restarts the stopped node, rejoins/remounts it, then kills the surviving node to verify second failover convergence
Additional focused scenarios:
- test-distributed-workspace-bootstrap.sh: validates control.workspace_up bootstrap output and workspace desired/actual/drift schema.
- test-distributed-workspace-drift.sh: forces a desired/actual mismatch and verifies drift + reconcile diagnostics.
- test-distributed-workspace-matrix.sh: runs failover/reconnect/bootstrap/drift as one matrix entrypoint.
# Run directly
bash test-env/test-distributed-workspace.sh
# Or through make
cd test-env && make test-distributed-workspace
cd test-env && make test-distributed-workspace-bootstrap
cd test-env && make test-distributed-workspace-drift
cd test-env && make test-distributed-workspace-matrix
cd test-env && make test-distributed-workspace-encrypted
cd test-env && make test-distributed-workspace-operator-token
cd test-env && make test-distributed-soak-chaos
cd test-env && make test-spiderweb-control-protocol
Useful env vars:
- SPIDERWEB_PORT (default 28790)
- NODE1_PORT (default 28911)
- NODE2_PORT (default 28912)
- BIND_ADDR (default 127.0.0.1)
- SPIDERWEB_CONTROL_OPERATOR_TOKEN (optional; include operator_token in protected mutations if enabled)
- SPIDERWEB_CONTROL_STATE_KEY_HEX (optional; enables encrypted control-plane snapshot storage)
- ASSERT_OPERATOR_TOKEN_GATE=1 (optional; assert mutation deny/allow behavior before the main workflow)
- SPIDERWEB_METRICS_PORT (optional; enables HTTP /livez, /readyz, /metrics (Prometheus), /metrics.json (JSON))
- SKIP_BUILD=1 to skip zig build if binaries are already built
Validates protocol-level contract points used in release checks:
- control negotiation order (control.version -> control.connect)
- runtime Acheron negotiation order (acheron.t_version -> acheron.t_attach)
- standalone FS routing order (acheron.t_fs_hello must come first)
- standalone FS HELLO auth-token enforcement (--auth-token)
- source-level envelope/type guard in core client code paths
# Run directly
bash test-env/test-spiderweb-control-protocol.sh
# Or through make
cd test-env && make test-spiderweb-control-protocol
Useful env vars:
- SPIDERWEB_PORT (default 28794)
- FS_NODE_PORT (default 28931)
- BIND_ADDR (default 127.0.0.1)
- SKIP_BUILD=1 to skip zig build if binaries are already built
Runs the distributed workspace flow repeatedly with randomized ports and optional auth/encryption modes.
# Run directly
bash test-env/test-distributed-soak-chaos.sh
# Or through make
cd test-env && make test-distributed-soak-chaos
Useful env vars:
- SOAK_ITERATIONS (default 10)
- SOAK_ENABLE_OPERATOR_MODE=0|1 (default 1)
- SOAK_ENABLE_ENCRYPTED_MODE=0|1 (default 1)
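The soak idea (repeat the workspace flow with a fresh random port each time) can be sketched as a small loop. This is an illustrative outline under stated assumptions, not the contents of test-distributed-soak-chaos.sh: the random_port helper, the 20000-29999 port range, and passing the port through SPIDERWEB_PORT are all assumptions.

```shell
# Documented default iteration count.
SOAK_ITERATIONS="${SOAK_ITERATIONS:-10}"

# Hypothetical helper: print a pseudo-random port in 20000..29999 for a given seed.
random_port() {
  awk -v seed="$1" 'BEGIN { srand(seed); print 20000 + int(rand() * 10000) }'
}

i=1
while [ "$i" -le "$SOAK_ITERATIONS" ]; do
  # Vary the seed per iteration so consecutive runs get different ports.
  port="$(random_port "$(( $(date +%s) + i ))")"
  echo "iteration $i: SPIDERWEB_PORT=$port"
  # A real soak run would launch the workspace flow here, for example:
  # SPIDERWEB_PORT="$port" bash test-env/test-distributed-workspace.sh
  i=$((i + 1))
done
```

Randomizing ports between iterations reduces the chance that a socket left over from the previous run causes a spurious bind failure.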
# Stop and remove container (data is lost - this is the point!)
docker-compose down
# Remove the image to force rebuild
docker-compose down --rmi local
# Start fresh
docker-compose up --build
- Dockerfile - Minimal Debian with dependencies pre-installed
- docker-compose.yml - Container orchestration
- test-install.sh - Automated test of the install script