Spiderweb Test Environment

Docker-based testing environment for Spiderweb. This creates a clean, disposable Debian container for testing the install script and Spiderweb functionality without affecting your main server.

Quick Start

# Build and start the test environment
docker-compose up --build

# Or run in detached mode
docker-compose up -d --build

# Enter the running container
docker exec -it spiderweb-test bash

# Inside the container, run the install script
./install.sh

Manual Testing

# Build the image
docker build -t spiderweb-test .

# Run interactively
docker run -it --rm --name spiderweb-test spiderweb-test

# Run with API key from environment (for automated testing)
docker run -it --rm \
  -e SPIDERWEB_PROVIDER=openai \
  -e SPIDERWEB_MODEL=gpt-4o-mini \
  -e SPIDERWEB_API_KEY=sk-xxx \
  spiderweb-test

# Run with port forwarding (to test from host)
docker run -it --rm \
  -p 18790:18790 \
  --name spiderweb-test \
  spiderweb-test

Testing the Install Script

# Test the full interactive install
curl -fsSL https://raw.githubusercontent.com/DeanoC/Spiderweb/main/install.sh | bash

# Or drive the default non-interactive path explicitly.
# On Linux x86_64, auto now prefers the latest published GitHub release.
SPIDERWEB_NON_INTERACTIVE=1 \
SPIDERWEB_INSTALL_ZSS=0 \
SPIDERWEB_INSTALL_SYSTEMD=0 \
SPIDERWEB_START_AFTER_INSTALL=0 \
bash ./install.sh

# Force a local source build instead of the default release install
SPIDERWEB_NON_INTERACTIVE=1 \
SPIDERWEB_INSTALL_SOURCE=source \
SPIDERWEB_INSTALL_ZSS=0 \
SPIDERWEB_INSTALL_SYSTEMD=0 \
SPIDERWEB_START_AFTER_INSTALL=0 \
bash ./install.sh

# Release-binary path
SPIDERWEB_NON_INTERACTIVE=1 \
SPIDERWEB_INSTALL_SOURCE=release \
SPIDERWEB_RELEASE_ARCHIVE_URL=https://github.com/DeanoC/Spiderweb/releases/download/vX.Y.Z/spiderweb-linux-x86_64.tar.gz \
SPIDERWEB_RELEASE_ARCHIVE_SHA256=<sha256> \
SPIDERWEB_INSTALL_ZSS=0 \
SPIDERWEB_INSTALL_SYSTEMD=0 \
SPIDERWEB_START_AFTER_INSTALL=0 \
bash ./install.sh
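The release-binary path pairs an archive URL with a SHA-256 checksum. As a hedged sketch of the verification step that `SPIDERWEB_RELEASE_ARCHIVE_SHA256` implies (the file contents and checksum below are stand-ins, not a real release, and this is not the installer's actual code):

```shell
# Sketch: verify a downloaded release archive against an expected SHA-256
# before unpacking. The archive here is a stand-in for the real download.
set -eu
ARCHIVE="$(mktemp)"
printf 'hello' > "$ARCHIVE"
# SHA-256 of the literal string "hello"; in a real run this value comes
# from SPIDERWEB_RELEASE_ARCHIVE_SHA256.
EXPECTED_SHA256="2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"

ACTUAL_SHA256="$(sha256sum "$ARCHIVE" | awk '{print $1}')"
if [ "$ACTUAL_SHA256" = "$EXPECTED_SHA256" ]; then
  echo "checksum ok: safe to unpack"
else
  echo "checksum mismatch; refusing to install" >&2
  exit 1
fi
rm -f "$ARCHIVE"
```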

External Codex Workspace E2E Harness

This harness documents and exercises the Linux-first external Codex operator path:

  • installer-first host flow (./install.sh on the Spiderweb host)
  • generic dev-template workspace baseline that can outlive any one agent session
  • isolated Spiderweb runtime root plus a clean standalone local workspace node
  • standalone spiderweb-fs-node as the remote filesystem node under test
  • namespace mount via spiderweb-fs-mount --namespace-url ...
  • plain Codex launch in live or manual-handoff mode
  • agent-driven in-workspace bootstrap, validation, and report artifact capture

The harness assumes one Spiderweb-owned mount model across macOS, Linux, and Windows: workers start in the mounted workspace directory, and .spiderweb is a server-projected part of that same namespace rather than a client overlay or direct endpoint shortcut.

Mounted namespace paths used by the harness:

  • local writable project tree: /nodes/local/fs
  • remote shared seed data: /shared_data
  • workspace metadata: /projects/<workspace_id>/meta/*
  • namespace metadata: /meta/*
  • generic workspace services: /services/*
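The path categories above can be sketched as a small classifier; this is purely illustrative (the real namespace layout is served by Spiderweb, not computed client-side):

```shell
# Sketch: classify a mounted namespace path into the categories listed
# above. Glob patterns mirror the documented layout; illustrative only.
classify_path() {
  case "$1" in
    /nodes/local/fs|/nodes/local/fs/*) echo "local writable project tree" ;;
    /shared_data|/shared_data/*)       echo "remote shared seed data" ;;
    /projects/*/meta/*)                echo "workspace metadata" ;;
    /meta/*)                           echo "namespace metadata" ;;
    /services/*)                       echo "workspace service" ;;
    *)                                 echo "unknown" ;;
  esac
}

classify_path /nodes/local/fs/src/main.zig   # -> local writable project tree
classify_path /projects/demo/meta/status     # -> workspace metadata
```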

Run the Linux harness directly from the repo root:

bash test-env/test-external-codex-workspace.sh

For a faster smoke iteration that still exercises bootstrap, mount, external Codex launch, and mounted writes, use the light scenario:

bash test-env/test-external-codex-workspace-light.sh

On macOS with OrbStack installed, use the Orb wrapper so the exact same Linux harness runs inside Orb:

bash test-env/test-external-codex-workspace-orb.sh

Light Orb smoke variant:

bash test-env/test-external-codex-workspace-orb-light.sh

Use ORB_MACHINE=<name> or ORB_USER=<user> if you need a non-default Orb target.

For the native macOS mount path, use the dedicated native harness after the current Spiderweb app build has been installed with spiderweb-config config install-fs-extension:

bash test-env/test-external-codex-workspace-macos.sh

Light native macOS smoke variant:

bash test-env/test-external-codex-workspace-macos-light.sh

Or through make:

cd test-env && make test-external-codex-workspace
cd test-env && make test-external-codex-workspace-light

Repeatability runner:

cd test-env && make test-external-codex-repeatability

Compatibility matrix runner:

cd test-env && make test-external-codex-cli-matrix

Repro bundle packager:

cd test-env && make package-external-codex-repro

Codex launch controls:

  • CODEX_MODE=auto: try a live Codex launch, then fall back to the dedicated handoff package if the launcher is unavailable or the live step cannot proceed
  • CODEX_MODE=live: require a real Codex launch; launch failure fails the harness
  • CODEX_MODE=manual: skip live launch and prepare the manual handoff package only
  • CODEX_BIN: override the detected Codex binary
  • CODEX_CLI_VERSION: pinned plain Codex CLI version the harness expects. Default: 0.111.0
  • CODEX_AUTH_MODE=auto|api_key|existing_login: choose isolated API-key auth or an existing login. auto prefers API-key auth when OPENAI_API_KEY is set
  • CODEX_API_KEY_ENV: environment variable name to read for api_key mode. Default: OPENAI_API_KEY
  • CODEX_LAUNCH_CMD: override the detected launcher when the default codex exec template is not correct for the machine
  • CODEX_TIMEOUT_SECONDS: maximum seconds to allow the live Codex phase before the harness fails with a diagnostic handoff/report. Default: 900
  • CODEX_IDLE_TIMEOUT_SECONDS: optional idle cutoff for the live Codex phase. Default: 0 (disabled), because codex exec --json can spend long periods silently reasoning before the next visible tool or file event.
  • CODEX_JSON_EVENTS=1: inject --json into common codex exec launch templates and preserve the raw Codex event stream in logs/codex.stdout.log
  • CODEX_USE_PTY=1: wrap the live Codex launch in script(1) so the run behaves like a real terminal session and preserves logs/codex.pty.log
  • CODEX_DISABLE_COLLABORATION_MODES=1: inject --disable collaboration_modes into common codex exec templates unless disabled
  • CODEX_DISABLE_APPS=1: inject --disable apps by default because the current live Spiderweb path is more reliable without the apps surface in non-interactive exec
  • CODEX_DISABLE_SHELL_SNAPSHOT=1: inject --disable shell_snapshot by default because the current live Spiderweb path is more reliable without shell snapshotting in non-interactive exec
  • CODEX_ALLOW_HOST_CODEX_HOME=1: temporarily allow writes under host ~/.codex for reliability while still reporting them as a codex_home machine-independence gap
  • SPIDERWEB_INSTALL_SOURCE=auto|source|release: choose whether the harness compiles Spiderweb locally or installs from a prebuilt archive. Default: auto, which delegates to install.sh defaults and retries with source if the selected release path cannot provide the current harness binary set
  • SPIDERWEB_RELEASE_ARCHIVE_URL: release asset URL to use when SPIDERWEB_INSTALL_SOURCE=release. Default: unset
  • SPIDERWEB_RELEASE_ARCHIVE_SHA256: optional checksum for the release archive
  • SPIDERWEB_RELEASE_VERSION: label recorded in installer output for the chosen release build. Default: unset
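Several of the controls above feed `{placeholder}` tokens into a `CODEX_LAUNCH_CMD` template. A minimal bash sketch of how that substitution could work, assuming plain string replacement (an assumption about form, not a statement about the harness internals):

```shell
# Sketch: expand a CODEX_LAUNCH_CMD-style template by substituting the
# documented {placeholder} tokens. Requires bash for ${var//pat/repl}.
expand_template() {
  local tmpl="$1"
  tmpl="${tmpl//\{codex_bin\}/${CODEX_BIN:-codex}}"
  tmpl="${tmpl//\{workspace_root\}/$WORKSPACE_ROOT}"
  tmpl="${tmpl//\{prompt_file\}/$PROMPT_FILE}"
  tmpl="${tmpl//\{artifact_dir\}/$ARTIFACT_DIR}"
  printf '%s\n' "$tmpl"
}

# Illustrative values; in the harness these come from the run environment.
WORKSPACE_ROOT=/mnt/ws/nodes/local/fs
PROMPT_FILE=/tmp/prompt.txt
ARTIFACT_DIR=/tmp/artifacts

expand_template 'cat {prompt_file} | {codex_bin} exec -C {workspace_root} -o {artifact_dir}/last.txt -'
# -> cat /tmp/prompt.txt | codex exec -C /mnt/ws/nodes/local/fs -o /tmp/artifacts/last.txt -
```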

Current note:

  • The standalone installer now defaults to the latest published GitHub release on supported Linux machines to avoid unnecessary rebuilds for normal users.
  • The external Codex harness now follows the installer defaults and automatically retries with source when a selected release path is missing binaries that the current checkout expects.

Expected output artifacts:

  • codex_exec_summary.json
  • codex_usage_report.json
  • codex_usage_report.md
  • bootstrap_provenance.json
  • codex_progress_timeline.json
  • game_validation.json
  • codex_handoff/

Repeatability artifacts:

  • repeatability_summary.json
  • repeatability_summary.md
  • one subdirectory per run, each containing the normal live harness artifacts

Repeatability interruption behavior:

  • if you intentionally stop the repeatability runner mid-batch, it now writes partial repeatability_summary.json and repeatability_summary.md files from whatever artifacts already exist

  • interrupted runs are marked with interrupted=true plus an interrupt_reason, so you can still see whether the run had already reached bootstrap, validation, or report generation

Matrix runner artifacts:

  • matrix_summary.json
  • matrix_summary.md
  • one subdirectory per case, each containing the normal live harness artifacts

Repro bundle artifacts:

  • README.md
  • BUG_REPORT.md
  • repro_manifest.json
  • source_summaries/
  • cases/
  • optional *.tar.gz bundle

Usage report result semantics:

  • reliability_ok: true only when the run stayed inside the mounted workspace plus harness-owned runtime roots, plus any explicit temporary host-write allowlists
  • workspace_bootstrap_ok: true only when the attached agent read the bootstrap metadata and performed the required in-workspace bootstrap actions
  • machine_independence_ok: true only when no host-runtime gaps were observed
  • workspace_bound_services: services bound under /services/* for the mounted workspace
  • namespace_visible_services: services visible somewhere in the namespace, even if not workspace-bound under /services/*
  • external_prereqs_observed: declared external prerequisites observed during the run, such as the operator-installed Codex runtime
  • candidate_venom_gaps: inferred local-runtime gaps such as codex_home, terminal_runtime, git_runtime, and search_code_bridge

Fallback behavior:

  • auto does not silently skip the Codex step
  • the harness should still preserve the namespace-mounted workspace context
  • codex_handoff/ is the dedicated resume package for manual continuation
  • codex_exec_summary.json captures the last observed Codex event, last completed item, and inferred stall stage from the live --json event stream
  • codex_progress_timeline.json records the observed timing of live-run milestones such as Codex launch, bootstrap completion, first workspace write, and validation start
  • validation and usage reports should still be written in fallback/manual mode

Operator notes:

  • prefer the installer-first Linux path for this harness; use ./install-fs-mount.sh only when the namespace mount happens on a separate Linux machine
  • the harness is about the standalone node + namespace story, not the older flow that relied on routed --workspace-url connections alone
  • the mounted workspace directory exposed at nodes/local/fs is the canonical external-agent entrypoint
  • the clean writable project tree is nodes/local/fs; Spiderweb’s own runtime root is kept separate from that workspace on purpose
  • the harness creates only a generic dev-template workspace baseline; after attach, Spiderweb must surface a real workspace-root AGENTS.md, and the external agent is responsible for reading that file first and then following the workspace-local ./.spiderweb/* bootstrap projection from inside the workspace
  • AGENTS.md is the human-facing workspace contract; ./.spiderweb/agent_bootstrap.json and ./.spiderweb/agent_bootstrap_quickref.json are the exact machine-readable bootstrap surface for discovery order, preferred ./.spiderweb/services/* usage, self-home provisioning, service verification/repair, and persistence semantics
  • the expected interactive user flow is: start codex in the mounted workspace directory, give a short prompt that tells it to read AGENTS.md, and let it work relative to that directory
  • shared workspace binds persist across agent detach/reattach, while worker-private loopback state is expected to be ephemeral
  • CODEX_AUTH_MODE=api_key is still the strict fresh-install path, but existing_login is temporarily acceptable for reliability because host ~/.codex writes are allowlisted by default while still reported as a codex_home machine-independence gap
  • CODEX_LAUNCH_CMD is optional; the harness can build a default launcher around the pinned codex exec flow
  • the default live launcher now preserves both logs/codex.stdout.log and logs/codex.pty.log, which makes it much easier to distinguish “still progressing” from “stopped after a tool result”
  • test-env/test-external-codex-cli-matrix.sh is the fast way to compare pinned Codex CLI versions and PTY/JSON launch modes against the same Spiderweb scenario
  • test-env/test-external-codex-repeatability.sh is the fast way to prove the new workspace_bootstrap_ok milestone stays green across multiple live runs on the same machine
  • test-env/package-external-codex-repro.sh collects the matrix outputs into a single upstream-ready repro pack with a generated bug report
  • custom launch templates may use {codex_bin}, {workspace_root}, {namespace_root}, {namespace_meta_dir}, {workspace_meta_dir}, {shared_data_dir}, {prompt_file}, and {artifact_dir}
  • the default artifact directory is now outside the repo checkout so the harness does not create false host-repo leakage by itself
  • the current milestone is workspace_bootstrap_ok; plain Codex still cannot fully clear codex_home, terminal_runtime, and git_runtime under the no-launch-hook rule, so machine_independence_ok remains the follow-on milestone
  • if you still want to override the launcher, a working template is:
CODEX_MODE=live \
CODEX_AUTH_MODE=api_key \
OPENAI_API_KEY=... \
CODEX_LAUNCH_CMD='cat {prompt_file} | {codex_bin} exec --skip-git-repo-check --dangerously-bypass-approvals-and-sandbox --ephemeral --add-dir {namespace_meta_dir} --add-dir {workspace_meta_dir} --add-dir {shared_data_dir} --add-dir {artifact_dir} -C {workspace_root} -o {artifact_dir}/codex_last_message.txt -' \
bash test-env/test-external-codex-workspace.sh

Native macOS runs default to TRACE_BACKEND=none because the Linux strace path is not available there; the bootstrap and usage report still infer required reads from the Codex event log.

Embedded Multi-Service Integration Test

This repo also includes a local CI-style integration test for the embeddable filesystem + health services example.

# Run directly
bash test-env/test-embed-multi-service.sh

# Or through make
cd test-env && make test-embed-multi-service

What it validates:

  • boots embed-multi-service-node with a temporary export
  • probes /fs via spiderweb-fs-mount (readdir + cat)
  • probes /v1/health with a raw WebSocket handshake and validates ok: true

Useful env vars:

  • PORT (default 21910)
  • BIND_ADDR (default 127.0.0.1)
  • SKIP_BUILD=1 to skip zig build if binaries are already built

Distributed Workspace Failover Test

This test exercises the control-plane + mount integration flow end-to-end:

  • starts spiderweb
  • starts two embed-multi-service-node filesystem nodes
  • negotiates control.version (spiderweb-control) then runs control.node_invite_create, control.node_join, control.workspace_create, control.workspace_mount_set, and control.workspace_activate with workspace mutation auth (workspace_token)
  • restarts spiderweb and verifies control-plane state is recovered from persisted LTM snapshot
  • updates mounts live (/src -> /live) and validates the mount client converges to the new path
  • mounts both nodes at the same project mount path (/src) as a failover group
  • verifies reads initially come from node A/B, kills the active node, and verifies failover
  • restarts the stopped node, rejoins/remounts it, then kills the surviving node to verify second failover convergence
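The read-failover pattern the test verifies can be simulated with two stand-in "nodes"; this sketch only models the client-side retry order, not the real mount client:

```shell
# Sketch: read from the first responsive node in a failover group.
# node_a/node_b are simulations; behavior and names are illustrative only.
node_a() { return 1; }                 # simulate the killed active node
node_b() { echo "data from node B"; }  # surviving failover peer

read_with_failover() {
  for node in node_a node_b; do
    if out="$("$node" 2>/dev/null)"; then
      echo "$out"
      return 0
    fi
  done
  echo "all nodes down" >&2
  return 1
}

read_with_failover   # -> data from node B
```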

Additional focused scenarios:

  • test-distributed-workspace-bootstrap.sh: validates control.workspace_up bootstrap output and workspace desired/actual/drift schema.
  • test-distributed-workspace-drift.sh: forces a desired/actual mismatch and verifies drift + reconcile diagnostics.
  • test-distributed-workspace-matrix.sh: runs failover/reconnect/bootstrap/drift as one matrix entrypoint.

# Run directly
bash test-env/test-distributed-workspace.sh

# Or through make
cd test-env && make test-distributed-workspace
cd test-env && make test-distributed-workspace-bootstrap
cd test-env && make test-distributed-workspace-drift
cd test-env && make test-distributed-workspace-matrix
cd test-env && make test-distributed-workspace-encrypted
cd test-env && make test-distributed-workspace-operator-token
cd test-env && make test-distributed-soak-chaos
cd test-env && make test-spiderweb-control-protocol

Useful env vars:

  • SPIDERWEB_PORT (default 28790)
  • NODE1_PORT (default 28911)
  • NODE2_PORT (default 28912)
  • BIND_ADDR (default 127.0.0.1)
  • SPIDERWEB_CONTROL_OPERATOR_TOKEN (optional; include operator_token in protected mutations if enabled)
  • SPIDERWEB_CONTROL_STATE_KEY_HEX (optional; enables encrypted control-plane snapshot storage)
  • ASSERT_OPERATOR_TOKEN_GATE=1 (optional; assert mutation deny/allow behavior before the main workflow)
  • SPIDERWEB_METRICS_PORT (optional; enables HTTP /livez, /readyz, /metrics (Prometheus), /metrics.json (JSON))
  • SKIP_BUILD=1 to skip zig build if binaries are already built

Unified v2 Protocol Validation

Validates protocol-level contract points used in release checks:

  • control negotiation order (control.version -> control.connect)
  • runtime Acheron negotiation order (acheron.t_version -> acheron.t_attach)
  • standalone FS routing order (acheron.t_fs_hello must come first)
  • standalone FS HELLO auth-token enforcement (--auth-token)
  • source-level envelope/type guard in core client code paths
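The negotiation-order checks above boil down to a small ordering rule: version/hello first, then attach. A sketch of that rule as a tiny state machine (message names come from the list above; the enforcement logic is illustrative, not Spiderweb's implementation):

```shell
# Sketch: enforce "acheron.t_version before acheron.t_attach" with a
# minimal state machine. Illustrative only.
state="start"
handle() {
  case "$state:$1" in
    start:acheron.t_version)     state="versioned"; echo "ok" ;;
    versioned:acheron.t_attach)  state="attached";  echo "ok" ;;
    *)                           echo "protocol-order violation: $1 in state $state" ;;
  esac
}

handle acheron.t_attach    # -> protocol-order violation: acheron.t_attach in state start
handle acheron.t_version   # -> ok
handle acheron.t_attach    # -> ok
```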

# Run directly
bash test-env/test-spiderweb-control-protocol.sh

# Or through make
cd test-env && make test-spiderweb-control-protocol

Useful env vars:

  • SPIDERWEB_PORT (default 28794)
  • FS_NODE_PORT (default 28931)
  • BIND_ADDR (default 127.0.0.1)
  • SKIP_BUILD=1 to skip zig build if binaries are already built

Soak / Chaos Suite

Runs the distributed workspace flow repeatedly with randomized ports and optional auth/encryption modes.

# Run directly
bash test-env/test-distributed-soak-chaos.sh

# Or through make
cd test-env && make test-distributed-soak-chaos

Useful env vars:

  • SOAK_ITERATIONS (default 10)
  • SOAK_ENABLE_OPERATOR_MODE=0|1 (default 1)
  • SOAK_ENABLE_ENCRYPTED_MODE=0|1 (default 1)

Wiping and Restarting

# Stop and remove container (data is lost - this is the point!)
docker-compose down

# Remove the image to force rebuild
docker-compose down --rmi local

# Start fresh
docker-compose up --build

Files

  • Dockerfile - Minimal Debian with dependencies pre-installed
  • docker-compose.yml - Container orchestration
  • test-install.sh - Automated test of the install script