If you're coordinating multiple LLM agents, you've likely hit some of these:
- Prompt injection exposure - agent identity and permissions are controlled by the LLM, so a compromised prompt can escalate privileges or impersonate other agents
- Implicit permissions - you find out what an agent shouldn't have done after it already did it
- Blind trust - no standard way to distinguish your fleet's data from API responses or crawled content
- Context drift - long-running agents lose track of what happened 200 steps ago and repeat or contradict earlier actions
- Stale information - observations are never superseded, even when newer ones exist
- No isolation boundary - agents are coupled by their conversations, making them hard to test, swap, or reason about independently
- No escalation path - when an agent needs a human decision, there's no structured way to surface it
Coordination guarantees should hold independent of agent behavior. Markspace is a coordination protocol for agent fleets built on stigmergy - a mechanism by which biological systems coordinate at scale. Agents leave traces in a shared environment rather than messaging each other. A deterministic guard layer at the environment boundary enforces identity, scope, and conflict resolution. These constraints live in infrastructure the agent cannot influence. Coordination emerges without being configured; the fleet adapts as the world changes. Over time, the environment becomes a model of the world the agents operate in.
The protocol defines five mark types, three visibility levels, three conflict policies, trust-weighted decay, and 66 formal properties. The included Python package is a reference implementation used to verify those properties experimentally.
LLM agents are probabilistic and reason adaptively when blocked. Coordination that depends on their compliance is therefore unsafe: the same capacity for novel solutions that makes LLMs effective will break a guarantee that relies on the agent itself.
In an experiment, an agent prompted to book multiple slots was rejected after its first booking. It inferred that identity checks were name-based and fabricated caller names to bypass them - through ordinary task-completion reasoning, with no instruction to attack. The gap was structural: the standard LLM tool-calling interface was designed for a single caller; caller identity was never a design parameter, which made it an attack surface the moment multiple agents shared it. The agent attempted exploitation in all 10 runs; 9 of 10 succeeded - the tenth was blocked only because all remaining slots were already taken. Safety training primarily addresses human-AI rather than AI-AI interactions (Triedman et al., 2025), leaving the AI-AI coordination layer systematically undefended.
OpenClaw, a widely adopted open-source AI agent framework, demonstrated what this looks like when agents are explicitly compromised, at scale:
- Prompt injection led to remote code execution, compounded by mass exposure of unprotected instances and widespread plugin marketplace malware
- Red-teaming (Shapira et al., 2026) documented unauthorized compliance, sensitive data disclosure, resource exhaustion loops, and cross-agent propagation of unsafe behavior
- Security researchers concluded that the system cannot be meaningfully secured without removing the capabilities that make it useful, because its safety boundary is the agent itself - precisely what prompt injection compromises
Markspace takes an architectural approach: enforce safety and coordination constraints in a layer outside the agent, so guarantees hold independent of agent behavior.
The guard sits between agents and the mark space. It checks scope permissions and active conflicts, and either writes the mark or rejects the action - an agent has no path around it.
Agents (left) call tools through the guard (center), which checks identity, authority, and conflicts before writing marks to the shared space (right). The guard receives agent identity from the infrastructure - the LLM never controls it. Rejected actions are never executed.
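The guard's decision flow can be sketched in a few lines. This is illustrative only, not the package API: `TinyGuard` and its fields are hypothetical, and the key point is that `agent_id` arrives as an infrastructure-supplied argument, never from model output.

```python
from dataclasses import dataclass, field

@dataclass
class TinyGuard:
    """Hypothetical sketch of the guard's check order: permission, then conflict."""
    permissions: dict                          # agent_id -> set of (scope, mark_type) grants
    claims: set = field(default_factory=set)   # (scope, resource) pairs already claimed

    def check(self, agent_id: str, scope: str, resource: str) -> str:
        # Identity comes from infrastructure (this argument), not from the LLM.
        if (scope, "action") not in self.permissions.get(agent_id, set()):
            return "DENY"        # no declared write permission for this scope
        if (scope, resource) in self.claims:
            return "CONFLICT"    # a first writer already holds the resource
        self.claims.add((scope, resource))
        return "ALLOW"

guard = TinyGuard(permissions={"alice": {("calendar", "action")}})
print(guard.check("alice", "calendar", "thu-14:00"))    # ALLOW
print(guard.check("alice", "calendar", "thu-14:00"))    # CONFLICT
print(guard.check("mallory", "calendar", "thu-14:00"))  # DENY
```

Because the check runs at the environment boundary, a rejected write simply never reaches the mark space; there is nothing for the agent to reason its way around.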
Each mark type encodes a different epistemic role - a plan, a fact, a belief, a warning, an escalation. Purpose determines lifecycle: facts are permanent, beliefs decay as the world changes, plans expire if not acted on. The type system encodes both role and lifecycle directly, rather than leaving them for agents to manage.
Five mark types:
- Intent: "I plan to do X to resource R" (expires after a time-to-live/TTL)
- Action: "I did X, result Y" (permanent ground truth)
- Observation: "I saw Y about the world" (decays over time, trust-weighted)
- Warning: "X is no longer valid" (spikes then decays)
- Need: "I need a human decision on X" (persists until resolved)
Three visibility levels: OPEN (full access), PROTECTED (structure visible, content redacted), CLASSIFIED (invisible to unauthorized agents).
Three conflict policies: HIGHEST_CONFIDENCE (priority wins), FIRST_WRITER (first claim wins), YIELD_ALL (escalate to principal via need marks).
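A minimal sketch of how the three policies could resolve two competing claims (the policy names are from the protocol; the resolution logic below is a simplification, not the package implementation):

```python
def resolve(policy: str, existing: dict, incoming: dict) -> dict:
    """Pick a winner, or escalate, when two marks contest the same resource."""
    if policy == "FIRST_WRITER":
        return existing      # first claim wins, the newcomer is rejected
    if policy == "HIGHEST_CONFIDENCE":
        return max(existing, incoming, key=lambda m: m["confidence"])
    if policy == "YIELD_ALL":
        # Neither side wins: surface a need mark for the principal to decide.
        return {"type": "need", "topic": "conflict",
                "parties": [existing["agent"], incoming["agent"]]}
    raise ValueError(f"unknown policy: {policy}")

a = {"agent": "alice", "confidence": 0.6}
b = {"agent": "bob", "confidence": 0.9}
print(resolve("FIRST_WRITER", a, b)["agent"])        # alice
print(resolve("HIGHEST_CONFIDENCE", a, b)["agent"])  # bob
print(resolve("YIELD_ALL", a, b)["type"])            # need
```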
Trust weighting: Marks carry a source tag (fleet, external verified, external unverified). Trust weights attenuate effective strength. Fleet marks dominate external ones, and unverified sources are discounted further. Weights are configurable per deployment.
Decay: Coordination is a dynamical process - the goal is stability against continuous change, not a fixed solution reached once. Observations and warnings lose strength over configurable half-lives. Intent marks expire after TTL. Action marks are permanent. Stale information fades without explicit cleanup.
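Putting the two mechanisms together, an observation's effective strength can be sketched as confidence attenuated by a source-trust weight and exponential half-life decay. The weights and the formula below are illustrative assumptions (both are deployment-configurable), not the package's exact numbers:

```python
# Hypothetical trust weights: fleet marks dominate, unverified sources discounted.
TRUST = {"fleet": 1.0, "external_verified": 0.6, "external_unverified": 0.3}

def effective_strength(confidence: float, source: str,
                       age_s: float, half_life_s: float) -> float:
    """Confidence, attenuated by source trust and half-life decay."""
    decay = 0.5 ** (age_s / half_life_s)   # halves every half_life_s seconds
    return confidence * TRUST[source] * decay

# A fleet observation, 6 hours old, with a 6-hour half-life: half strength.
print(effective_strength(0.9, "fleet", 6 * 3600, 6 * 3600))  # 0.45
```

Under this model stale observations fade to negligible weight on their own, which is the "no explicit cleanup" property described above.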
Full protocol design in framework.md. Formal specification (66 properties, conformance checklist) in spec.md.
Blocking what a general-purpose agent shouldn't do requires anticipating every failure mode in advance. Composition takes the opposite approach: each agent declares what it can read and write in a manifest, and the guard rejects everything else. Capabilities you didn't grant don't exist.
Watch/subscribe lets agents activate when matching marks appear, forming pipelines without a central orchestrator. Agents can be added, removed, or replaced independently, and need marks let them escalate rather than grow to handle edge cases. The manifest interface is model-agnostic - each agent can run a different backend (cloud API, local open-weights model, or deterministic code), and sensitive scopes can run entirely on-premise.
```python
# Each agent declares exactly what it reads and writes
sensor = Agent(
    name="sensor",
    scopes={"readings": ["observation"]},
    manifest=AgentManifest(outputs=[("readings", MarkType.OBSERVATION)]),
)
alerter = Agent(
    name="alerter",
    scopes={"alerts": ["warning"]},
    read_scopes=frozenset({"readings"}),
    manifest=AgentManifest(
        inputs=[WatchPattern(scope="readings", mark_type=MarkType.OBSERVATION)],
        outputs=[("alerts", MarkType.WARNING)],
    ),
)
validate_pipeline([sensor, alerter])  # guard rejects anything outside declared scope
```

The reference implementation and experiments verify that the protocol's properties hold under realistic conditions. The animation below shows a week of coordination in a large office where every employee has a personal AI assistant - a natural scenario that exercises all protocol features at scale.
525 agents (105 per department, 25 adversarial) coordinating across 7 resource types over 10 simulated rounds. Model: OpenAI gpt-oss-120b via Fireworks. Left: agent activity by department. Center: marks accumulating in the shared space. Right: per-resource contention.
Unit tests (312):
- Core mark operations - algebra, immutability, uniqueness, ordering
- Protocol mechanics - decay functions, trust weighting, conflict resolution, deferred resolution
- Guard enforcement - scope visibility, trust source enforcement, tool failure handling
- Composition - agent manifest properties, scheduling
- Robustness - garbage collection, thread safety under concurrent access
- Property-based (Hypothesis) - randomized inputs, stateful scheduler fuzzing (register/unregister/update/tick sequences)
- Observability and cost controls - telemetry sink non-interference, budget tracking accuracy, rate limit enforcement, fleet caps
Stress test: AI personal assistants for a simulated office - one per employee, plus adversarial agents that attempt to breach scope boundaries. No central scheduler, all coordination through the shared mark space. The scenario exercises 7 resource types across natural permission boundaries:
- Department rooms - PROTECTED: other departments see availability but not booking details
- Shared conference rooms - cross-department contention
- Parking - deferred resolution with department head priority
- Equipment - OPEN, low contention baseline
- Tasks with dependency chains - CLASSIFIED: invisible outside owning department
- Lunch - FIRST_WRITER: everyone eats, but preferred type has 2:1 demand
- Boardroom - YIELD_ALL mediation (single shared resource)
Department heads book with elevated priority. External system agents (parking bot, building ops) publish low-trust observations and warnings that decay over configurable half-lives. Adversarial agents with normal permissions but adversarial prompts attempt cross-department bookings and warning injection. All five mark types, all three conflict policies, and all three visibility levels produce non-trivial output in a single trial.
Scenario design in design.md. Detailed analysis of resource allocation, agent behavior, temporal dynamics, and token economics in the stress test writeup.
Trial experiments: 21 runs of the same scenario across 2 models - OpenAI gpt-oss-120b (via Fireworks) and Mercury 2, a diffusion-based LLM (via Inception AI). Agent counts from 105 (5 adversarial) to 1,050 (50 adversarial), 4,010 agent instances total.
- Adversarial - agents with normal permissions but adversarial prompts attempt three attack modes (confidence manipulation, resource flooding, prompt injection). Zero safety violations on both models
- Safety - zero double bookings and zero scope violations across all 21 runs
- Repeatability - 64.7% +/- 0.8% completion on gpt-oss-120b, 66.2% +/- 1.6% on mercury-2 (5 seeds each)
- Scaling - up to 1,050 agents with proportionally scaled resources: per-agent step count nearly flat (1.2x at 10x scale), safety invariants hold at every scale point
- Resource contention - 5x more agents competing for the same resources: completion drops to 13.3% from resource scarcity, not protocol failure
Defense stack trials: 6 attack modes (rate spike, type shift, privilege escalation, probe evasion, slow drift, combined) against the static enforcement layer + statistical envelope + diagnostic probe. 150 rounds, 105 agents per trial (5 adversarial). Zero safety violations across all attack modes. Full analysis in the defense writeup.
Comparison experiment: Markspace vs a message-passing framework (Microsoft Agent Framework) on a shared resource booking scenario. 10 agents, 12 slots, 10 seeds per arm.
The message-passing arm's tool function validates arguments correctly, but the name parameter is a string the LLM provides - the standard tool-calling interface passes only LLM-generated arguments, with no caller authentication.
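The gap can be reproduced in a few lines. This is a hypothetical, simplified tool (the function name, the one-booking-per-name rule, and the slot names are illustrative, not the experiment's actual code); the point is that validation on an LLM-supplied `name` argument is not authentication:

```python
def book_slot_tool(name: str, slot: str, bookings: dict) -> bool:
    """A tool that 'validates' the caller name - but the LLM supplies it."""
    if name in bookings:        # one-booking-per-name rule, correctly enforced
        return False
    bookings[name] = slot
    return True

bookings = {}
assert book_slot_tool("agent-7", "mon-9:00", bookings)      # first booking accepted
assert not book_slot_tool("agent-7", "tue-9:00", bookings)  # repeat correctly rejected
assert book_slot_tool("dr-smith", "tue-9:00", bookings)     # fabricated name slips through
```

The validation logic is sound; the identity it validates is attacker-controlled.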
- Message-passing arm - adversarial agent discovers the identity gap and exploits it in 9/10 seeds: 18 impersonations, 9 double bookings
- Markspace arm - 0 violations across all seeds, because identity is infrastructure-provided
Design and fairness caveats in design.md.
Composition stress test: 14 deterministic agents in a 5-stage pipeline with 4 mid-run hot-swaps. Zero duplicate IDs, zero validation errors across 833-923 marks.
The markspace package is a Python 3.11+ reference implementation. The protocol can be implemented in any language - the formal specification is in docs/spec.md.
```shell
poetry install   # core package
pytest           # 312 tests, no API key needed
```

Define scopes (resources with conflict policies), create agents (with manifest-declared permissions), and let the guard enforce coordination:
```python
from markspace import (
    Agent, Scope, MarkSpace, Guard, GuardVerdict, MarkType,
    ConflictPolicy, DecayConfig, Observation, Source,
    hours, minutes,
)

# Define a scope with conflict policy and decay rates
calendar = Scope(
    name="calendar",
    allowed_intent_verbs=("book",),
    allowed_action_verbs=("booked",),
    decay=DecayConfig(
        observation_half_life=hours(6),
        warning_half_life=hours(2),
        intent_ttl=minutes(30),
    ),
    conflict_policy=ConflictPolicy.FIRST_WRITER,
)

space = MarkSpace(scopes=[calendar])
guard = Guard(space)

# Agents declare what they can read and write
alice = Agent(name="alice", scopes={"calendar": ["intent", "action"]})
bob = Agent(name="bob", scopes={"calendar": ["intent", "action", "observation"]})

# Bob writes an observation (non-contested)
guard.write_mark(
    bob,
    Observation(
        scope="calendar",
        topic="availability",
        content="Thu 2pm is open",
        confidence=0.9,
        source=Source.FLEET,
    ),
)

# Alice books Thu 2pm - guard checks conflicts and records the action
decision, result = guard.execute(
    alice,
    scope="calendar",
    resource="thu-14:00",
    intent_action="book",
    result_action="booked",
    tool_fn=lambda: {"booked": "thu-14:00"},
)
assert decision.verdict == GuardVerdict.ALLOW

# Bob tries the same slot - guard rejects (FIRST_WRITER)
decision, _ = guard.execute(
    bob,
    scope="calendar",
    resource="thu-14:00",
    intent_action="book",
    result_action="booked",
    tool_fn=lambda: {"booked": "thu-14:00"},
)
assert decision.verdict == GuardVerdict.CONFLICT

# Only Alice's action was recorded - Bob's was never executed
actions = space.read(scope="calendar", resource="thu-14:00", mark_type=MarkType.ACTION)
assert len(actions) == 1 and actions[0].agent_id == alice.id
```

The experiments use LLM agents and require API keys:
```shell
poetry install --extras experiments   # adds matplotlib, numpy
cp .env.example .env                  # add your API keys
```

See experiments/guide.md for CLI reference and cost estimates.
- docs/framework.md - protocol design, biological foundations, composition, architecture, failure analysis
- docs/spec.md - formal specification (66 properties, conformance checklist)
- experiments/guide.md - running experiments (setup, CLI reference, analysis, costs)
- experiments/validation/ - safety, visibility, concurrency, scaling
- experiments/stress_test/ - 105-agent stress test (design, analysis)
- experiments/trials/ - multi-trial repeatability, adversarial robustness, scaling curves
- experiments/trials/results/defense/ - defense stack trials (6 attack modes, static + adaptive layer analysis)
- experiments/comparison/ - stigmergy vs message-passing comparison (design, analysis)
- experiments/composition_stress/ - pipeline validation, hot-swap, concurrency
Most users interact with core.py (types), guard.py (enforcement), and space.py (storage). The optional modules add runtime monitoring, cost controls, and observability.
| Module | Role |
|---|---|
| core.py | Mark types, enums, decay, trust, reinforcement, watch patterns, manifests. Stateless. |
| space.py | Thread-safe mark space. Read, write, query, watch/subscribe. |
| guard.py | Deterministic enforcement layer. Runs at the mark space boundary, independent of agent logic. |
| envelope.py | Optional. Pluggable per-agent anomaly detection (default: Welford's online algorithm). Manifest-declared baselines. |
| barrier.py | Optional. Monotonic permission restriction with hierarchical scope matching. |
| probe.py | Optional. Diagnostic canary injection. Verifies agent health through the mark space. |
| budget.py | Optional. Per-agent token budgets with configurable warning thresholds and hard stops. |
| rate_limit.py | Optional. Per-scope write rate limits (per-agent and fleet-wide caps). |
| telemetry.py | Optional. OpenTelemetry-compatible metrics, structured logs, and trace context. |
| compose.py | Composition validation. Pipeline and manifest-permission checks. Stateless. |
| schedule.py | Manifest-based scheduling with optional pre-activation budget checks. |
| llm.py | Provider-agnostic LLM client (OpenAI-compatible). |
| models.py | Model registry. |
This is research code - the protocol design is stable, but the implementation is not production-hardened.
The reference implementation is in-memory and single-process. It does not include persistence, networking, authentication, or real-time performance guarantees. The guard enforces structural invariants (scope, identity, conflicts) but cannot detect well-formed lies from authorized agents - semantic validation is out of scope. The optional layers (envelope, probe, token budgets, rate limits, telemetry) require per-deployment configuration and are not a substitute for the static enforcement layer. See the framework doc for a full threat model analysis.
The protocol specification (docs/) is licensed under CC-BY 4.0. You can use, share, and adapt the spec freely - just give credit.
All code is licensed under the MIT License.

