The Problem
Autonomous AI agents can drift from their stated intent through slow goal drift, prompt injection, and privilege escalation, all masked as normal workflow.
- Intent drift — Agents gradually shift toward unintended goals; no single action looks malicious.
- Prompt injection — Untrusted inputs override or redirect the agent's instructions mid-task.
- Post-incident isn't enough — You need prevention at the moment of action, plus explainable decisions.
How It Works
Every tool call is intercepted and logged, recent history is summarized into semantic context, the agent's current intent is inferred, drift from the baseline is calculated, and a policy decision (allow/warn/block) is enforced with a full explanation.
- Baseline intent — Session is created with the agent's stated baseline intent.
- Tool call logged — Every tool call is intercepted and logged.
- Semantic context — Action history is summarized and semantic context is built.
- Intent inference — An LLM infers the current intent; drift is calculated.
- Policy + explainability — Policy is evaluated; a risk breakdown and explanation are produced.
- Allow / Warn / Block — The decision is enforced and recorded.
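The pipeline above can be sketched as a minimal enforcement loop. This is an illustrative toy, not NeoLayer's implementation: `Session`, `Decision`, the keyword-overlap drift score, and the `warn_at`/`block_at` thresholds are all assumptions standing in for the real LLM-based intent inference.

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    action: str        # "allow" | "warn" | "block"
    drift: float       # 0.0 (on-task) .. 1.0 (fully off-task)
    explanation: str   # human-readable rationale for the decision

@dataclass
class Session:
    baseline_intent: set                       # keywords describing the stated task
    history: list = field(default_factory=list)

    def evaluate(self, tool_call: str, warn_at: float = 0.5, block_at: float = 0.8) -> Decision:
        """Log the call, score drift against the baseline, apply policy."""
        self.history.append(tool_call)
        words = set(tool_call.lower().split())
        # Toy drift score: fraction of the call's words unrelated to the baseline.
        overlap = len(words & self.baseline_intent) / max(len(words), 1)
        drift = 1.0 - overlap
        if drift >= block_at:
            action = "block"
        elif drift >= warn_at:
            action = "warn"
        else:
            action = "allow"
        return Decision(action, drift, f"drift={drift:.2f} vs baseline {sorted(self.baseline_intent)}")

# Usage: an on-task call passes, an off-task call is blocked.
s = Session(baseline_intent={"read", "billing", "report"})
print(s.evaluate("read billing report").action)         # -> allow
print(s.evaluate("delete production database").action)  # -> block
```

In a real deployment the drift score would come from comparing an LLM-inferred current intent against the baseline intent, but the shape of the loop (log, score, threshold, explain) is the same.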
Demo: Simulated Agent
Enter a natural-language task. NeoLayer spawns a simulated agent session, executes a stepwise plan through the same firewall as real agents, and shows intent drift and policy decisions in real time.
Agent plan (steps the agent will run)
What NeoLayer thinks the agent is doing
Why NeoLayer intervenes
Decision explanation
Plain English summary of why NeoLayer allowed, warned, or blocked this step.
Technical details (risk breakdown and signals)
Action blocked or warned. See Policy Explanation below.
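A decision explanation pairs a plain-English summary with a structured risk breakdown. The record below is a hypothetical shape for that payload; every field name and score is illustrative, not NeoLayer's actual API.

```python
import json

# Hypothetical per-step decision record (field names are illustrative).
decision_record = {
    "step": 3,
    "decision": "block",
    "risk_breakdown": {
        "intent_drift": 0.91,
        "privilege_escalation": 0.15,
        "data_exfiltration": 0.88,
    },
    "explanation": (
        "The step sends collected data to an external host, which does not "
        "match the session's baseline intent."
    ),
}

# One simple aggregation choice: overall risk is the strongest single signal.
overall_risk = max(decision_record["risk_breakdown"].values())
print(json.dumps({"decision": decision_record["decision"], "overall_risk": overall_risk}))
# -> {"decision": "block", "overall_risk": 0.91}
```

Keeping the summary and the per-signal scores in one record lets the UI show "Decision explanation" and "Technical details" from the same source of truth.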
Review
Trace Replay (evaluation mode: trace-replay)
Replay a recorded agent trace through NeoLayer. No real tools are called; observations come from the trace. The mode is persisted in session metadata and shown here.
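A trace-replay loop might look like the following sketch. All names (`recorded_trace`, the session dict, the `evaluation_mode` key) are assumptions used to illustrate the key property: observations are read from the recording, never produced by live tool execution.

```python
# Hypothetical recorded trace: each step carries the tool call the agent
# made and the observation it received at recording time.
recorded_trace = [
    {"tool": "search_docs", "args": {"q": "refund policy"}, "observation": "Refunds within 30 days."},
    {"tool": "draft_email", "args": {"to": "customer"}, "observation": "Draft saved."},
]

# The replay mode is persisted in session metadata, as the text describes.
session = {"metadata": {"evaluation_mode": "trace-replay"}, "events": []}

for step in recorded_trace:
    # The firewall evaluates the same tool call a live agent would make...
    session["events"].append({"tool": step["tool"], "args": step["args"]})
    # ...but the observation is replayed from the trace, not executed.
    observation = step["observation"]

print(session["metadata"]["evaluation_mode"], len(session["events"]))  # -> trace-replay 2
```

Because the firewall sees identical tool calls in both modes, policies and drift scores evaluated during replay should match what a live run would have produced.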