NeoLayer

Semantic intent firewall for autonomous agents.

The Problem

Autonomous AI agents can drift from their stated intent: slow intent drift, prompt injection, and privilege escalation—masked as normal workflow.

  • Intent drift — Agents shift toward unintended goals. No single action looks malicious.
  • Prompt injection — Inputs steer agents toward unintended goals.
  • Post-incident isn’t enough — You need prevention and explainable decisions.

How It Works

Every tool call is logged, recent history summarized, intent inferred with rich semantic context, drift calculated, and policy (allow/warn/block) enforced with full explainability.

1

Session created with baseline intent.

Baseline intent

2

Every tool call intercepted and logged.

Tool call logged

3

Action history summarized; semantic context built.

Semantic context

4

LLM infers current intent; drift calculated.

Intent inference

5

Policy evaluated; risk breakdown + explanation produced.

Policy + explainability

6

Decision enforced and recorded.

Allow / Warn / Block

Demo: Simulated Agent

Enter a natural-language task. NeoLayer spawns a simulated agent session, executes a stepwise plan through the same firewall as real agents, and shows intent drift and policy decisions in real time.

Agent task

Agent plan (steps the agent will run)

    What NeoLayer thinks the agent is doing

    Current inferred goal
    Change in drift (this step)
    Total drift (so far)

    Why NeoLayer intervenes

      Drift: 0.00 — ALLOW

      Decision explanation

      Plain English summary of why NeoLayer allowed, warned, or blocked this step.

        Technical details (risk breakdown and signals)
        
                      
        !
        Intent Drift Detected

        Action blocked or warned. See Policy Explanation below.

        Review

        Baseline intent
        Inferred intent

        Trace Replay (evaluation mode: trace-replay)

        Replay a recorded agent trace through NeoLayer. No real tools are called; observations come from the trace. Mode is persisted in session metadata and shown here.