IntentGuard

Front page of website
Simulation example page of website after running
Simulation example continued and the review
Problem statement page of website
How it works page of website
Architecture diagram
Architecture diagram continued
Why Gemini page of website

Inspiration

As AI agents become more autonomous, browsing the web, calling APIs, modifying files, and making decisions, small errors or prompt injections can quietly push them into unsafe or malicious behavior over time. Today, most safety tooling focuses on filtering individual prompts or responses, not on how an agent’s behavior drifts across multiple steps. We were inspired by how traditional antivirus software protects operating systems at runtime, continuously monitoring behavior and stopping threats before damage occurs. We wanted to bring the same concept to autonomous AI systems, a lightweight security layer that keeps agents aligned with their intended purpose and prevents dangerous actions before they happen.

What it does

It continuously monitors agent behavior and tool usage, detects signs of malicious or unsafe drift, and enforces safety policies in real time. When risky behavior is detected, IntentGuard can block, modify, or require approval for actions before they reach external systems.

Key capabilities include:

Runtime monitoring of agent actions and tool calls
Drift detection across multi-step agent behavior
Policy-based enforcement to block unsafe actions
Clear audit logs for visibility and debugging
Easy integration with existing agent frameworks
This allows developers and teams to deploy autonomous agents with greater confidence and control.

How we built it

IntentGuard is a zero-trust Gemini wrapper that intercepts AI tool calls and detects intent drift before execution.

Node.js backend with audit logging (SQLite), session state, and configurable risk thresholds
Gemini-powered intent inference to track agent goals over time
Policy engine that scores risk using intent drift, tool sensitivity, context, and history
Four outcomes: allow, allow-with-evolution, flag-for-review, or block
Lightweight wrapper that integrates without changing agent behavior
Live frontend demo with real-time drift scores and logs

Key choice: using LLM reasoning instead of rule-based classifiers because intent drift is semantic.