Claw-Jail

The Claw-Jail Dashboard
Our Logo

Inspiration

As agentic AI continues to evolve, platforms like OpenClaw are pushing the boundaries of what AI systems can achieve. During our research into AI agents, we saw an opportunity to improve the user experience with more robust security protocols and a centralized monitoring interface. Our goal was to build on the existing framework so that, as these tools become more autonomous, they remain transparent and accountable. By bridging the gap between complex agentic actions and user oversight, we aim to provide a safer environment for everyone exploring the potential of AI agents.

What it does

Claw-Jail is a security middleware layer that sits between the AI agent and the host system. It functions as a digital "black box" that monitors an AI’s internal chain of thought and planned actions in real time. Through a live dashboard, it ensures the agent never executes malicious commands or exceeds its authorization without oversight. Users can define custom watchlists by entering specific keywords to monitor, and every tool OpenClaw attempts to run is immediately assigned a risk score on a scale of 1 to 100. A threshold slider sets the user's preferred security tolerance: if an action’s risk score exceeds this limit, the system automatically flags the action and pauses OpenClaw's execution. The user must then manually approve or reject the flagged tool, which provides a definitive safety gate before OpenClaw is permitted to continue its task.
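The threshold check and keyword watchlist described above can be sketched in a few lines of Python. This is a minimal illustration, not Claw-Jail's actual code: the `ToolCall` and `RiskGate` names, the `"paused"`/`"allowed"` return values, and the assumption that the risk score arrives as an integer from a separate scoring step are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """A tool invocation the agent wants to run (hypothetical shape)."""
    name: str
    arguments: str


@dataclass
class RiskGate:
    """Compares a risk score against the user's slider threshold."""
    threshold: int = 50  # slider value, on the same 1-100 scale as the score
    watchlist: set[str] = field(default_factory=set)  # user-defined keywords

    def matched_keywords(self, call: ToolCall) -> list[str]:
        """Return the watchlist keywords that appear in the call, sorted."""
        text = f"{call.name} {call.arguments}".lower()
        return sorted(k for k in self.watchlist if k.lower() in text)

    def review(self, call: ToolCall, risk_score: int) -> str:
        """Return "paused" if the score exceeds the threshold, else "allowed"."""
        if not 1 <= risk_score <= 100:
            raise ValueError("risk_score must be between 1 and 100")
        if risk_score > self.threshold:
            # In this sketch a paused call waits for a manual approve/reject.
            return "paused"
        return "allowed"
```

With a threshold of 70, a call scored 85 would be paused for manual review, while a call scored exactly 70 would pass, since only scores that exceed the limit are flagged.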

How we built it

- Frontend: React + Vite
- Backend: Python + FastAPI
- Infrastructure: Docker and GitHub Actions with Continuous Integration
- AI Integration: OpenClaw (AI Agent), Wispr Flow (Voice Input), and Step 3.5 Flash (Risk Assessment)

Challenges we ran into

Intercepting an agent’s internal logic is difficult. Our first attempt, a standard proxy/shim, failed because we couldn't see the raw commands OpenClaw sent to its internal LLM, and the lack of good documentation made it hard to work around. We pivoted to a custom log-interception plugin that hooks directly into the tool-execution pipeline and captures intent before it turns into action. It was harder to build, but we were confident it would work.
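The idea of capturing intent before it turns into action can be illustrated with a small wrapper around a tool function. This is only a sketch of the general pattern under stated assumptions: it does not use OpenClaw's real plugin interface, and `intercept`, `sink`, and `approve` are hypothetical names chosen for the example.

```python
import json
import time
from typing import Any, Callable

ToolFn = Callable[..., Any]


def intercept(
    tool_name: str,
    tool: ToolFn,
    sink: Callable[[str], None],
    approve: Callable[[dict], bool],
) -> ToolFn:
    """Wrap a tool so each call is logged and approved before it runs.

    sink receives one JSON line per attempted call (e.g. a log writer).
    approve decides whether the call may proceed (e.g. a risk check).
    """

    def wrapper(**kwargs: Any) -> Any:
        event = {"tool": tool_name, "args": kwargs, "ts": time.time()}
        # Record the intent first, so rejected calls are still visible.
        sink(json.dumps(event, default=str))
        if not approve(event):
            raise PermissionError(f"{tool_name} was rejected before execution")
        return tool(**kwargs)

    return wrapper
```

Because the event is written to `sink` before `approve` is consulted, a monitoring view built on those log lines would show rejected attempts as well as calls that actually ran.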

Accomplishments that we're proud of

- Real-Time Interception: Successfully capturing and visualizing OpenClaw's tool use in real time.
- Security-First: Building a system that prioritizes Human-in-the-Loop safety for autonomous agents.
- Seamless Integration: Maintaining the agent's performance while adding a heavy-duty security layer.
- Clean Dashboard: Building a dashboard that updates in real time, with a toggle for switching between light and dark mode.

What we learned

Building Claw-Jail taught us how complex it is to work with AI agents. We gained deeper experience in designing full-stack applications with many moving parts and large volumes of data in transit. We also mastered the integration of multimodal models like Wispr Flow, learning how to translate complex AI logs into human-readable insights. Finally, we learned to use WebSockets to stream data continuously between the frontend and backend.

What's next for Claw-Jail

We hope to polish the dashboard and add more components so the user has more flexibility in monitoring OpenClaw.

Submitted to

IrvineHacks 2026
- Winner Best AI Safety Hack – AI Safety at UCI

Created by

I worked on the frontend, focusing primarily on fast-whisper and fuzzy string matching. On the backend, I implemented Gemini AI as a fallback for when the software needs it.

Lien Benedict Jabujab
I built the entire frontend dashboard in React and connected it to the FastAPI backend over WebSockets so the dashboard could update in real time.

Sean San
Cole Saldanha
Daniel Hurtarte