Inspiration
As agentic AI continues to evolve, platforms like OpenClaw are pushing the boundaries of what AI systems can achieve. While researching AI agents, we saw an opportunity to improve the user experience with stronger security controls and a centralized monitoring interface. Our goal was to build on the existing framework so that, as these tools become more autonomous, they remain transparent and accountable. By bridging the gap between complex agentic actions and user oversight, we aim to provide a safer environment for everyone exploring the potential of AI agents.
What it does
Claw-Jail is a security middleware layer that sits between the AI agent and the host system. It works like a digital "black box" that monitors the agent's internal chain of thought and planned actions in real time, and a live dashboard surfaces that activity so the agent cannot execute malicious commands or exceed its authorization without oversight. Users can define custom watchlists by entering specific keywords to monitor, and every tool OpenClaw attempts to run is immediately assigned a risk score on a scale of 1 to 100. A threshold slider sets the user's preferred security tolerance: if an action's risk score exceeds this limit, the system flags the action and pauses OpenClaw's execution. The user must then manually approve or reject the flagged tool before OpenClaw is permitted to continue its task.
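The threshold gate described above can be sketched in a few lines of Python. This is a minimal illustration rather than Claw-Jail's actual code: `RiskGate`, `Decision`, and their methods are hypothetical names, and the risk score is taken as an input because the scoring itself is done by a model in the real system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"  # score is within the user's tolerance
    PAUSE = "pause"  # flag the action and wait for manual approve/reject


@dataclass
class RiskGate:
    """Hypothetical gate: compares a 1-100 risk score with the slider value."""

    threshold: int = 50
    watchlist: set[str] = field(default_factory=set)

    def matched_keywords(self, text: str) -> list[str]:
        # Case-insensitive substring match against the user's watchlist.
        lowered = text.lower()
        return sorted(k for k in self.watchlist if k.lower() in lowered)

    def evaluate(self, risk_score: int) -> Decision:
        if not 1 <= risk_score <= 100:
            raise ValueError("risk score must be between 1 and 100")
        # Only scores strictly above the threshold pause the agent.
        return Decision.PAUSE if risk_score > self.threshold else Decision.ALLOW
```

With `RiskGate(threshold=70)`, a score of 70 is allowed and a score of 71 pauses execution. How watchlist matches feed into the score is not specified here, so the sketch only reports which keywords matched.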
How we built it
Frontend: React + Vite
Backend: Python + FastAPI
Infrastructure: Docker and GitHub Actions with Continuous Integration
AI Integration: OpenClaw (AI Agent), Wispr Flow (Voice Input), and Step 3.5 Flash (Risk Assessment)
Challenges we ran into
Intercepting an agent's internal logic is very difficult. Our first attempt, a standard proxy/shim, failed because we couldn't "see" the raw commands OpenClaw sent to its internal LLM, and the available documentation didn't cover this. We pivoted from the proxy approach to a custom log-interception plugin, which lets us hook directly into the tool-execution pipeline and capture intent before it turns into action. It was harder to build, but we were confident it would work.
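The idea of capturing intent before it turns into action can be illustrated with a plain Python wrapper. This is a sketch under assumptions, not OpenClaw's plugin API: `intercept`, `ToolBlocked`, and the `approve` callback are hypothetical names, and a real plugin would register with whatever hook the agent exposes.

```python
import time
from typing import Any, Callable

ToolFn = Callable[..., Any]


class ToolBlocked(Exception):
    """Raised when the approval callback rejects a tool call."""


def intercept(
    name: str,
    tool: ToolFn,
    log: list[dict],
    approve: Callable[[dict], bool],
) -> ToolFn:
    """Wrap a tool so each call is logged and approved before it runs."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        event = {"tool": name, "args": args, "kwargs": kwargs, "ts": time.time()}
        log.append(event)  # record the intent before anything executes
        if not approve(event):
            raise ToolBlocked(name)  # the wrapped tool is never called
        return tool(*args, **kwargs)

    return wrapper
```

Wrapping a tool as `intercept("shell", run_shell, log, approve)` means every attempted call lands in `log` even when it is rejected, which is the property a dashboard needs in order to show blocked actions.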
Accomplishments that we're proud of
Real-Time Interception: Successfully capturing and visualizing OpenClaw's tool use in real time.
Security-First: Building a system that prioritizes Human-in-the-Loop safety for autonomous agents.
Seamless Integration: Maintaining the agent's performance while adding a heavy-duty security layer.
Clean Dashboard: Building a dashboard that updates in real time, with a button to switch between light and dark mode.
What we learned
Building Claw-Jail taught us how complex it is to work with AI agents. We gained deeper experience designing full-stack applications with many moving parts and large amounts of data flowing between them. We also integrated voice input through Wispr Flow and learned how to translate complex AI logs into human-readable insights. Finally, we learned to use WebSockets to stream data continuously between the frontend and backend.
What's next for Claw-Jail
We hope to polish the dashboard and add more components so the user has more flexibility in monitoring OpenClaw.