Inspiration

Modern development teams are flooded with security alerts, logs, and dashboards, but alerts don't fix bugs. During our own projects we noticed a recurring pattern: vulnerabilities are detected early, but remediation is slow, manual, and error-prone. We were inspired by a question: what if security tooling didn't just warn developers, but actually fixed the problem safely? That idea led to CodeGuardian: an autonomous security agent that attacks code, detects vulnerabilities, repairs them in isolation, verifies the fix, and submits a pull request for human review.

What it does

CodeGuardian acts like a first-responder security engineer for a codebase:

1) Attacks the application in an isolated Daytona sandbox
2) Detects vulnerabilities via deterministic attack scripts
3) Reports incidents using Sentry for observability
4) Alerts developers with voice notifications (ElevenLabs)
5) Autonomously generates a fix using an LLM
6) Verifies the fix by re-running the same attack
7) Creates a pull request only if the vulnerability is truly gone
8) Enforces human-in-the-loop review via CodeRabbit before merge
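The loop above can be sketched as a single orchestration pass. This is a minimal illustration, not the actual CodeGuardian code: each stage is passed in as a callable (in the real system these are wired to Daytona, Sentry, ElevenLabs, the LLM, and the GitHub API), and the attack callable returns True when the exploit works.

```python
def remediate(run_attack, report_incident, alert_developer,
              apply_patch, open_pull_request) -> bool:
    """One pass of the attack -> fix -> verify loop (illustrative sketch)."""
    if not run_attack():          # 1-2. attack + detect
        return False              # nothing exploitable, stop here
    report_incident()             # 3. Sentry event
    alert_developer()             # 4. voice notification
    apply_patch()                 # 5. LLM-generated fix
    if run_attack():              # 6. verify by re-running the SAME attack
        return False              # exploit still works: never open a PR
    open_pull_request()           # 7. PR, gated by CodeRabbit review (8)
    return True
```

The key property is step 6: a pull request is only possible after the identical attack fails against the patched code.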

How we built it

GitHub Push
  ↓
Daytona Sandbox
  ↓
Attack Script (SQLi / XSS)
  ↓
Sentry Event
  ↓
Orchestrator (FastAPI)
  ↓
LLM Agent (Fix Generation)
  ↓
Daytona Verification
  ↓
Pull Request
  ↓
CodeRabbit Review

Key technologies:

  • FastAPI — vulnerable demo application + orchestrator
  • Daytona — secure, ephemeral sandboxes for attack and repair
  • Sentry — centralized error and security event tracking
  • LLMs (Gemini / OpenAI alternatives) — patch generation
  • ElevenLabs — voice alerts simulating on-call notifications
  • GitHub API — automated PR creation
  • CodeRabbit — AI code review as a mandatory quality gate
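As an example of the GitHub API step, the automated PR boils down to one authenticated POST to the documented `POST /repos/{owner}/{repo}/pulls` endpoint. The helper below only assembles the request; the branch naming and title convention are illustrative, not CodeGuardian's actual ones.

```python
import os

GITHUB_API = "https://api.github.com"

def build_fix_pr_request(owner: str, repo: str, branch: str, summary: str) -> dict:
    """Assemble keyword arguments for GitHub's create-pull-request endpoint."""
    return {
        "url": f"{GITHUB_API}/repos/{owner}/{repo}/pulls",
        "headers": {
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        "json": {
            "title": f"[CodeGuardian] Automated security fix: {summary}",
            "head": branch,   # e.g. "codeguardian/fix-sqli-login"
            "base": "main",
            "body": "Auto-generated fix, verified by re-running the original attack.",
        },
    }

# Sending it is a single call with the requests library:
# requests.post(**build_fix_pr_request("acme", "shop", "codeguardian/fix-sqli-login", "SQLi in /login"))
```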

Challenges we ran into

Reliable sandbox execution: Running attacks and fixes inside ephemeral Daytona sandboxes introduced challenges around dependency installation, network access, and log visibility. Long-running commands initially appeared “stuck” until we redesigned the runner to stream logs and keep sandbox environments minimal.

Closing the loop safely: Automatically generating code fixes is easy; verifying them safely is not. We had to ensure the same attack that caused the incident was re-run after the fix, and that success was determined by deterministic signals (exit codes), not heuristics.

Agent orchestration vs. observability: Early versions mixed incident ingestion, agent logic, and remediation in one place. We learned to separate concerns clearly:

1) Sentry for observability
2) Orchestrator for coordination
3) Agent for reasoning
4) Remediator for execution

Managing secrets in a public demo: With multiple APIs (GitHub, LLMs, Daytona, Sentry), we had to be careful not to leak credentials, especially when running untrusted code in sandboxes and exposing services via public tunnels.
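The orchestrator/agent/remediator split can be sketched as follows. All class and field names here are illustrative stand-ins, not CodeGuardian's real interfaces:

```python
from dataclasses import dataclass

@dataclass
class Incident:
    """Minimal incident record passed between components."""
    kind: str        # e.g. "sqli" or "xss"
    endpoint: str
    resolved: bool = False

class Orchestrator:
    """Coordinates the flow; never reasons about code or runs fixes itself."""
    def __init__(self, agent, remediator):
        self.agent = agent            # reasoning: proposes a patch
        self.remediator = remediator  # execution: applies + verifies in sandbox

    def handle(self, incident: Incident) -> Incident:
        patch = self.agent.propose(incident)
        incident.resolved = self.remediator.apply_and_verify(patch)
        return incident
```

Keeping the orchestrator ignorant of both LLM prompts and sandbox details is what made each component swappable and testable in isolation.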

Accomplishments that we're proud of

  • A true closed-loop security system: CodeGuardian doesn't stop at detection. It attacks, detects, fixes, verifies, and submits a pull request, end to end.
  • Safe, isolated remediation: All attacks and fixes happen in Daytona sandboxes, ensuring AI-generated code never touches production or developer machines directly.
  • Human-in-the-loop by design: Even after autonomous verification, fixes are gated by CodeRabbit and GitHub branch protection. The system is explicitly forbidden from merging code on its own.
  • Observable and demo-ready: We built clear incident logs, streamed sandbox logs, and exposed simple APIs for a UI, making the system easy to understand, debug, and present.
  • No fake automation: Every "success" is backed by a real re-run of the attack script. If the vulnerability still exists, the agent fails.

What we learned

  • Autonomy requires verification: AI-generated fixes are only trustworthy when paired with deterministic, repeatable tests.
  • Isolation enables confidence: Ephemeral environments are the key to safely experimenting with autonomous agents in security-critical workflows.
  • Simple flows beat complex ones: Scoping the demo to one app and one vulnerability allowed us to build something real instead of a fragile, overgeneralized system.
  • AI works best with guardrails: The most responsible use of AI in DevSecOps is not full automation, but automation with enforced stopping points.

What's next for CodeGuardian

  • Smarter agent reasoning: Using incident history and previous fixes to improve future remediation quality.
  • Policy-based reviews: Integrating CodeRabbit feedback into agent decision-making (e.g., retrying the fix if review fails).
  • GitHub App deployment: Turning CodeGuardian into a drop-in GitHub App that teams can enable with one click.

Built With

  • coderabbit
  • daytona
  • daytona-sdk
  • elevenlabs
  • fastapi
  • github
  • github-rest-api
  • large-language-models-(gemini-/-openai-alternatives)
  • pydantic
  • python
  • python-dotenv
  • requests
  • sentry
  • sentry-webhooks
  • uvicorn