Inspiration
Most programming languages have mature linting ecosystems: tools that enforce best practices, prevent bugs, and improve production readiness. But agent frameworks like LangGraph lacked an equivalent.
We asked ourselves:
Why don't agent graphs have linters that catch mistakes early and promote safe, reliable design?
InspectorAI was built to fill that gap.
What InspectorAI Does
- Detects problematic functions using an SLM-classifier trained to identify fallible, side-effecting, or risky behaviors.
- Runs static analysis to pinpoint exact lines, nodes, or graph edges responsible for the issue.
- Automatically fixes errors using token-optimized LLM strategies.
- Re-lints and verifies the updated code to ensure no regressions.
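The four capabilities above form a loop: detect, fix, then re-lint until the code comes back clean. The sketch below is purely illustrative, with trivial string heuristics standing in for the SLM classifier and the LLM fixer; none of these function names are InspectorAI's actual API.

```python
def detect(source: str) -> list[str]:
    """Stand-in for the detection layer: flag one risky pattern."""
    issues = []
    if "open(" in source and "try" not in source:
        issues.append("fallible: unguarded file I/O")
    return issues

def fix(source: str, issues: list[str]) -> str:
    """Stand-in for the LLM fixer: wrap risky I/O in try/except."""
    if issues:
        body = "\n".join("    " + line for line in source.splitlines())
        return f"try:\n{body}\nexcept OSError:\n    pass"
    return source

def lint_and_repair(source: str, max_rounds: int = 3) -> str:
    """Detect, fix, then re-lint until clean or out of rounds."""
    for _ in range(max_rounds):
        issues = detect(source)
        if not issues:
            break
        source = fix(source, issues)
    return source
```

The re-lint at the end of each round is what the "verifies the updated code" step refers to: a fix only counts once the detector no longer flags it.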
How It Works
1. Detection Layer (SLM + heuristics)
An SLM scans function docstrings and code to classify them into categories like:
- fallible
- has side effects
- potentially dangerous
This determines which parts of the code need deeper inspection.
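To make the classification step concrete, here is a toy stand-in for the SLM: a keyword heuristic over each function's docstring and body that maps it to the categories above. The keyword lists and category names are illustrative assumptions, not the trained classifier's actual labels.

```python
import ast

# Illustrative risk categories; a real SLM would learn these, not match keywords.
RISK_KEYWORDS = {
    "fallible": ("open", "request", "parse", "connect"),
    "side_effect": ("write", "delete", "send", "update"),
    "dangerous": ("eval", "subprocess", "shell"),
}

def classify_function(source: str) -> dict[str, list[str]]:
    """Map each function name to the risk categories it triggers."""
    tree = ast.parse(source)
    report = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            # Inspect both the docstring and the code, as described above.
            text = ((ast.get_docstring(node) or "") + ast.unparse(node)).lower()
            tags = [cat for cat, words in RISK_KEYWORDS.items()
                    if any(w in text for w in words)]
            if tags:
                report[node.name] = tags
    return report
```

Only functions that pick up at least one tag are forwarded for deeper inspection, which is how the detection layer narrows the search space.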
2. Static Analysis Engine
InspectorAI analyzes:
- graph wiring issues
- dangerous or fallible functions
- unsafe side effects
It generates a compact representation of the problem for downstream agents.
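As a sketch of the wiring checks, the function below walks a LangGraph-style adjacency list and emits compact issue records for dangling edges and unreachable nodes; the record schema and the `"END"` sentinel are assumptions for illustration.

```python
from collections import deque

def check_wiring(edges: dict[str, list[str]], entry: str) -> list[dict]:
    """Return compact issue records for downstream repair agents."""
    issues = []
    nodes = set(edges)
    # An edge pointing at a node that was never registered is a wiring bug.
    for src, targets in edges.items():
        for dst in targets:
            if dst not in nodes and dst != "END":
                issues.append({"kind": "dangling_edge", "edge": (src, dst)})
    # BFS from the entry point to find nodes no path can reach.
    seen, queue = {entry}, deque([entry])
    while queue:
        for dst in edges.get(queue.popleft(), []):
            if dst in nodes and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    for node in nodes - seen:
        issues.append({"kind": "unreachable_node", "node": node})
    return issues
```

Passing these small dict records downstream, rather than the whole graph source, is one place the token savings described later come from.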
3. Multi-Agent Repair System
- A coordinator agent decides which repair strategy to use.
- A fixer agent rewrites, patches, or refactors the code.
- A verification agent re-checks all issues to ensure correctness and stability.
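The division of labor among the three agents can be sketched as plain functions; in InspectorAI these roles are LLM-backed, and the routing rules below are hypothetical examples of the coordinator's logic, not its real policy.

```python
def coordinator(issues: list[str]) -> str:
    """Pick a repair strategy based on what the issues look like."""
    if any("wiring" in issue for issue in issues):
        return "graph_context"          # structural problem: needs the graph
    return "snippet" if len(issues) == 1 else "full_context"

def fixer(source: str, strategy: str) -> str:
    """Apply the chosen strategy (stubbed: annotate the code)."""
    return f"# fixed via {strategy}\n{source}"

def verifier(source: str, lint) -> bool:
    """Re-run the linter on the fixed code; True means no issues remain."""
    return not lint(source)
```

Keeping the coordinator's routing rule-based, as the lessons below note, made the loop's cost and behavior predictable.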
4. Token-Optimized Fixing Strategies
Instead of blindly passing entire codebases into an LLM, InspectorAI selects one of three targeted modes:
- Snippet Fixes: for small, local errors
- Graph-Context Fixes: for wiring issues
- Full-Context Fixes: only when necessary for multi-error, complex scenarios
This keeps costs low and accuracy high.
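The three modes differ mainly in how much context they assemble for the LLM call. A minimal sketch, assuming a line-window size and an `a -> b` edge format that are purely illustrative:

```python
def build_context(mode: str, source: str, issue_line: int,
                  edges: dict[str, list[str]], window: int = 2) -> str:
    """Assemble only the context the chosen mode needs."""
    if mode == "snippet":
        # Small local error: send just a few lines around the issue.
        lines = source.splitlines()
        lo, hi = max(0, issue_line - window), issue_line + window + 1
        return "\n".join(lines[lo:hi])
    if mode == "graph_context":
        # Wiring issue: send a compact serialization of the edges.
        return "\n".join(f"{s} -> {d}" for s, ds in edges.items() for d in ds)
    return source  # full context, reserved for complex multi-error cases
```

Because most issues resolve in snippet or graph-context mode, the expensive full-context path stays rare.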
What We Learned
- The right context beats more context. Fix quality improved dramatically when we optimized the scope of what we send to the LLM.
- Hybrid systems are more robust. Combining static analysis with SLMs and LLMs outperformed any single approach.
- Verification is essential. Automatic fixes require a re-lint to avoid regressions.
- Agent orchestration matters. Even a small rule-based controller improved stability and cost-efficiency.
Challenges We Faced
- Token efficiency: Preventing LLM calls from exploding in cost while keeping context relevant.
- Graph serialization: Representing LangGraph nodes/edges in a compact and meaningful way.
- False positives: Requiring confidence thresholds and cross-checks with the static layer.
- Maintaining correctness: Ensuring fixes don't break intended behavior or introduce new errors.
- Workflow complexity: Coordinating multiple agents in a predictable way.
Built With
- langgraph
- nim
- python
- react
- typescript