🚀 Inspiration

Most programming languages have mature linting ecosystems: tools that enforce best practices, prevent bugs, and improve production readiness. But agent frameworks like LangGraph lacked an equivalent.
We asked ourselves:

Why don't agent graphs have linters that catch mistakes early and promote safe, reliable design?

InspectorAI was built to fill that gap.


🧠 What InspectorAI Does

  • Detects problematic functions using an SLM-classifier trained to identify fallible, side-effecting, or risky behaviors.
  • Runs static analysis to pinpoint exact lines, nodes, or graph edges responsible for the issue.
  • Automatically fixes errors using token-optimized LLM strategies.
  • Re-lints and verifies the updated code to ensure no regressions.

βš™οΈ How It Works

1. Detection Layer (SLM + heuristics)

An SLM scans function docstrings and code to classify them into categories like:

  • fallible
  • has side effects
  • potentially dangerous

This determines which parts of the code need deeper inspection.
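A rough sketch of what this detection interface looks like. The real detector is an SLM plus heuristics; the keyword lists below are purely illustrative stand-ins that mimic the same labels and call shape.

```python
# Keyword heuristics standing in for the SLM classifier. The category labels
# match the list above; the hint tuples are invented for this sketch.
RISK_HINTS = {
    "fallible": ("raise", "http", "parse", "may fail"),
    "has side effects": ("write", "delete", "send", "global "),
    "potentially dangerous": ("exec(", "eval(", "subprocess", "rm -rf"),
}

def classify(docstring: str, source: str) -> list[str]:
    """Return the risk categories a function's docstring and body suggest."""
    text = f"{docstring}\n{source}".lower()
    return [label for label, hints in RISK_HINTS.items()
            if any(hint in text for hint in hints)]
```

Functions that come back with an empty label list are skipped; everything else is queued for the static analysis pass.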

2. Static Analysis Engine

InspectorAI analyzes:

  • graph wiring issues
  • dangerous or fallible functions
  • unsafe side effects

It generates a compact representation of the problem for downstream agents.

3. Multi-Agent Repair System

  • A coordinator agent decides which repair strategy to use.
  • A fixer agent rewrite, patch, or refactor the code.
  • A verification agent re-checks all issues to ensure correctness and stability.

4. Token-Optimized Fixing Strategies

Instead of blindly passing entire codebases into an LLM, InspectorAI selects one of three targeted modes:

  1. Snippet Fixes: for small, local errors
  2. Graph-Context Fixes: for wiring issues
  3. Full-Context Fixes: only when necessary for multi-error, complex scenarios

This keeps costs low and accuracy high.
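A rule-based selector for these three modes might look like the sketch below. The thresholds and issue shape (`{"kind": ...}` dicts) are invented for illustration; the actual coordinator logic may differ.

```python
def choose_mode(issues: list[dict]) -> str:
    """Pick the cheapest context that can plausibly cover all issues."""
    kinds = {issue["kind"] for issue in issues}
    if len(issues) > 3 or len(kinds) > 1:
        return "full_context"   # multi-error, complex scenario
    if "wiring" in kinds:
        return "graph_context"  # edge/node problems need the graph structure
    return "snippet"            # small, local error
```

The ordering matters: the selector only escalates to wider (more expensive) context when the narrower modes clearly cannot cover the issue set.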


📚 What We Learned

  • The right context beats more context. Fix quality improved dramatically when we optimized the scope of what we send to the LLM.
  • Hybrid systems are more robust. Combining static analysis with SLMs and LLMs outperformed any single approach.
  • Verification is essential. Automatic fixes require a re-lint to avoid regressions.
  • Agent orchestration matters. Even a small rule-based controller improved stability and cost-efficiency.

🧩 Challenges We Faced

  • Token efficiency: Preventing LLM calls from exploding in cost while keeping context relevant.
  • Graph serialization: Representing LangGraph nodes/edges in a compact and meaningful way.
  • False positives: Requiring confidence thresholds and cross-checks with the static layer.
  • Maintaining correctness: Ensuring fixes don’t break intended behavior or introduce new errors.
  • Workflow complexity: Coordinating multiple agents in a predictable way.
