Inspiration

Permitting is a maze of PDFs, municipal quirks, and “call the office” dead-ends. Humans can do it, but it’s slow, brittle, and error-prone. I wanted an agentic workflow that takes a plain-English goal (e.g., “install a 3-ton split HVAC at 123 Main St”) and does the grunt work: find the right authority, the right permit(s), the rules, fees, forms, and then push the process forward—looping a human in only when judgment or signatures are required.


What it does (high level)

  1. Address → Authority: Geocodes the address, resolves jurisdiction (city/town/county), and maps to the correct permitting authority.

  2. Intent → Permit Requirements: Grok orchestrates specialist sub-agents (Research, Extraction, Compliance) to determine:

  • Whether a permit is required
  • Which permit(s) and forms
  • Fees, inspections, plan sets, stamps, and timing

  3. Evidence Pack: Agents compile citations (ordinances, handbook pages, web captures), structured requirements, and a step list the permit officer can sanity-check in one pass.

  4. Execution Loop: Multiple task agents attempt account creation, form prefill, document checklists, and scheduling reminders, escalating to the user only for signatures, payments, or ambiguous items.

  5. Observability + Memory: Every step is logged, scored, and attributed (who/what/why), so the workflow can be audited or replayed.


How I built it

Agentic graph (LangGraph-style) with Grok as the planner and a few tight, specialized tools:

  • Planner (Grok): decomposes the goal, assigns tasks, and picks tools.
  • Jurisdiction Resolver: geocoding + FIPS mapping → authority directory.
  • Research Agent: constrained web browsing; scrapes ordinances, checklists, fee schedules; snapshots pages.
  • Extraction Agent: parses PDFs/HTML into a normalized schema (requirements, fees, forms, deadlines).
  • Verifier Agent: cross-checks sources and flags conflicts for human review.
  • Prefill Agent: composes forms, gathers docs, prepares uploads.
  • Notifier Agent: pings the user when human input is mandatory (sign, pay, notarize).
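
The planner-plus-specialists layout above can be sketched without any framework. This is a dependency-free stand-in, not the project's LangGraph code: the `PermitState` fields, the stub agent bodies, the placeholder values, and the fixed step order are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PermitState:
    """Shared state handed from agent to agent (field names are hypothetical)."""
    goal: str
    authority: Optional[str] = None
    sources: list = field(default_factory=list)
    requirements: dict = field(default_factory=dict)
    conflicts: list = field(default_factory=list)
    human_actions: list = field(default_factory=list)
    trace: list = field(default_factory=list)


Agent = Callable[[PermitState], PermitState]


def resolve_jurisdiction(state: PermitState) -> PermitState:
    state.authority = "Example City Building Department"  # placeholder, no real geocoding
    return state


def research(state: PermitState) -> PermitState:
    state.sources = ["snapshot://ordinance", "snapshot://fee-schedule"]  # placeholder ids
    return state


def extract(state: PermitState) -> PermitState:
    state.requirements = {"permit_type": "mechanical"}  # placeholder extraction result
    return state


def verify(state: PermitState) -> PermitState:
    # Stub rule: anything backed by fewer than two sources goes to a human.
    if len(state.sources) < 2:
        state.conflicts.append("requirements rest on a single source")
    return state


def prefill(state: PermitState) -> PermitState:
    state.requirements["form_status"] = "prefilled"
    return state


def notify(state: PermitState) -> PermitState:
    # Signatures and payments always need a person; conflicts add review items.
    state.human_actions = ["sign", "pay"] + [f"review: {c}" for c in state.conflicts]
    return state


def plan(goal: str) -> list:
    # A real planner would let the model choose and order tools; this stub is fixed.
    return [resolve_jurisdiction, research, extract, verify, prefill, notify]


def run(goal: str) -> PermitState:
    state = PermitState(goal=goal)
    for agent in plan(goal):
        state = agent(state)
        state.trace.append(agent.__name__)  # stand-in for emitting a trace span
    return state


state = run("install a 3-ton split HVAC at 123 Main St")
print(state.trace)
print(state.human_actions)
```

Keeping every agent as a function from state to state is what makes a run easy to log and replay: the trace is just the ordered list of steps that touched the state.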

Data & storage

  • A small knowledge graph per authority (requirements, forms, URLs, last-verified timestamp).
  • Document store for page snapshots and PDFs, tied to citations.
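
One way to picture the per-authority record is as a few small dataclasses. This is a sketch only: the class names, field names, the 30-day staleness window, and the `example.org` URL are assumptions, not the project's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Citation:
    url: str           # page the claim came from
    snapshot_key: str  # key of the stored snapshot in the document store


@dataclass
class Requirement:
    kind: str  # e.g. "form", "fee", "inspection", "document"
    text: str
    citations: list = field(default_factory=list)


@dataclass
class AuthorityRecord:
    name: str
    last_verified: datetime
    requirements: list = field(default_factory=list)

    def is_stale(self, now: datetime, max_age: timedelta = timedelta(days=30)) -> bool:
        # The 30-day default is an assumed refresh window.
        return now - self.last_verified > max_age

    def uncited(self) -> list:
        # Requirements with no evidence attached should not reach a reviewer as facts.
        return [r for r in self.requirements if not r.citations]


record = AuthorityRecord(
    name="City of Flagstaff",
    last_verified=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
record.requirements.append(Requirement(kind="document", text="load calc"))
record.requirements.append(
    Requirement(
        kind="document",
        text="site plan",
        citations=[Citation("https://example.org/checklist", "snapshots/checklist.html")],
    )
)
```

Tying each requirement to its citations, and each citation to a stored snapshot, is what lets a reviewer trace a claim back to the page as it looked when it was captured.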

Observability & control

  • Langfuse for traces, spans, prompt/response versions, and success metrics.
  • Feature gates to flip between headless browser strategies, parser variants, or model prompts.
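
A feature gate can be as small as a validated lookup with a default. The sketch below assumes gates are read from environment variables; the gate names, variant names, and the `GATE_` prefix are hypothetical.

```python
import os

# Hypothetical gate names and their allowed variants; the first entry is the default.
GATES = {
    "browser_strategy": ("playwright", "puppeteer"),
    "pdf_parser": ("v1", "v2"),
    "prompt_version": ("stable", "experimental"),
}


def gate(name: str, env=None) -> str:
    """Return the active variant for a gate, falling back to its default."""
    env = os.environ if env is None else env
    allowed = GATES[name]
    value = env.get(f"GATE_{name.upper()}", allowed[0])
    if value not in allowed:
        raise ValueError(f"unknown variant {value!r} for gate {name!r}")
    return value


print(gate("pdf_parser", env={}))
print(gate("browser_strategy", env={"GATE_BROWSER_STRATEGY": "puppeteer"}))
```

Rejecting unknown variants up front keeps a typo in configuration from silently selecting the default path.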

Human-in-the-loop

  • A compact “Permit Officer” review screen: evidence, diffs, confidence scores, and one-click approvals.

Architecture sketch

[User Goal]
   │
   ▼
[Planner (Grok)]
   │ task graph
   ├────────► [Jurisdiction Resolver] ──► Authority
   ├────────► [Research Agent] ─────────► Sources + Snapshots
   ├────────► [Extraction Agent] ───────► Structured Requirements
   ├────────► [Verifier Agent] ─────────► Conflicts / Confidence
   ├────────► [Prefill Agent] ──────────► Forms + Checklists
   └────────► [Notifier Agent] ─────────► Human Inputs (sign/pay)

All steps emit ► [Langfuse Traces] + [Evidence Store]
Permit Officer UI ▲ reviews & approves before submission

A tiny bit of math (how we score confidence)

For each requirement \( r \) found across sources \( s \in S \), we compute:

\[ \text{conf}(r) = \frac{\sum_{s \in S_r} w_s \cdot \mathbb{1}[\text{claim}_s = r]}{\sum_{s \in S_r} w_s}, \quad w_s = \alpha \cdot \text{recency}_s + (1-\alpha) \cdot \text{authority}_s \]

We surface \( r \) if \( \text{conf}(r) \ge \tau \); otherwise we route it to human review.
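
The scoring rule translates almost directly into code. In this sketch the set S_r is read as "the sources that make some claim about the same requirement", and recency and authority are assumed to be scores in [0, 1]; the `SourceClaim` type, the default `alpha` and `tau`, and the sample claims are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceClaim:
    claim: str        # what this source says about the requirement
    recency: float    # assumed scale: 0 (old) to 1 (fresh)
    authority: float  # assumed scale: 0 (forum post) to 1 (official ordinance)


def weight(s: SourceClaim, alpha: float) -> float:
    # w_s = alpha * recency_s + (1 - alpha) * authority_s
    return alpha * s.recency + (1 - alpha) * s.authority


def confidence(r: str, claims: list, alpha: float = 0.5) -> float:
    total = sum(weight(s, alpha) for s in claims)
    if total == 0:
        return 0.0  # no usable evidence at all
    agreeing = sum(weight(s, alpha) for s in claims if s.claim == r)
    return agreeing / total


def route(r: str, claims: list, alpha: float = 0.5, tau: float = 0.8) -> str:
    return "surface" if confidence(r, claims, alpha) >= tau else "human_review"


claims = [
    SourceClaim("permit required", recency=1.0, authority=1.0),  # weight 1.0
    SourceClaim("permit required", recency=0.5, authority=0.5),  # weight 0.5
    SourceClaim("no permit", recency=0.5, authority=0.5),        # weight 0.5
]
print(confidence("permit required", claims))  # 0.75
print(route("permit required", claims))       # human_review
```

With these made-up inputs, two of three sources agree but the weighted score of 0.75 falls below the assumed threshold of 0.8, so the requirement goes to a reviewer rather than being surfaced automatically.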

We also estimate time-savings:

\[ \text{Savings} = \frac{T_{\text{manual}} - T_{\text{agent}}}{T_{\text{manual}}} \times 100\% \]
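
As a quick arithmetic check of the formula, here is a small helper. The function name and the input numbers are purely illustrative; they are not measurements from the project.

```python
def savings_percent(t_manual: float, t_agent: float) -> float:
    """Percentage of manual time saved; both inputs use the same unit."""
    if t_manual <= 0:
        raise ValueError("t_manual must be positive")
    return (t_manual - t_agent) / t_manual * 100


# Hypothetical inputs: 10 hours by hand versus 2.5 hours with the agent.
print(savings_percent(10, 2.5))  # 75.0
```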


Challenges

  • Municipal heterogeneity: every city names forms differently and hides them in odd places. We solved this by schema-first extraction + robust fallbacks.
  • Fragile websites: rate limits, broken HTML, and shifting URLs. Headless browsing with automatic retries and snapshot-based citations helped.
  • Ambiguity: conflicting requirements across sources. We added the Verifier Agent and a “conflict set” review step.
  • Trust & auditability: decision logs, deterministic prompts, and evidence links are non-negotiable for compliance.
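
The "automatic retries" idea for fragile sites can be sketched as a wrapper with exponential backoff. This is a minimal illustration, assuming the caller supplies its own `fetch` callable; the function name, attempt count, and delays are assumptions rather than the project's settings.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on any exception with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception as exc:  # a real version would catch narrower error types
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error


# Usage with a fake fetcher that fails twice, then succeeds.
calls = {"n": 0}


def flaky_fetch(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return f"<html>content of {url}</html>"


delays = []
page = fetch_with_retries(flaky_fetch, "https://example.org/permits", sleep=delays.append)
print(page)
print(delays)  # [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting, and a snapshot of the page can be stored once a fetch succeeds so later citations do not depend on the live URL.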

What I learned

  • Agent specialization beats one giant prompt. Narrow tools + a planner are more reliable.
  • Observability is everything. Without traces and metrics, “agentic” feels like magic; with them, it’s just software.
  • Human checkpoints increase speed. A 2-minute review early avoids 2-week backtracks later.
  • Great UX > clever prompts. The Permit Officer screen made the system feel legitimate and safe.

What’s next

  • Deeper authority coverage: auto-refresh policies, fee updates, and inspection calendars.
  • Autofill integrations: direct e-file where APIs exist; robust RPA where they don’t.
  • Learning loop: failed submissions feed back into prompts, patterns, and authority heuristics.
  • Cost model: estimate fees + expected inspection count from historicals.

Built with

  • Model/Orchestration: Grok, LangGraph (agent graph), function/tool calling
  • Observability: Langfuse (traces, evals, dashboards)
  • Scraping & Headless: Playwright / Puppeteer, Readability pipelines, PDF parsers
  • Data: Postgres (requirements, runs, metrics), S3-compatible blob storage (snapshots)
  • APIs: Geocoding (Census/Google), authority directories, email + webhook notifications
  • Backend: FastAPI + Celery (queues), Redis (tasks, rate limits)
  • UI: Next.js + Tailwind (Permit Officer review, audit trails)
  • Security: OAuth for user accounts, encrypted secret storage, signed snapshot URLs

Example prompt→result

Input: “Install 3-ton split HVAC at 397 S Malpais Ln, Flagstaff, AZ 86001.”

Output: Authority = City of Flagstaff; Permit = Mechanical (Residential); Required: application form MF-102, load calc, site plan, contractor license, fee schedule §4.12; inspections = rough-in + final; estimated fees $X; user actions: e-signature + payment; evidence links attached.
