Inspiration
Permitting is a maze of PDFs, municipal quirks, and “call the office” dead-ends. Humans can do it, but it’s slow, brittle, and error-prone. I wanted an agentic workflow that takes a plain-English goal (e.g., “install a 3-ton split HVAC at 123 Main St”) and does the grunt work: find the right authority, the right permit(s), the rules, fees, forms, and then push the process forward—looping a human in only when judgment or signatures are required.
What it does (high level)
- **Address → Authority:** Geocodes the address, resolves the jurisdiction (city/town/county), and maps it to the correct permitting authority.
- **Intent → Permit Requirements:** Grok orchestrates specialist sub-agents (Research, Extraction, Compliance) to determine:
  - Whether a permit is required
  - Which permit(s) and forms
  - Fees, inspections, plan sets, stamps, and timing
- **Evidence Pack:** Agents compile citations (ordinances, handbook pages, web captures), structured requirements, and a step list the permit officer can sanity-check in one pass.
- **Execution Loop:** Multiple task agents attempt account creation, form prefill, document checklists, and scheduling reminders, escalating to the user only for signatures, payments, or ambiguous items.
- **Observability + Memory:** Every step is logged, scored, and attributed (who/what/why), so the workflow can be audited or replayed.
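As a rough sketch of the who/what/why audit trail, each step could be captured as a small record like this (field names are illustrative, not the production schema):

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class StepLog:
    """One audited step: who did what, why, and how confident we were."""
    agent: str       # who: e.g. "research", "extraction", "verifier"
    action: str      # what: tool call or decision taken
    rationale: str   # why: the planner's stated reason for this step
    score: float     # confidence/quality score attached to the step
    ts: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

# A run is just an ordered list of these records, replayable end to end.
log = [StepLog("research", "fetch_fee_schedule", "needed for fee estimate", 0.92)]
record = asdict(log[0])
```

Because each record is a plain dict, it serializes directly into a trace store for audit or replay.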
How I built it
Agentic graph (LangGraph-style) with Grok as the planner and a few tight, specialized tools:
- Planner (Grok): decomposes the goal, assigns tasks, and picks tools.
- Jurisdiction Resolver: geocoding + FIPS mapping → authority directory.
- Research Agent: constrained web browsing; scrapes ordinances, checklists, fee schedules; snapshots pages.
- Extraction Agent: parses PDFs/HTML into a normalized schema (requirements, fees, forms, deadlines).
- Verifier Agent: cross-checks sources and flags conflicts for human review.
- Prefill Agent: composes forms, gathers docs, prepares uploads.
- Notifier Agent: pings the user when human input is mandatory (sign, pay, notarize).
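The planner/tool split above can be sketched in plain Python. This is a minimal stand-in, not the actual Grok + LangGraph wiring: the tool names, signatures, and return values here are all illustrative.

```python
# Hypothetical tools: each takes the shared context and returns new facts.
def resolve_jurisdiction(ctx):
    # Real system: geocoding + FIPS mapping -> authority directory lookup.
    return {"authority": "City of Flagstaff"}

def research(ctx):
    # Real system: constrained browsing + page snapshots for citations.
    return {"sources": ["mechanical_permit_checklist.pdf"]}

TOOLS = {"jurisdiction": resolve_jurisdiction, "research": research}

def run_plan(goal, plan):
    """Execute a planner-produced task list, threading shared context forward."""
    ctx = {"goal": goal}
    for task in plan:
        ctx.update(TOOLS[task](ctx))
    return ctx

result = run_plan("install 3-ton split HVAC", ["jurisdiction", "research"])
```

In the real build the plan itself comes from Grok rather than a hard-coded list, and each tool emits a trace span.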
Data & storage
- A small knowledge graph per authority (requirements, forms, URLs, last-verified timestamp).
- Document store for page snapshots and PDFs, tied to citations.
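One per-authority knowledge-graph entry might look like the following (keys and the URL are illustrative assumptions, not the production schema; the form ID and requirements echo the Flagstaff example later in this post):

```python
# Illustrative shape of one authority's knowledge-graph entry.
authority_kb = {
    "city_of_flagstaff": {
        "permits": {
            "mechanical_residential": {
                "forms": ["MF-102"],                      # form IDs from source pages
                "requirements": ["load calc", "site plan"],
                "fee_schedule_url": "https://example.gov/fees",  # placeholder URL
                "last_verified": "2024-01-15",            # staleness check for refresh
            }
        }
    }
}

permit = authority_kb["city_of_flagstaff"]["permits"]["mechanical_residential"]
```

The `last_verified` timestamp is what lets a refresh job decide which entries are stale enough to re-scrape.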
Observability & control
- Langfuse for traces, spans, prompt/response versions, and success metrics.
- Feature gates to flip between headless browser strategies, parser variants, or model prompts.
Human-in-the-loop
- A compact “Permit Officer” review screen: evidence, diffs, confidence scores, and one-click approvals.
Architecture sketch
[User Goal]
│
▼
[Planner (Grok)]
│ task graph
├────────► [Jurisdiction Resolver] ──► Authority
├────────► [Research Agent] ─────────► Sources + Snapshots
├────────► [Extraction Agent] ───────► Structured Requirements
├────────► [Verifier Agent] ─────────► Conflicts / Confidence
├────────► [Prefill Agent] ──────────► Forms + Checklists
└────────► [Notifier Agent] ─────────► Human Inputs (sign/pay)
All steps emit ► [Langfuse Traces] + [Evidence Store]
Permit Officer UI ▲ reviews & approves before submission
A tiny bit of math (how we score confidence)
For each requirement \(r\) found across sources \(s \in S_r\), we compute:

\[ \operatorname{conf}(r) = \frac{\sum_{s \in S_r} w_s \cdot \mathbb{1}[\text{claim}_s = r]}{\sum_{s \in S_r} w_s}, \qquad w_s = \alpha \cdot \text{recency}_s + (1 - \alpha) \cdot \text{authority}_s \]

We surface \(r\) if \(\operatorname{conf}(r) \ge \tau\); otherwise we route it to human review.
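The scoring rule translates directly into code. This is a minimal sketch; the \(\alpha\) and \(\tau\) values are illustrative, not the tuned production settings:

```python
ALPHA = 0.7  # weight on recency vs. authority (illustrative value)
TAU = 0.8    # confidence threshold below which we route to human review

def source_weight(recency, authority, alpha=ALPHA):
    """w_s = alpha * recency_s + (1 - alpha) * authority_s, both in [0, 1]."""
    return alpha * recency + (1 - alpha) * authority

def confidence(requirement, sources):
    """Weighted fraction of sources whose claim agrees with `requirement`.

    `sources` is a list of (claim, recency, authority) tuples.
    """
    weights = [source_weight(rec, auth) for _, rec, auth in sources]
    agree = sum(w for (claim, _, _), w in zip(sources, weights)
                if claim == requirement)
    return agree / sum(weights)

sources = [("permit required", 1.0, 0.9),   # fresh, authoritative ordinance
           ("permit required", 0.5, 1.0),   # older official handbook
           ("no permit", 0.2, 0.3)]         # stale forum-grade source
conf = confidence("permit required", sources)
surfaced = conf >= TAU  # above threshold, so no human review needed here
```

Two recent, authoritative sources easily outvote one stale, low-authority dissenter, which is exactly the behavior the Verifier Agent relies on.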
We also estimate time savings:

\[ \text{Savings} = \frac{T_{\text{manual}} - T_{\text{agent}}}{T_{\text{manual}}} \times 100\% \]
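As a quick worked instance of the formula (the hour figures are hypothetical):

```python
def savings_pct(t_manual, t_agent):
    """Time savings as a percentage of the manual baseline."""
    return (t_manual - t_agent) / t_manual * 100

# e.g. 10 hours of manual permit research reduced to 2.5 hours of review
pct = savings_pct(10.0, 2.5)  # → 75.0
```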
Challenges
- Municipal heterogeneity: every city names forms differently and hides them in odd places. We solved this by schema-first extraction + robust fallbacks.
- Fragile websites: rate limits, broken HTML, and shifting URLs. Headless browsing with automatic retries and snapshot-based citations helped.
- Ambiguity: conflicting requirements across sources. We added the Verifier Agent and a “conflict set” review step.
- Trust & auditability: decision logs, deterministic prompts, and evidence links are non-negotiable for compliance.
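The fragile-websites fix boils down to retry-with-backoff around every fetch. A minimal sketch (the `fetch` callable and parameters are illustrative; the real system wraps a headless-browser page load that also snapshots the page for citations):

```python
import random
import time

def fetch_with_retries(fetch, url, attempts=4, base_delay=1.0):
    """Retry a flaky fetch with exponential backoff plus jitter.

    `fetch` is any callable(url) -> bytes. Delay grows as base_delay * 2**i,
    with uniform jitter to avoid hammering rate-limited municipal sites.
    """
    for i in range(attempts):
        try:
            return fetch(url)
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: surface the failure to the planner
            time.sleep(base_delay * 2 ** i + random.uniform(0, base_delay))
```

On final failure the exception propagates, so the planner can fall back to an alternate source or escalate to a human.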
What I learned
- Agent specialization beats one giant prompt. Narrow tools + a planner are more reliable.
- Observability is everything. Without traces and metrics, “agentic” feels like magic; with them, it’s just software.
- Human checkpoints increase speed. A 2-minute review early avoids 2-week backtracks later.
- Great UX > clever prompts. The Permit Officer screen made the system feel legitimate and safe.
What’s next
- Deeper authority coverage: auto-refresh policies, fee updates, and inspection calendars.
- Autofill integrations: direct e-file where APIs exist; robust RPA where they don’t.
- Learning loop: failed submissions feed back into prompts, patterns, and authority heuristics.
- Cost model: estimate fees + expected inspection count from historicals.
Built with
- Model/Orchestration: Grok, LangGraph (agent graph), function/tool calling
- Observability: Langfuse (traces, evals, dashboards)
- Scraping & Headless: Playwright / Puppeteer, Readability pipelines, PDF parsers
- Data: Postgres (requirements, runs, metrics), S3-compatible blob storage (snapshots)
- APIs: Geocoding (Census/Google), authority directories, email + webhook notifications
- Backend: FastAPI + Celery (queues), Redis (tasks, rate limits)
- UI: Next.js + Tailwind (Permit Officer review, audit trails)
- Security: OAuth for user accounts, encrypted secret storage, signed snapshot URLs
Example prompt→result
Input: “Install 3-ton split HVAC at 397 S Malpais Ln, Flagstaff, AZ 86001.”

Output: Authority = City of Flagstaff; Permit = Mechanical (Residential); Required: application form MF-102, load calc, site plan, contractor license, fee schedule §4.12; inspections = rough-in + final; estimated fees $X; user actions: e-signature + payment; evidence links attached.