Inspiration
Permitting is a maze of PDFs, municipal quirks, and “call the office” dead-ends. Humans can do it, but it’s slow, brittle, and error-prone. I wanted an agentic workflow that takes a plain-English goal (e.g., “install a 3-ton split HVAC at 123 Main St”) and does the grunt work: find the right authority, the right permit(s), the rules, fees, forms, and then push the process forward—looping a human in only when judgment or signatures are required.
What it does (high level)
- **Address → Authority:** Geocodes the address, resolves the jurisdiction (city/town/county), and maps it to the correct permitting authority.
- **Intent → Permit Requirements:** Grok orchestrates specialist sub-agents (Research, Extraction, Compliance) to determine:
  - Whether a permit is required
  - Which permit(s) and forms
  - Fees, inspections, plan sets, stamps, and timing
- **Evidence Pack:** Agents compile citations (ordinances, handbook pages, web captures), structured requirements, and a step list the permit officer can sanity-check in one pass.
- **Execution Loop:** Multiple task agents attempt account creation, form prefill, document checklists, and scheduling reminders, escalating to the user only for signatures, payments, or ambiguous items.
- **Observability + Memory:** Every step is logged, scored, and attributed (who/what/why), so the workflow can be audited or replayed.
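As a rough sketch of the who/what/why audit trail, each step could be captured as a small record like this (field names are illustrative, not the production schema):

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class StepLog:
    """One audited step: who did what, why, and how confident we were."""
    agent: str       # who: e.g. "research", "extraction", "verifier"
    action: str      # what: tool call or decision taken
    rationale: str   # why: the planner's stated reason for this step
    score: float     # confidence/quality score attached to the step
    ts: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

# A run is just an ordered list of these records, replayable end to end.
log = [StepLog("research", "fetch_fee_schedule", "needed for fee estimate", 0.92)]
record = asdict(log[0])
```

Because each record is a plain dict, it serializes directly into a trace store for audit or replay.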
How I built it
Agentic graph (LangGraph-style) with Grok as the planner and a few tight, specialized tools:
- Planner (Grok): decomposes the goal, assigns tasks, and picks tools.
- Jurisdiction Resolver: geocoding + FIPS mapping → authority directory.
- Research Agent: constrained web browsing; scrapes ordinances, checklists, fee schedules; snapshots pages.
- Extraction Agent: parses PDFs/HTML into a normalized schema (requirements, fees, forms, deadlines).
- Verifier Agent: cross-checks sources and flags conflicts for human review.
- Prefill Agent: composes forms, gathers docs, prepares uploads.
- Notifier Agent: pings the user when human input is mandatory (sign, pay, notarize).
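The planner/tool split above can be sketched in plain Python. This is a minimal stand-in, not the actual Grok + LangGraph wiring: the tool names, signatures, and return values here are all illustrative.

```python
# Hypothetical tools: each takes the shared context and returns new facts.
def resolve_jurisdiction(ctx):
    # Real system: geocoding + FIPS mapping -> authority directory lookup.
    return {"authority": "City of Flagstaff"}

def research(ctx):
    # Real system: constrained browsing + page snapshots for citations.
    return {"sources": ["mechanical_permit_checklist.pdf"]}

TOOLS = {"jurisdiction": resolve_jurisdiction, "research": research}

def run_plan(goal, plan):
    """Execute a planner-produced task list, threading shared context forward."""
    ctx = {"goal": goal}
    for task in plan:
        ctx.update(TOOLS[task](ctx))
    return ctx

result = run_plan("install 3-ton split HVAC", ["jurisdiction", "research"])
```

In the real build the plan itself comes from Grok rather than a hard-coded list, and each tool emits a trace span.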
Data & storage
- A small knowledge graph per authority (requirements, forms, URLs, last-verified timestamp).
- Document store for page snapshots and PDFs, tied to citations.
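One per-authority knowledge-graph entry might look like the following (keys and the URL are illustrative assumptions, not the production schema; the form ID and requirements echo the Flagstaff example later in this post):

```python
# Illustrative shape of one authority's knowledge-graph entry.
authority_kb = {
    "city_of_flagstaff": {
        "permits": {
            "mechanical_residential": {
                "forms": ["MF-102"],                      # form IDs from source pages
                "requirements": ["load calc", "site plan"],
                "fee_schedule_url": "https://example.gov/fees",  # placeholder URL
                "last_verified": "2024-01-15",            # staleness check for refresh
            }
        }
    }
}

permit = authority_kb["city_of_flagstaff"]["permits"]["mechanical_residential"]
```

The `last_verified` timestamp is what lets a refresh job decide which entries are stale enough to re-scrape.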
Observability & control
- Langfuse for traces, spans, prompt/response versions, and success metrics.
- Feature gates to flip between headless browser strategies, parser variants, or model prompts.
Human-in-the-loop
- A compact “Permit Officer” review screen: evidence, diffs, confidence scores, and one-click approvals.
Architecture sketch
[User Goal]
│
▼
[Planner (Grok)]
│ task graph
├────────► [Jurisdiction Resolver] ──► Authority
├────────► [Research Agent] ─────────► Sources + Snapshots
├────────► [Extraction Agent] ───────► Structured Requirements
├────────► [Verifier Agent] ─────────► Conflicts / Confidence
├────────► [Prefill Agent] ──────────► Forms + Checklists
└────────► [Notifier Agent] ─────────► Human Inputs (sign/pay)
All steps emit ► [Langfuse Traces] + [Evidence Store]
Permit Officer UI ▲ reviews & approves before submission
A tiny bit of math (how we score confidence)
For each requirement \(r\) found across sources \(s \in S_r\), we compute:

\[ \operatorname{conf}(r) = \frac{\sum_{s \in S_r} w_s \cdot \mathbb{1}[\text{claim}_s = r]}{\sum_{s \in S_r} w_s}, \qquad w_s = \alpha \cdot \text{recency}_s + (1 - \alpha) \cdot \text{authority}_s \]

We surface \(r\) if \(\operatorname{conf}(r) \ge \tau\); otherwise we route it to human review.
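The scoring rule translates directly into code. This is a minimal sketch; the \(\alpha\) and \(\tau\) values are illustrative, not the tuned production settings:

```python
ALPHA = 0.7  # weight on recency vs. authority (illustrative value)
TAU = 0.8    # confidence threshold below which we route to human review

def source_weight(recency, authority, alpha=ALPHA):
    """w_s = alpha * recency_s + (1 - alpha) * authority_s, both in [0, 1]."""
    return alpha * recency + (1 - alpha) * authority

def confidence(requirement, sources):
    """Weighted fraction of sources whose claim agrees with `requirement`.

    `sources` is a list of (claim, recency, authority) tuples.
    """
    weights = [source_weight(rec, auth) for _, rec, auth in sources]
    agree = sum(w for (claim, _, _), w in zip(sources, weights)
                if claim == requirement)
    return agree / sum(weights)

sources = [("permit required", 1.0, 0.9),   # fresh, authoritative ordinance
           ("permit required", 0.5, 1.0),   # older official handbook
           ("no permit", 0.2, 0.3)]         # stale forum-grade source
conf = confidence("permit required", sources)
surfaced = conf >= TAU  # above threshold, so no human review needed here
```

Two recent, authoritative sources easily outvote one stale, low-authority dissenter, which is exactly the behavior the Verifier Agent relies on.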
We also estimate time savings:

\[ \text{Savings} = \frac{T_{\text{manual}} - T_{\text{agent}}}{T_{\text{manual}}} \times 100\% \]
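As a quick worked instance of the formula (the hour figures are hypothetical):

```python
def savings_pct(t_manual, t_agent):
    """Time savings as a percentage of the manual baseline."""
    return (t_manual - t_agent) / t_manual * 100

# e.g. 10 hours of manual permit research reduced to 2.5 hours of review
pct = savings_pct(10.0, 2.5)  # → 75.0
```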
Challenges
- Municipal heterogeneity: every city names forms differently and hides them in odd places. We solved this by schema-first extraction + robust fallbacks.
- Fragile websites: rate limits, broken HTML, and shifting URLs. Headless browsing with automatic retries and snapshot-based citations helped.
- Ambiguity: conflicting requirements across sources. We added the Verifier Agent and a “conflict set” review step.
- Trust & auditability: decision logs, deterministic prompts, and evidence links are non-negotiable for compliance.
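The fragile-websites fix boils down to retry-with-backoff around every fetch. A minimal sketch (the `fetch` callable and parameters are illustrative; the real system wraps a headless-browser page load that also snapshots the page for citations):

```python
import random
import time

def fetch_with_retries(fetch, url, attempts=4, base_delay=1.0):
    """Retry a flaky fetch with exponential backoff plus jitter.

    `fetch` is any callable(url) -> bytes. Delay grows as base_delay * 2**i,
    with uniform jitter to avoid hammering rate-limited municipal sites.
    """
    for i in range(attempts):
        try:
            return fetch(url)
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: surface the failure to the planner
            time.sleep(base_delay * 2 ** i + random.uniform(0, base_delay))
```

On final failure the exception propagates, so the planner can fall back to an alternate source or escalate to a human.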
What I learned
- Agent specialization beats one giant prompt. Narrow tools + a planner are more reliable.
- Observability is everything. Without traces and metrics, “agentic” feels like magic; with them, it’s just software.
- Human checkpoints increase speed. A 2-minute review early avoids 2-week backtracks later.
- Great UX > clever prompts. The Permit Officer screen made the system feel legitimate and safe.
What’s next
- Deeper authority coverage: auto-refresh policies, fee updates, and inspection calendars.
- Autofill integrations: direct e-file where APIs exist; robust RPA where they don’t.
- Learning loop: failed submissions feed back into prompts, patterns, and authority heuristics.
- Cost model: estimate fees + expected inspection count from historicals.
Built with
- Model/Orchestration: Grok, LangGraph (agent graph), function/tool calling
- Observability: Langfuse (traces, evals, dashboards)
- Scraping & Headless: Playwright / Puppeteer, Readability pipelines, PDF parsers
- Data: Postgres (requirements, runs, metrics), S3-compatible blob storage (snapshots)
- APIs: Geocoding (Census/Google), authority directories, email + webhook notifications
- Backend: FastAPI + Celery (queues), Redis (tasks, rate limits)
- UI: Next.js + Tailwind (Permit Officer review, audit trails)
- Security: OAuth for user accounts, encrypted secret storage, signed snapshot URLs
Example prompt→result
Input: “Install 3-ton split HVAC at 397 S Malpais Ln, Flagstaff, AZ 86001.”

Output: Authority = City of Flagstaff; Permit = Mechanical (Residential); Required: application form MF-102, load calc, site plan, contractor license, fee schedule §4.12; inspections = rough-in + final; estimated fees $X; user actions: e-signature + payment; evidence links attached.