The City Permitting AI Agent (CPA) uses artificial intelligence to streamline a laborious but critical government process: it automatically reviews submitted forms for completeness and accuracy before they reach human officers. CPA has ingested a trove of official municipal and business-specific requirements that must be met before a permit can be issued for a new establishment (such as a food truck). It uses this knowledge base to pre-screen new permit requests.
Using a scorecard mechanism, the system highlights errors, missing information, and compliance gaps so that the requester has the greatest chance of permit issuance upon official review. Once an optimal score is achieved, the system submits the application to a human officer, keeping a human in the loop to ensure accountability and oversight. The solution enhances operational efficiency, reduces bottlenecks, and improves the citizen experience by shortening permit turnaround time without compromising regulatory compliance.
Our demo focuses on food truck permits in the city of Denver, Colorado, as a real-life example, but the concept extends readily to city permits of all types around the world.
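The scorecard pre-screening described above can be illustrated with a minimal sketch. This is not the CPA implementation: the field names in `REQUIRED_FIELDS`, the `Scorecard` class, and the `prescreen` function are all hypothetical, and the real system draws its requirements from the ingested municipal documents rather than a hard-coded list. The sketch only shows the general idea of scoring an application, listing the gaps, and holding it back from human review until nothing is missing.

```python
from dataclasses import dataclass, field

# Hypothetical required fields for a food truck permit (assumption for
# illustration only; the real requirements come from the ingested documents).
REQUIRED_FIELDS = (
    "business_name",
    "owner_name",
    "vehicle_license",
    "health_inspection_date",
)


@dataclass
class Scorecard:
    score: float                                      # fraction of required fields present
    missing: list[str] = field(default_factory=list)  # names of the fields still missing

    @property
    def ready_for_human_review(self) -> bool:
        # Hand off to a human officer only when no gaps remain.
        return not self.missing


def prescreen(application: dict) -> Scorecard:
    """Score an application by the share of required fields that are filled in."""
    missing = [
        name
        for name in REQUIRED_FIELDS
        if not str(application.get(name) or "").strip()
    ]
    score = (len(REQUIRED_FIELDS) - len(missing)) / len(REQUIRED_FIELDS)
    return Scorecard(score=score, missing=missing)


if __name__ == "__main__":
    draft = {"business_name": "Mile High Tacos", "owner_name": "A. Rivera"}
    card = prescreen(draft)
    print(card.score, card.missing, card.ready_for_human_review)
```

In this sketch the requester would see the `missing` list as feedback and resubmit; only an application with an empty `missing` list is forwarded to the human reviewer.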
This repository is designed as a comprehensive demonstration platform for building AI applications with Llama Stack. It follows a modular layout with three top-level directories:
city-permitting-agent/
├── kubernetes/ # Complete deployment manifests
├── ui/ # City Permitting Agent app source code
└── docs/ # Further Documentation assets
The repository includes complete Kubernetes/OpenShift deployment manifests:
- llama-serve: vLLM model servers (GPU-accelerated)
- llama-stack: Core Llama Stack orchestration server
- mcp-servers: Model Context Protocol tool servers for Slack, databases, web search, and K8s operations
- Vue UI: Web UI deployment
- Node.js Server: Hosting server for web UI and connectivity to Llama Stack
- observability: Monitoring and metrics collection
- kustomize: Deployment overlays for different environments
Under development
For production deployments with GPU acceleration and high availability:
- OpenShift Cluster 4.18
Make sure you are logged in to the cluster with oc login.
- Navigate to the City Permitting Agent root folder and run ./bootstrap.sh
- When prompted, create the MaaS and Slack Secrets and the Smarty Auth ConfigMap
- Press Enter to continue the bootstrap
This will deploy:
- vLLM model servers with GPU acceleration
- Llama Stack orchestration server
- MCP tool servers for enterprise integrations
- Vue web interface
- Observability and monitoring stack
Comprehensive architecture documentation for the AI-powered permit review system:
- 📋 Detailed System Architecture - Complete system components, data flows, and integration points
- 🏗️ High-Level Architecture Overview - Simplified system overview with technology stack
- 🔧 Technical Component Interactions - Detailed service-level architecture and API specifications
Complete development-to-production flow showing how local development scales to enterprise deployment:
- System Overview Diagram - Full development-to-production flow
- OpenShift Production Architecture - Production deployment details
📚 View All Documentation - Complete documentation index with guides and architecture diagrams
The diagram below shows the secure Llama Stack application architecture deployed on OpenShift, using MCP tools and the Milvus vector database for agentic and RAG workflows:
