⚖️JurAI: Multi-Agent Compliance Judge
💡Problem Statement
Automating geo-regulation compliance with LLMs. Every product rollout risks non-compliance with region-specific laws, exposing companies to legal and reputational damage.
🛠️Functionality and Features
JurAI is a multi-agent compliance pipeline that simulates a jury of AI models. Each juror reviews a product feature, a critic challenges the juror's reasoning, and a judge consolidates the final verdict. The system uses Retrieval-Augmented Generation (RAG) both to ground reasoning in real legislation and to pull additional context from past verdicts.
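The juror, critic, and judge flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `juror`, `critic`, `judge`, and `deliberate` functions, the `needs_review` verdict label, and the sample feature name are all hypothetical stand-ins, and the LLM calls are replaced with plain string handling.

```python
import asyncio

# Hypothetical stand-ins for LLM-backed agents; a real system would call models here.
async def juror(name: str, feature: str) -> dict:
    return {"juror": name, "finding": f"{name} reviewed: {feature}"}

async def critic(report: dict) -> dict:
    # The critic attaches a challenge; the juror's revision step is omitted for brevity.
    return {**report, "critique": f"challenge to {report['juror']}"}

def judge(reports: list[dict]) -> dict:
    # Consolidate all critiqued reports into a single verdict object.
    return {"verdict": "needs_review", "reports": reports}

async def deliberate(feature: str, juror_names: list[str]) -> dict:
    async def one_seat(name: str) -> dict:
        # Each juror is paired with its own critic.
        return await critic(await juror(name, feature))

    # All juror-critic pairs run concurrently; gather preserves input order.
    reports = await asyncio.gather(*(one_seat(n) for n in juror_names))
    return judge(list(reports))

result = asyncio.run(deliberate("teen night-mode curfew", ["juror_a", "juror_b"]))
```

Running the pairs concurrently keeps total latency close to that of the slowest pair rather than the sum of all of them.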
Key Features
1. Dual RAG Context Retrieval
- The first RAG retrieves past verdicts for similar features
- The second RAG retrieves relevant region-specific legislation
2. Jury–Critic–Judge Pipeline
- Multiple Jury Agents analyze the feature independently
- Each Jury Agent has its own Critic Agent that reviews its report, points out weaknesses, and forces revisions
- A Judge Agent merges jury outputs, removes duplicates, and delivers one clean verdict
- Runs five juror–critic pairs in parallel to refine compliance reports iteratively
3. Diversity of Models
- Jurors do not all run on the same LLM (e.g., one on DeepSeek, another on GPT-5 Mini)
- Critics and the Judge can also mix models
- Prevents a single model’s blind spots from dominating the outcome
4. Region-Aware Compliance
- Supports EU and US jurisdictions (California, Florida, Utah, National)
- Uses real ingested legislation for region-specific analysis
5. Structured and Consistent Outputs
- Always produces structured JSON, never free-text
- Every statement backed by a citation for auditability
- Easy for both lawyers and automated pipelines to parse
- A terminology glossary is updated dynamically to keep reasoning consistent
6. Real-Time Streaming
- Deliberation results are streamed live via Server-Sent Events (SSE) for immediate visibility and transparency
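As a rough illustration of the judge's merge step and the structured, citation-backed output listed above, the sketch below merges findings from several jurors, removes duplicates, and serializes the result as JSON. The field names (`statement`, `citation`, `findings`), the `merge_findings` helper, and the placeholder citation "Example Act s.1" are assumptions made for this example. Dropping uncited statements is one possible policy, not necessarily what JurAI does.

```python
import json

def merge_findings(jury_outputs: list[list[dict]]) -> list[dict]:
    """Merge findings from all jurors, skipping duplicates and uncited statements."""
    seen = set()
    merged = []
    for findings in jury_outputs:
        for f in findings:
            if not f.get("citation"):
                continue  # assumed policy: every statement must carry a citation
            # Compare case-insensitively so trivially different copies collapse.
            key = (f["statement"].strip().lower(), f["citation"].strip().lower())
            if key in seen:
                continue
            seen.add(key)
            merged.append({"statement": f["statement"].strip(),
                           "citation": f["citation"].strip()})
    return merged

# Placeholder data, not real legal citations.
jury_outputs = [
    [{"statement": "Feature needs age checks.", "citation": "Example Act s.1"}],
    [{"statement": "feature needs age checks.", "citation": "example act s.1"},
     {"statement": "No issue found.", "citation": ""}],
]
verdict_json = json.dumps({"findings": merge_findings(jury_outputs)}, indent=2)
```

Here the second juror's first finding is treated as a duplicate of the first juror's, and the uncited statement is left out, so the serialized verdict contains a single finding.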
Results
- The results for the given dataset are uploaded as a .csv file in the next section
🧑‍💻Technologies Used
Frontend
- Next.js + TypeScript – interactive web-based UI for submitting features and viewing jury reports
Backend
- FastAPI (Python) – REST API for feature detail processing and jury execution
- sse-starlette – live-streaming of jury decisions
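For readers unfamiliar with Server-Sent Events, the sketch below shows the wire format of a single SSE message. In the real backend, sse-starlette would take care of this framing; the `sse_frame` helper, the `juror_update` event name, and the payload fields are hypothetical and only illustrate what the browser receives.

```python
import json

def sse_frame(event: str, payload: dict) -> str:
    """Format one SSE message: an event line, a data line, then a blank line."""
    # Compact JSON has no newlines, so it fits on a single "data:" line.
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n"

frame = sse_frame("juror_update", {"juror": "juror_a", "status": "done"})
```

The trailing blank line is what tells the client that the message is complete, which lets the UI render each juror update as soon as it arrives.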
AI/ML Core
- Google ADK Agents SDK – orchestration of jurors, critics, and judge
- Google GenAI SDK – content handling & pipeline communication
- OpenAI GPT-5 Mini – final judge + critic
- DeepSeek Chat – juror agent
- Qwen 3 235B, LLaMA 4 Maverick, Moonshot Kimi K2, Gemini 2.5 Flash – diverse juror/critic pairings for cross-model deliberation
- LightRAG – ingestion and retrieval of past cases
- RAG built with LangChain tools – ingestion and retrieval of legislative documents
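The dual-retrieval idea (one lookup over past cases, one over legislation) can be illustrated with a toy in-memory retriever. This does not use LightRAG or LangChain: it swaps in a simple word-overlap score purely to show how two separate stores might feed one context object. All function names and sample texts are hypothetical placeholders.

```python
def overlap_score(query: str, text: str) -> int:
    """Count distinct query words that also appear in the text (toy relevance score)."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve(query: str, store: list[str], k: int = 1) -> list[str]:
    # Rank documents by overlap and keep the top k that match at least one word.
    ranked = sorted(store, key=lambda doc: overlap_score(query, doc), reverse=True)
    return [doc for doc in ranked[:k] if overlap_score(query, doc) > 0]

def build_context(feature: str, past_verdicts: list[str], legislation: list[str]) -> dict:
    # Two independent lookups, mirroring the two RAG stores.
    return {
        "past_verdicts": retrieve(feature, past_verdicts),
        "legislation": retrieve(feature, legislation),
    }

# Placeholder documents, not real verdicts or legal text.
past_verdicts = ["curfew feature for minors flagged for review",
                 "dark mode toggle cleared"]
legislation = ["placeholder clause about notifications to minors at night",
               "placeholder clause about ad transparency"]
context = build_context("night curfew for minors", past_verdicts, legislation)
```

Keeping the two stores separate means a juror's prompt can label which snippets are precedent and which are statute.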
Hosting and Deployment
- Render – hosts the backend server
- Vercel – deploys the frontend web UI
📂Assets
Region-specific legislation files ingested into RAG
- digital_services_act_wiki.txt (EU DSA)
- USCA_SB976.txt (California SB976)
- florida_state_law.txt (Florida privacy law)
- Utah Social Media Regulation Act - Wikipedia.html
- US law on reporting child sexual abuse content to NCMEC.txt
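One simple way to wire these files to the supported jurisdictions is a lookup table keyed by region. The mapping below is an assumption inferred from the file names and the region list given earlier (in particular, pairing "National" with the NCMEC reporting file is a guess). `LEGISLATION_SOURCES` and `sources_for` are hypothetical names.

```python
# Assumed region-to-file mapping, inferred from the asset names above.
LEGISLATION_SOURCES = {
    "EU": ["digital_services_act_wiki.txt"],
    "California": ["USCA_SB976.txt"],
    "Florida": ["florida_state_law.txt"],
    "Utah": ["Utah Social Media Regulation Act - Wikipedia.html"],
    "National": ["US law on reporting child sexual abuse content to NCMEC.txt"],
}

def sources_for(region: str) -> list[str]:
    """Return the legislation files for a region, failing loudly on unknown regions."""
    try:
        return LEGISLATION_SOURCES[region]
    except KeyError:
        raise ValueError(f"unsupported region: {region!r}") from None
```

Raising on an unknown region, rather than returning an empty list, avoids silently producing a verdict that was never grounded in any legislation.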
Reflection
🔨Challenges & Reflection
- Designing a multi-agent debate structure that avoids infinite loops and stays efficient while still refining the reasoning.
- Managing RAG ingestion and retrieval pipelines across different regions and heterogeneous data formats (Wikipedia text, legislation PDFs, HTML).
- Coordinating outputs from five different LLM families into a consistent final verdict.
- Debugging streaming issues between the frontend (Next.js) and backend (FastAPI).
- Debugging deployment-related issues.
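A common way to keep a critic-driven debate from looping forever is to cap the number of revision rounds and stop early once the critic has no objections. The sketch below shows that pattern under stated assumptions: `refine`, `toy_critique`, `toy_revise`, and the `max_rounds` default are hypothetical, and the toy critic only checks for a literal "[citation]" marker.

```python
from typing import Callable

def refine(draft: str,
           critique: Callable[[str], list[str]],
           revise: Callable[[str, list[str]], str],
           max_rounds: int = 3) -> tuple[str, int]:
    """Revise until the critic has no objections or the round budget is spent."""
    rounds = 0
    while rounds < max_rounds:
        objections = critique(draft)
        if not objections:
            break  # critic is satisfied, stop early
        draft = revise(draft, objections)
        rounds += 1
    return draft, rounds

# Toy critic and reviser standing in for LLM calls.
def toy_critique(draft: str) -> list[str]:
    return [] if "[citation]" in draft else ["missing citation"]

def toy_revise(draft: str, objections: list[str]) -> str:
    return draft + " [citation]"

final, used = refine("Feature may need age checks.", toy_critique, toy_revise)
```

The hard cap bounds both latency and API cost per juror, at the price of occasionally returning a report the critic still objects to.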
🚀Next Steps
- Expand to More Regions – extend compliance coverage beyond US/EU to APAC, LATAM, and other emerging regulatory markets.
- Benchmark Against Human Reviews – compare jury verdicts with real compliance lawyer assessments to measure accuracy and reliability.
- Batch Processing – support bulk feature submissions to reduce latency and lower API costs.
- Interactive Feedback Loop – allow users to refine verdicts by providing additional context, clarifications, or corrections.
- Customizable Jury – let users select which LLMs (OpenAI, DeepSeek, Qwen, LLaMA, Moonshot, Gemini) power their juries and critics directly from the UI.
Built With
- fastapi
- google-adk
- langchain
- lightrag
- nextjs
- python
- render
- typescript
- vercel