Production-grade governance toolkit for enterprise LLM deployments -- RBAC, prompt-injection detection, PII redaction, audit logging, eval harness, and multi-cloud LLM routing, all wired into a single FastAPI + React application with Kubernetes-ready infrastructure.
All rollout data and review scenarios are synthetic. The goal is a reviewable, runnable validation kit that demonstrates enterprise-grade LLM governance patterns end to end.
Demo video: https://youtu.be/yMq03b0js0E
Every request passes through four governance layers before reaching the LLM. Each layer can short-circuit the request with a policy refusal, and every decision is audit-logged.
```mermaid
flowchart TB
    subgraph Client["Client Layer"]
        UI["React + Vite UI"]
        CLI["API / CLI Consumer"]
    end
    subgraph Gateway["API Gateway"]
        ING["TLS Ingress"] --> AUTH["JWT / OIDC Auth"]
        AUTH --> RL["Rate Limiter"]
    end
    subgraph Governance["Governance Pipeline -- 4 Layers"]
        direction TB
        RBAC["1. RBAC Enforcement<br/><i>Employee / Ops / Admin</i>"]
        INJ["2. Prompt Injection Detection<br/><i>keyword + regex heuristics</i>"]
        PII["3. PII Redaction<br/><i>email, phone, ID masking</i>"]
        SAFE["4. Safety Policy Engine<br/><i>22 regex patterns, ReDoS-safe</i>"]
        RBAC --> INJ --> PII --> SAFE
    end
    subgraph Core["Application Core"]
        RAG["RAG Retrieval<br/><i>ChromaDB + hash embeddings</i>"]
        LLM["LLM Router<br/><i>OpenAI / Ollama / Bedrock / Stub</i>"]
        TOOLS["Tool Executor<br/><i>allowlisted tools only</i>"]
        RAG --> LLM
        LLM --> TOOLS
    end
    subgraph Observability["Observability + Persistence"]
        AUDIT["Audit Logger<br/><i>SHA-256 enterprise hashing</i>"]
        PROM["Prometheus Metrics<br/><i>latency, tokens, cost, policy</i>"]
        OTEL["OpenTelemetry<br/><i>OTLP traces</i>"]
        DD["Datadog Integration"]
    end
    subgraph DataPlatform["Data Platform Integration"]
        SF["Snowflake<br/><i>eval + audit persistence</i>"]
        DB["Databricks<br/><i>MLflow + Delta Lake</i>"]
        EVAL["Eval Harness<br/><i>baseline diffs, red-team</i>"]
    end
    UI & CLI --> ING
    RL --> RBAC
    SAFE --> RAG
    LLM --> AUDIT & PROM & OTEL
    AUDIT --> SF & DB
    OTEL --> DD
    EVAL --> SF & DB
    style Governance fill:#fff3e0,stroke:#e65100
    style Observability fill:#e8f5e9,stroke:#2e7d32
    style DataPlatform fill:#e3f2fd,stroke:#1565c0
```
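The short-circuit behavior of the four governance layers can be sketched as a chain of checks, each of which may refuse the request before it ever reaches the LLM. This is an illustrative sketch, not the kit's actual modules: the function names, roles, and patterns here are assumptions, and a real deployment would use the far richer heuristics in `injection.py`, `redaction.py`, and `safety.py`.

```python
import re
from dataclasses import dataclass

@dataclass
class Decision:
    allowed: bool
    layer: str = ""
    reason: str = ""

def check_rbac(user_role: str, permitted: set) -> Decision:
    # Layer 1: refuse if the caller's role is not in the permitted set.
    if user_role not in permitted:
        return Decision(False, "rbac", f"role {user_role!r} not permitted")
    return Decision(True)

def check_injection(prompt: str) -> Decision:
    # Layer 2: crude keyword heuristics for known injection phrasings.
    for marker in ("ignore previous instructions", "reveal your system prompt"):
        if marker in prompt.lower():
            return Decision(False, "injection", f"matched {marker!r}")
    return Decision(True)

def redact_pii(prompt: str) -> str:
    # Layer 3: mask emails; a real engine also handles phones and IDs.
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", prompt)

def govern(user_role: str, prompt: str):
    """Run the checks in order; the first refusal short-circuits the request."""
    for check in (lambda: check_rbac(user_role, {"employee", "ops", "admin"}),
                  lambda: check_injection(prompt)):
        decision = check()
        if not decision.allowed:
            return decision, ""          # never reaches the LLM; decision is audit-logged
    return Decision(True), redact_pii(prompt)

decision, safe_prompt = govern("employee", "Contact me at a.b@corp.com")
print(decision.allowed, safe_prompt)     # True Contact me at [EMAIL]
```

The key property is that every layer returns a structured decision rather than raising, so the caller can log the refusing layer and reason before returning a policy refusal to the client.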
| Layer | Technologies |
|---|---|
| Backend API | Python 3.11+, FastAPI, Uvicorn, Pydantic |
| Frontend | React 18, Vite |
| Auth | JWT (HS256) with key rotation, OIDC (RS256) via JWKS discovery |
| RAG | ChromaDB, deterministic hash embeddings, in-memory fallback |
| LLM Providers | OpenAI, Ollama, AWS Bedrock, stub (offline deterministic) |
| Eval | Custom harness -- accuracy, groundedness, helpfulness, safety scoring |
| Storage | SQLite, Chroma, JSONL event logs |
| Data Platform | Snowflake (eval + audit), Databricks (MLflow + Delta Lake) |
| Observability | Prometheus, Grafana, OpenTelemetry (OTLP), Datadog-ready |
| Infrastructure | Docker Compose, Kubernetes (HPA, TLS ingress, AlertManager), Terraform (AWS + GCP) |
| CI/CD | GitHub Actions (CI, security scan, Docker publish, Cloudflare Pages deploy) |
| Security | pip-audit, Bandit, Trivy container scanning |
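The "deterministic hash embeddings" in the RAG row admit a compact illustration. The sketch below is an assumption about the general technique, not the kit's implementation: each token's SHA-256 digest selects a bucket in a fixed-size vector, so the same text always embeds identically, which keeps retrieval reproducible fully offline.

```python
import hashlib

def hash_embed(text: str, dim: int = 64) -> list:
    # Deterministic: each token's SHA-256 digest picks a bucket to increment.
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    # L2-normalize so dot products behave like cosine similarity.
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]

# Identical input always yields the identical vector -- no model download needed.
assert hash_embed("rbac audit log") == hash_embed("rbac audit log")
```

The trade-off is that hash embeddings capture lexical overlap only, not semantics; they serve as a zero-dependency fallback rather than a replacement for learned embeddings.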
```sh
git clone https://github.com/KIM3310/enterprise-llm-adoption-kit.git
cd enterprise-llm-adoption-kit
make demo-local    # auto-selects Ollama if available, otherwise stub
```

```sh
# 1. Backend
cd app/backend
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
python3 -m app    # starts on http://localhost:8000

# 2. Frontend (separate terminal)
cd app/frontend
npm install && npm run dev    # starts on http://localhost:5173

# 3. Verify everything
make verify    # syntax, deps, pytest, smoke, frontend build
```

```sh
cd infra && docker-compose up --build
# Backend: http://localhost:8000  Frontend: http://localhost:5173
```

```sh
kubectl create namespace llm-adoption
kubectl apply -f infra/k8s/secret.yaml
kubectl apply -f infra/k8s/
```

| Capability | Description | Code |
|---|---|---|
| RBAC | Role-to-access-group mapping enforced at RAG retrieval time (Employee / Ops / Admin) | app/backend/app/rbac.py |
| Prompt Injection Detection | Keyword heuristics flag known injection patterns before LLM invocation | app/backend/app/injection.py |
| Safety Policy Engine | 22 regex patterns targeting exfiltration, escalation, and adversarial prompts (ReDoS-safe) | app/backend/app/safety.py |
| PII Redaction | Email, phone, and ID masking with per-category event tracking | app/backend/app/redaction.py |
| Audit Logging | Structured JSON logs with SHA-256 hashing in enterprise mode and auto-retention pruning | app/backend/app/audit.py |
| RAG Retrieval | ChromaDB + deterministic hash embeddings with RBAC-filtered document access | app/backend/app/rag.py |
| Eval Harness | Accuracy, groundedness, helpfulness, safety scoring with baseline diffs and red-team datasets | evals/runner/ |
| LLMOps Metrics | Request latency, token counts, cost tracking, policy events via Prometheus | app/backend/app/metrics.py |
| Circuit Breaker | LLM provider failure isolation with configurable threshold and cooldown | app/backend/app/llm_adapter.py |
| Multi-provider LLM | Hot-swappable runtime config across OpenAI, Ollama, Bedrock, and stub backends | app/backend/app/llm_adapter.py |
| Snowflake Integration | Eval results and audit logs persisted to Snowflake (env-var gated) | app/backend/app/snowflake_adapter.py |
| Databricks Integration | MLflow experiment tracking and Delta audit tables in Unity Catalog | app/backend/app/databricks_adapter.py |
| OpenTelemetry | OTLP trace export with Datadog-ready integration | app/backend/app/telemetry.py |
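The circuit-breaker row describes provider failure isolation with a configurable threshold and cooldown. A minimal sketch of that pattern follows; the state handling and parameter names are illustrative assumptions, not code taken from `llm_adapter.py`.

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; allow a probe after `cooldown` seconds."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True                      # closed: requests flow normally
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None            # half-open: allow one probe request
            self.failures = 0
            return True
        return False                         # open: short-circuit without calling the provider

    def record(self, success: bool) -> None:
        if success:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()

breaker = CircuitBreaker(threshold=2, cooldown=0.1)
breaker.record(False); breaker.record(False)
print(breaker.allow())   # False: circuit is open, provider calls are skipped
```

In a multi-provider router, each provider typically gets its own breaker instance, so a failing OpenAI endpoint does not block requests that can be served by Ollama or the stub backend.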
| Endpoint | Description |
|---|---|
| POST /auth/login | Issue JWT (local_jwt or OIDC mode) |
| POST /uc1/architecture | LLM-assisted architecture query (RBAC-gated RAG) |
| POST /uc2/log-intel | Log intelligence and root cause generation |
| GET /audit/summary | Governance and audit log summary |
| GET /metrics | Prometheus metrics (requests, latency, tokens, cost, policy events) |
| GET /health | Runtime posture and diagnostics |
| GET /ops/service-brief | Operational service brief and readiness summary |
| GET /ops/resource-pack | Resource pack for review and evidence |
- Snowflake -- set `SNOWFLAKE_ACCOUNT` to activate. Stores eval results and audit logs; supports `query_eval_history()`, `query_audit_history()`, and aggregate reporting.
- Databricks -- set `DATABRICKS_HOST` to activate. MLflow experiment tracking per eval run; Delta audit tables in Unity Catalog; authenticates via databricks-cli or service-principal OAuth.
- See `.env.example` for the full configuration surface.
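Both adapters follow the same env-gating pattern: they activate only when their gating variable is present and stay inert otherwise, so the kit runs fully offline by default. A minimal sketch of the pattern; the class name, gate variable, and `persist` method here are illustrative assumptions, not the actual adapter code.

```python
import os

class EnvGatedAdapter:
    """Active only when the gating environment variable is set; otherwise every call is a no-op."""

    def __init__(self, gate_var: str):
        self.gate_var = gate_var

    @property
    def enabled(self) -> bool:
        return bool(os.environ.get(self.gate_var))

    def persist(self, record: dict) -> bool:
        if not self.enabled:
            return False    # silently skip when the platform is not configured
        # ... a real adapter would open a connection and write `record` here ...
        return True

snowflake = EnvGatedAdapter("SNOWFLAKE_ACCOUNT")
print(snowflake.enabled)    # False unless SNOWFLAKE_ACCOUNT is set in your environment
```

Gating on a single well-known variable keeps the decision observable (it shows up in `/health`-style diagnostics) and avoids import-time failures when optional platform SDKs are absent.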
- Datadog-ready resource pack: `docs/datadog/README.md`
- Existing env hooks already reserve a Datadog integration lane in `.env.example`
- Current state: asset sync and OTLP wiring are prepared, but live tenant integration is intentionally disabled by default
- Best use: show how enterprise LLM governance, audit, latency, and rollout readiness would be observed in one operator-facing Datadog surface
```
enterprise-llm-adoption-kit/
  app/
    backend/                 # FastAPI backend (RBAC, RAG, safety, audit, LLM adapters)
      app/
        main.py              # FastAPI app with governance middleware
        rbac.py              # Role-to-access-group mapping
        safety.py            # 22-pattern safety policy engine
        injection.py         # Prompt injection detection
        redaction.py         # PII redaction engine
        audit.py             # Structured audit logging with SHA-256 hashing
        auth.py              # JWT/OIDC authentication
        llm_adapter.py       # Multi-provider LLM router with circuit breaker
        rag.py               # ChromaDB RAG with RBAC-filtered retrieval
        snowflake_adapter.py
        databricks_adapter.py
        metrics.py           # Prometheus metric definitions
        telemetry.py         # OpenTelemetry instrumentation
        config.py            # 60+ env vars, type-safe parsing
    frontend/                # React + Vite UI
  evals/
    runner/                  # Eval harness (run_eval, eval_gate, baseline)
    datasets/                # Test datasets (initial, red-team, Korean)
    reports/                 # Generated eval reports and diffs
  infra/
    k8s/                     # Kubernetes manifests (HPA, TLS, AlertManager)
    aws/terraform/           # AWS ECS + ALB Terraform module
    gcp/terraform/           # GCP Terraform module
    docker-compose.yml       # Local multi-service orchestration
    monitoring/              # Grafana dashboard + AlertManager rules
  tests/                     # 30+ test files, 84% backend coverage
  docs/                      # Architecture, ops, blueprint, Datadog, evals docs
  scripts/                   # Demo runners, quality gates, release ops
```
- Best fit roles: Solution Architect, Applied AI Engineer, Enterprise AI / Field Engineering
- Strongest public proof: governance pipeline, eval harness, observability surfaces, and deployment-ready backend/frontend split
- What is real here: RBAC, safety pipeline, audit logging, metrics, integration adapters, CI/CD, and deployment scaffolding
- What is bounded here: review cases and documents are synthetic, and Snowflake / Databricks / Bedrock integrations are env-gated
- Verified on: 2026-04-07
- Command: `make verify`
- Outcome: passed locally; syntax check, dependency check, pytest, smoke diagnostics, and frontend production build completed with 84.20% backend coverage
- Notes: `make verify` bootstraps the Python 3.11 backend venv and installs missing frontend dependencies automatically; the Snowflake adapter and Snowflake-focused service-brief tests were rerun successfully from `app/backend/.venv`
GHCR Docker image published on every push to main. Security scan (pip-audit, bandit, Trivy) runs on schedule. Frontend auto-deploys to Cloudflare Pages.
For governed NL-to-SQL analytics, see Nexus-Hive. For the data pipeline layer, see lakehouse-contract-lab.
MIT