A Zero Trust Policy Enforcement and Automated Misconfiguration Remediation System for Kubernetes.
Capstone project — Pace University, Seidenberg School of CSIS.
Team: Alister Rodrigues, Pranav Karelia, Siddhant Patel.
A Kubernetes-native operator that continuously audits RBAC and NetworkPolicy configurations against a formally defined Zero Trust baseline. The system detects eight violation types, automatically remediates low-risk misconfigurations (NP-001 and RBAC-001 at LOW risk), escalates higher-risk ones for human review, and exposes Prometheus metrics for observability.
All three implementation phases complete. Post-audit hardening applied.
- Phase 1: ZeroTrustPolicy CRD, four detectors (RBAC-001/002/003, NP-001), structured JSON logging, envtest integration tests
- Phase 2: Remediation engine, NP-001 + RBAC-001 autofixes, ConfigMap audit log, time-window rate limiting, dry-run mode
- Phase 3: Prometheus metrics, violation deduplication, event-driven watches, formal evaluation (all 5 metrics measured)
- Post-audit: Four additional detectors (RBAC-004/005/006, NP-002), independent denyWildcardResources CRD field, AuditComplete status conditions, scalability fix in binding lookup, expanded integration tests
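The time-window rate limiting listed under Phase 2 can be pictured with a small sketch. This is not the controller's code: the type `windowLimiter`, its constructor, the helper `at`, and the fixed-window semantics are all assumptions made for illustration. The idea is to cap the number of automatic fixes per window so that a burst of violations cannot trigger unbounded changes.

```go
package main

import (
	"fmt"
	"time"
)

// windowLimiter allows at most limit remediations per window.
// Hypothetical type: the name and fixed-window behavior are assumptions.
type windowLimiter struct {
	limit       int
	window      time.Duration
	windowStart time.Time
	count       int
}

// newWindowLimiter builds a limiter with a window given in seconds.
func newWindowLimiter(limit, windowSeconds int) *windowLimiter {
	return &windowLimiter{
		limit:  limit,
		window: time.Duration(windowSeconds) * time.Second,
	}
}

// allow reports whether one more remediation may run at time now and
// records it if so. A new window starts once the previous one has elapsed.
func (l *windowLimiter) allow(now time.Time) bool {
	if l.windowStart.IsZero() || now.Sub(l.windowStart) >= l.window {
		l.windowStart = now
		l.count = 0
	}
	if l.count >= l.limit {
		return false
	}
	l.count++
	return true
}

// at returns a fixed base time plus sec seconds, keeping the demo deterministic.
func at(sec int) time.Time {
	base := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	return base.Add(time.Duration(sec) * time.Second)
}

func main() {
	l := newWindowLimiter(2, 60)
	fmt.Println(l.allow(at(0)))  // first fix in the window
	fmt.Println(l.allow(at(10))) // second fix in the window
	fmt.Println(l.allow(at(20))) // over the limit, rejected
	fmt.Println(l.allow(at(61))) // window elapsed, allowed again
}
```

Passing the clock value in as an argument, rather than calling `time.Now()` inside `allow`, keeps the limiter easy to test.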
See evaluations/results.md for full evaluation results.
| ID | Description | Risk Levels | Autofix? |
|---|---|---|---|
| RBAC-001 | Wildcard verb in ClusterRole | LOW / HIGH / CRITICAL | Yes (LOW only) |
| RBAC-002 | Wildcard resource in ClusterRole | HIGH | No |
| RBAC-003 | cluster-admin bound to non-whitelisted subject | HIGH / CRITICAL | No |
| RBAC-004 | Wildcard verb in namespaced Role | HIGH | No |
| RBAC-005 | Wildcard resource in namespaced Role | HIGH | No |
| RBAC-006 | Non-system ClusterRoleBinding candidate for namespace-scoping | LOW | No |
| NP-001 | Namespace missing default-deny ingress NetworkPolicy | LOW / HIGH / CRITICAL | Yes (LOW only) |
| NP-002 | Namespace missing default-deny egress NetworkPolicy | HIGH / CRITICAL | No |
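The Autofix column boils down to one rule: only NP-001 and RBAC-001 are fixed automatically, and only at LOW risk. A minimal sketch of that rule follows. The names `Risk`, `Action`, `autofixable`, and `decide` are hypothetical and do not mirror the functions in decision.go, and the sketch leaves out dry-run mode and rate limiting.

```go
package main

import "fmt"

// Risk is a hypothetical risk-level type; the project's real type may differ.
type Risk string

const (
	Low      Risk = "LOW"
	High     Risk = "HIGH"
	Critical Risk = "CRITICAL"
)

// Action is the outcome of a remediation decision.
type Action string

const (
	Autofix  Action = "autofix"
	Escalate Action = "escalate"
)

// autofixable lists the violation IDs the table marks "Yes (LOW only)".
var autofixable = map[string]bool{
	"NP-001":   true,
	"RBAC-001": true,
}

// decide returns Autofix only for an autofixable ID at LOW risk.
// Every other combination is escalated for human review.
func decide(id string, risk Risk) Action {
	if autofixable[id] && risk == Low {
		return Autofix
	}
	return Escalate
}

func main() {
	fmt.Println(decide("NP-001", Low))      // autofix
	fmt.Println(decide("RBAC-001", High))   // escalate
	fmt.Println(decide("RBAC-006", Low))    // escalate: LOW, but no autofix exists
	fmt.Println(decide("NP-002", Critical)) // escalate
}
```

Defaulting to escalation means a newly added violation type is never auto-remediated unless someone explicitly adds it to the allowlist.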
- Architecture — system design, components, event flow, CRD schema
- Remediation Model — violation types, risk levels, decision matrix, safety mechanisms
- Threat Model — STRIDE analysis, trust boundaries, known limitations
- Evaluation Plan — metrics definitions and test scenarios
- Evaluation Results — measured results for all 5 metrics
- Roadmap — phased deliverables
zerotrust-k8s/
├── api/v1alpha1/ # ZeroTrustPolicy CRD Go types
├── cmd/main.go # Operator entry point
├── config/
│ ├── crd/bases/ # Generated CRD YAML (do not edit)
│ ├── rbac/ # Generated RBAC (do not edit)
│ └── samples/ # cluster-baseline ZeroTrustPolicy CR
├── docs/ # Architecture, threat model, remediation model
├── evaluations/
│ ├── scenarios/ # Evaluation scenario shell scripts
│ └── results.md # Formal evaluation results
├── internal/controller/
│ ├── auditlog.go # ConfigMap audit log writer (batch, key-rollover)
│ ├── decision.go # Remediation decision engine (Decide + decisionFromMatrix)
│ ├── detection.go # All detectors: RBAC-001–006, NP-001–002
│ ├── metrics.go # Prometheus counters and histograms
│ ├── remediation.go # NP-001 and RBAC-001 autofix implementations
│ ├── types.go # ViolationEvent, ViolationKey, AuditEntry types
│ ├── violation_log.go # Structured zerolog violation logging (stdout)
│ └── zerotrustpolicy_controller.go # Main reconcile loop, watches, rate limiting
├── setup.sh # Session setup script (run before make run)
├── Makefile # Build, test, install targets
└── .cursorrules # AI coding assistant context
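The tree lists a ViolationKey type in types.go, and Phase 3 mentions violation deduplication. One plausible way the two fit together is sketched below. The fields of `ViolationKey`, the `dedup` type, and its `cycle` method are assumptions for illustration, not the project's actual definitions. Each reconcile pass reports only violations that were absent from the previous pass, which would match a steady-state log of new_violations: 0.

```go
package main

import "fmt"

// ViolationKey identifies one violation on one resource.
// The fields are assumptions; the real type lives in types.go.
type ViolationKey struct {
	ID        string // e.g. "NP-001"
	Namespace string // empty for cluster-scoped resources
	Name      string
}

// dedup remembers which violations were present in the previous cycle.
type dedup struct {
	seen map[ViolationKey]bool
}

// cycle takes the violations found in one reconcile pass and returns only
// those not present in the previous pass. Keys missing from current are
// forgotten, so a violation that is fixed and later reappears is reported again.
func (d *dedup) cycle(current []ViolationKey) []ViolationKey {
	next := make(map[ViolationKey]bool, len(current))
	var fresh []ViolationKey
	for _, k := range current {
		if !d.seen[k] && !next[k] {
			fresh = append(fresh, k)
		}
		next[k] = true
	}
	d.seen = next
	return fresh
}

func main() {
	d := &dedup{}
	v := ViolationKey{ID: "NP-001", Namespace: "demo", Name: "demo"}

	fmt.Println(len(d.cycle([]ViolationKey{v}))) // first sighting is new
	fmt.Println(len(d.cycle([]ViolationKey{v}))) // unchanged, nothing new
	fmt.Println(len(d.cycle(nil)))               // violation resolved
	fmt.Println(len(d.cycle([]ViolationKey{v}))) // reappears, new again
}
```

Using a comparable struct as the map key avoids building string keys by hand and keeps cluster-scoped and namespaced resources distinct.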
- Go 1.21+
- kubectl
- minikube
- Docker Desktop (required for minikube Docker driver on Mac M3)
- Kubebuilder CLI v4
- controller-gen
Step 1 — Start minikube (once per machine session):
minikube start --driver=docker
Step 2 — Session setup and start controller:
git pull
./setup.sh
make run
setup.sh creates required namespaces, installs CRDs, applies the cluster-baseline CR, and resets the audit log ConfigMap. make run starts the controller. Wait for the reconcile cycle summary log to settle at new_violations: 0, which indicates steady state.
Step 3 — Run evaluation scenarios (in a second terminal):
cd evaluations/scenarios
./01-detect-np001.sh # NP-001 detection + remediation latency
./02-detect-rbac001.sh # RBAC-001 detection + remediation latency
./03-false-positive.sh # False positive rate
./04-rate-limit.sh # Rate limit enforcement
./05-availability.sh # Workload availability impact
View Prometheus metrics:
curl http://localhost:8080/metrics | grep ztk8s
Inspect audit log:
kubectl get configmap ztk8s-audit-log -n zerotrust-system -o jsonpath='{.data.audit\.log}' | python3 -m json.tool --no-ensure-ascii 2>/dev/null | head -100
Check ZeroTrustPolicy status conditions:
kubectl describe zerotrustpolicy cluster-baseline