Joint Optimization of Fairness, Privacy, Explainability & Accountability in AI-Based Cybersecurity
EAGF is a reproducible research framework that combines differential privacy (DP-SGD), fairness regularization (false positive rate parity), explainability (SHAP), and audit logging into a unified governance pipeline. Evaluated on the real-world Edge-IIoTset (IEEE 2022) for IoT anomaly detection with multi-objective Pareto trade-off analysis.
Key Focus: Cybersecurity for IoT networks with resource-constrained devices. EAGF enables governance-aware AI deployment with minimal system overhead.
- Real-world dataset integration: Edge-IIoTset (IEEE 2022, 150K samples, 40 network flow features)
- Multi-objective governance: joint optimization of four pillars: fairness, privacy, clarity, accountability
- Trust Index metric: composite governance score for model selection and comparison
- Pareto trade-off analysis: quantify accuracy-fairness-privacy trade-offs with front visualization
- Reproducible pipeline: deterministic execution, fixed seeds, publication-ready results
Dataset: Edge-IIoTset (150K samples, 3 protocol-type protected groups)
Baselines: Unregulated + Joint DP+Fair
Statistical Rigor: 5 independent seeds with 95% CI
| Metric | Baseline | EAGF | Δ | Improvement |
|---|---|---|---|---|
| Accuracy | 0.6481 ± 0.0251 | 0.6650 ± 0.0079 | +0.0168 | +2.6% |
| FPR Parity | 0.4931 ± 0.0849 | 0.7709 ± 0.0573 | +0.2779 | +56.4% |
| Clarity | 0.6918 ± 0.0432 | 0.7390 ± 0.0548 | +0.0472 | +6.8% |
| Privacy | 0.2475 ± 0.0030 | 0.2482 ± 0.0025 | +0.0007 | Preserved |
| Accountability | 0.0000 ± 0.0000 | 0.6667 ± 0.0000 | +0.6667 | Full coverage |
| Trust Index | 0.3581 ± 0.0129 | 0.6062 ± 0.0108 | +0.2481 | +69.3% |
- Fairness breakthrough: FPR parity improved +56.4% across protocol-type groups (web, IoT MQTT, misc)
- Trust Index surge: composite governance metric +69.3%, indicating strong multi-objective alignment
- Privacy guarantee: differential privacy (ε=2.4) maintained with negligible DP-ε change
- Edge deployment ready: +0.2 ms latency (~11%), +5.8 MB memory; suitable for constrained IoT
- Calibration stable: ECE and Brier comparable (±0.05), no metric gaming
```
eagf/
├── README.md                          # This file
├── requirements.txt                   # Dependencies (numpy, pandas, scikit-learn, fairlearn, pyyaml)
├── setup.py                           # Package setup
│
├── run_eagf.py                        # Single-seed entry point
├── run_full_pipeline.py               # Multi-seed experiment runner (MAIN)
│
├── src/
│   ├── training/
│   │   ├── eagf_trainer.py            # Main EAGF training loop with governance
│   │   ├── fairness_loss.py           # Fairness penalty (FPR parity)
│   │   └── pareto_trainer.py          # Pareto front exploration
│   │
│   ├── evaluation/
│   │   ├── baseline.py                # Unregulated baseline + Joint DP+Fair
│   │   ├── ablation.py                # Single-pillar ablation study
│   │   ├── report_generator.py        # Multi-seed report + statistics
│   │   ├── audit_logger.py            # Compliance audit trail
│   │   ├── benchmark_suite.py         # System metrics (latency, memory, energy)
│   │   └── statistics.py              # 95% CI, statistical tests
│   │
│   ├── metrics/
│   │   ├── fairness.py                # FPR parity, recall parity, group metrics
│   │   ├── privacy.py                 # DP-SGD evaluation, privacy accounting
│   │   ├── clarity.py                 # SHAP-based explainability
│   │   ├── accountability.py          # Audit coverage, compliance scoring
│   │   └── trust_index.py             # Composite Trust Index aggregation
│   │
│   ├── utils/
│   │   ├── data_loader.py             # Generic dataset loading
│   │   ├── edge_iiot_loader.py        # Edge-IIoTset specific (protocol_type grouping)
│   │   ├── real_data_loader.py        # Real dataset pipeline
│   │   ├── preprocessing.py           # Feature engineering, normalization
│   │   ├── reiot_simulator.py         # RE-IoT synthetic simulator (optional)
│   │   ├── ahp.py                     # Analytic Hierarchy Process (Trust Index weights)
│   │   └── visualisation.py           # Pareto, trade-off plots
│   │
│   └── baselines/
│       ├── aif360_dp_pipeline.py      # AIF360 fairness baseline
│       └── joint_dp_fair_baseline.py  # Combined DP + fairness baseline
│
├── configs/
│   ├── reiot_real.yaml                # Main: Edge-IIoTset + EAGF governance (RECOMMENDED)
│   ├── reiot_default.yaml             # Alternative RE-IoT config
│   ├── biometric_default.yaml         # Biometric (secondary validation)
│   ├── biometric_tuned_auto.yaml      # Tuned biometric
│   ├── eagf_thresholds.yaml           # Governance thresholds
│   └── compliance_checklist*.yaml     # Compliance templates
│
├── data/
│   ├── README.md                      # Data documentation
│   └── real_iot/
│       └── edge_iiot.csv              # Edge-IIoTset (150K rows, 40 features) [USER PROVIDED]
│
├── notebooks/
│   ├── 01_eagf_demo.ipynb             # Quick start demo
│   ├── 02_statistical_analysis.ipynb  # Multi-seed statistics
│   ├── 03_reiot_fairness.ipynb        # Fairness deep-dive (protocol_type groups)
│   ├── 04_pareto_front.ipynb          # Pareto front visualization
│   └── 05_trust_index_sensitivity.ipynb # Sensitivity analysis
│
├── figures/
│   ├── pareto_front.png               # Accuracy vs. fairness vs. privacy
│   ├── ti_vs_latency.png              # Trust Index vs. inference latency
│   └── ablation_comparison.png        # Single-pillar vs. multi-pillar
│
├── docs/
│   ├── metric_definitions.md          # Detailed metric documentation
│   ├── regulatory_mapping.md          # Compliance + GDPR/CCPA alignment
│   └── reproducibility.md             # Detailed reproducibility steps
│
├── results/
│   ├── final_report.txt               # MAIN DELIVERABLE: aggregated results
│   ├── main_results.csv               # Baseline, EAGF, Joint metrics (5 seeds)
│   ├── pareto_results.csv             # Pareto exploration (25 runs)
│   └── [seed-specific subdirs]/       # Individual seed outputs
│
├── scripts/
│   ├── run_all.sh                     # Full pipeline (Edge-IIoT + ablation)
│   ├── run_reiot.sh                   # Edge-IIoTset only
│   ├── run_baseline.sh                # Baseline only
│   ├── run_pareto_search.sh           # Pareto front exploration
│   ├── sweep_three_stage.py           # Hyperparameter sweep
│   └── verify_metrics.py              # Metric validation
│
├── tests/
│   ├── test_data.py                   # Data loading & preprocessing
│   ├── test_metrics.py                # Metric computation
│   ├── conftest.py                    # Pytest fixtures
│   └── run_tests.py                   # Test runner
│
├── Dockerfile                         # Container setup
├── environment.yml                    # Conda environment (optional)
├── CONTRIBUTING.md                    # Contribution guidelines
├── LICENSE                            # MIT License
├── CHANGELOG.md                       # Version history
└── CITATION.cff                       # Citation metadata (CFF format)
```
Multi-objective optimization frontier showing EAGF solutions (orange) vs. baseline (blue) across accuracy-fairness-privacy space.
System efficiency comparison: EAGF maintains high governance (TI=0.61) with minimal latency overhead (+0.2ms/sample).
Impact of each governance pillar on Trust Index. Multi-pillar integration significantly outperforms single-pillar approaches.
Run the notebooks directly in Google Colab without local setup.

Prerequisites:
- Python 3.9+
- pip
```shell
git clone https://github.com/aliakarma/eagf.git
cd eagf
python -m venv .venv
# Windows
.venv\Scripts\activate
# Linux / macOS
source .venv/bin/activate
pip install -r requirements.txt
```

Dataset: ML-EdgeIIoT-dataset.csv (real-world IoT anomaly detection)
Source: IEEE Access 2022 (Ferrag et al.)
Size: ~78 MB (157.8K raw rows)
Features: 40 network flow + protocol-specific attributes
Labels: Normal vs. Attack (imbalanced: 23.1K vs. 126.9K)
Note: EAGF uses real Edge-IIoTset data only. No synthetic fallback.
Option A: Manual Download (Recommended)
- Download `ML-EdgeIIoT-dataset.csv` from IEEE DataPort
- Extract and place at: `data/real_iot/edge_iiot.csv`

Option B: Verify Existing Data

If you already have the dataset:

```shell
# Check file size (~78 MB)
ls -lh data/real_iot/edge_iiot.csv
```

| File | Description | Format |
|---|---|---|
| results/final_report.txt | Main deliverable: aggregated metrics, statistics, validation gates | TXT |
| results/main_results.csv | Summary table: baseline, EAGF, Joint DP+Fair across all metrics | CSV |
| results/pareto_results.csv | Pareto search results (25 multi-objective runs with trade-off scores) | CSV |
| figures/pareto_front.png | 3D visualization: accuracy vs. fairness vs. privacy | PNG |
| figures/ti_vs_latency.png | 2D scatter: Trust Index vs. inference latency | PNG |
| figures/ablation_comparison.png | Bar chart: pillar ablation study (single vs. multi-pillar) | PNG |
| [seed-specific]/predictions.json | Per-sample predictions, confidences, fairness group info | JSON |
| [seed-specific]/metrics.json | Detailed metrics for each seed | JSON |
Validate the entire pipeline in ~5 minutes using a single seed:
```shell
python run_full_pipeline.py \
    --real_dataset edge_iiot \
    --config configs/reiot_real.yaml \
    --seeds 42 \
    --fast
```

What happens:
- Loads Edge-IIoTset and validates data
- Trains baseline + EAGF + Joint DP+Fair models (1 seed)
- Computes fairness, privacy, clarity, accountability metrics
- Generates `results/final_report.txt` with a summary
- Produces figures in `figures/`

Expected runtime: ~5 min on CPU (Intel Core i7)
Reproduce final results with 5 independent seeds (statistical rigor):
```shell
python run_full_pipeline.py \
    --real_dataset edge_iiot \
    --config configs/reiot_real.yaml \
    --seeds 42 43 44 45 46
```

Output:
- Mean Β± std for all governance metrics
- 95% confidence intervals
- Pareto front visualization (25 multi-objective runs)
- Final report with ablation analysis
- Seed-specific detailed logs
Expected runtime: ~10-15 min on CPU
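The mean ± 95% CI aggregation over seeds can be sketched as below; the `ci95` helper and the sample Trust Index values are illustrative assumptions, not the actual API of the repo's `statistics.py`.

```python
import math
from statistics import mean, stdev

def ci95(values):
    """Mean and 95% CI half-width over a small sample of seeds.
    The t critical value 2.776 is hard-coded for df = 4 (5 seeds)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    half = 2.776 * s / math.sqrt(n)
    return m, half

# Hypothetical per-seed Trust Index values (5 seeds)
trust_index_by_seed = [0.60, 0.61, 0.59, 0.62, 0.61]
m, half = ci95(trust_index_by_seed)
print(f"Trust Index: {m:.4f} ± {half:.4f} (95% CI)")
```

The same helper applies to each metric column (accuracy, FPR parity, etc.) to produce the mean ± CI values reported in the tables.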
Explore the full accuracy-fairness-privacy trade-off surface:
```shell
python -c "
from src.training.pareto_trainer import ParetoTrainer
from src.utils.edge_iiot_loader import EdgeIIoTLoader

loader = EdgeIIoTLoader('data/real_iot/edge_iiot.csv')
X_train, X_test, y_train, y_test, groups = loader.load()

trainer = ParetoTrainer(X_train, y_train, groups, seed=42)
trainer.search(n_objectives=3, n_runs=25)  # Explore 25 configurations
trainer.plot_pareto('figures/pareto_custom.png')
"
```

Dataset: Edge-IIoTset (150K samples, protocol-type protected groups)
Baselines: Unregulated model, Joint DP+Fair
Seeds: 5 independent runs
Metrics: Accuracy, Fairness (FPR Parity), Clarity, Privacy, Accountability, Trust Index
| Metric | Baseline | EAGF | Δ |
|---|---|---|---|
| Accuracy | 0.6481 ± 0.0251 | 0.6650 ± 0.0079 | +0.0168 (+2.6%) |
| FPR Parity | 0.4931 ± 0.0849 | 0.7709 ± 0.0573 | +0.2779 (+56.4%) |
| Clarity | 0.6918 ± 0.0432 | 0.7390 ± 0.0548 | +0.0472 (+6.8%) |
| Privacy | 0.2475 ± 0.0030 | 0.2482 ± 0.0025 | +0.0007 (+0.3%, preserved) |
| Accountability | 0.0000 ± 0.0000 | 0.6667 ± 0.0000 | +0.6667 (full coverage) |
| Trust Index | 0.3581 ± 0.0129 | 0.6062 ± 0.0108 | +0.2481 (+69.3%) |
- Fairness via FPR Parity: EAGF improves false-positive-rate parity by +56.4% across protocol-type groups (web, IoT MQTT, misc); the spread in false-alarm rates across groups shrinks from 49.3% to 23%.
- Trust Index: the composite governance metric improves by +69.3%, indicating strong multi-objective alignment.
- Privacy Preserved: differential privacy (ε=2.4) is maintained with negligible change vs. the baseline; no privacy regression.
- Minimal System Overhead: inference latency +0.2 ms/sample (~11% increase); memory +5.8 MB. Suitable for edge deployment.
- Calibration Stability: ECE and Brier scores remain comparable (within ±0.05), with no sign of metric gaming.
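As a rough sketch of how an FPR-parity score can be computed, the function below takes the min/max ratio of per-group false positive rates, so 1.0 means perfect parity; the group labels and this exact formula are illustrative assumptions, not necessarily what `src/metrics/fairness.py` implements.

```python
from collections import defaultdict

def fpr_parity(y_true, y_pred, groups):
    """Min/max ratio of per-group false positive rates (1.0 = parity)."""
    fp = defaultdict(int)   # false positives per group
    neg = defaultdict(int)  # actual negatives per group
    for yt, yp, g in zip(y_true, y_pred, groups):
        if yt == 0:               # only actual negatives can yield FPs
            neg[g] += 1
            if yp == 1:
                fp[g] += 1
    rates = [fp[g] / neg[g] for g in neg if neg[g] > 0]
    return min(rates) / max(rates) if max(rates) > 0 else 1.0

# Hypothetical predictions over three protocol-type groups
y_true = [0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 1, 1, 0]
groups = ["web", "web", "mqtt", "mqtt", "misc", "misc"]
print(fpr_parity(y_true, y_pred, groups))  # → 0.5 (FPRs 0.5, 1.0, 0.5)
```

A score of 0.4931 (baseline) vs. 0.7709 (EAGF) under such a ratio means the worst-treated group's false-alarm rate moved much closer to the best-treated group's.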
```
┌──────────────────────────────────────────────────────────┐
│                      EAGF Framework                      │
├──────────┬──────────┬──────────┬─────────────────────────┤
│ Clarity  │ Fairness │ Privacy  │ Accountability          │
├──────────┼──────────┼──────────┼─────────────────────────┤
│ SHAP     │ FPRP     │ DP-SGD   │ Audit Logging + Rules   │
│ Loss     │ Loss     │ Gradient │ Compliance Coverage     │
└──────────┴──────────┴──────────┴─────────────────────────┘
                            │
                            ↓
                     Trust Index (TI)
               Weighted Aggregation via AHP
```
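The final aggregation can be sketched as a weighted sum of pillar scores in [0, 1]; the weights below are illustrative placeholders, not the AHP-derived weights the repo actually computes in `src/utils/ahp.py`.

```python
# Illustrative pillar weights; EAGF derives real weights from AHP
# pairwise comparisons (see src/utils/ahp.py).
PILLAR_WEIGHTS = {
    "fairness": 0.30,
    "privacy": 0.25,
    "clarity": 0.20,
    "accountability": 0.25,
}

def trust_index(scores: dict) -> float:
    """Weighted sum of normalized pillar scores; weights sum to 1."""
    assert abs(sum(PILLAR_WEIGHTS.values()) - 1.0) < 1e-9
    return sum(PILLAR_WEIGHTS[p] * scores[p] for p in PILLAR_WEIGHTS)

# Hypothetical pillar scores for one trained model
scores = {"fairness": 0.77, "privacy": 0.25,
          "clarity": 0.74, "accountability": 0.67}
print(round(trust_index(scores), 4))  # → 0.609
```

Because the weights sum to 1 and each pillar score lies in [0, 1], the Trust Index is itself bounded in [0, 1], which is what makes cross-model comparison (e.g. 0.3581 vs. 0.6062) meaningful.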
| Pillar | Metric | Implementation | Config |
|---|---|---|---|
| Fairness | FPR Parity | `src/metrics/fairness.py` | `lambda_rp: 0.2` |
| Privacy | DP Accounting | `src/metrics/privacy.py` | `dp_epsilon: 2.4` |
| Clarity | SHAP Sparsity | `src/metrics/clarity.py` | `lambda_c: 0.05` |
| Accountability | Audit Coverage | `src/metrics/accountability.py` | Compliance rules |
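As a sketch of how these pillar weights could enter a joint training objective: the function below and its penalty inputs are illustrative assumptions, not the repo's actual `eagf_trainer.py`. Note that privacy is enforced during optimization by DP-SGD (per-sample gradient clipping plus Gaussian noise), not as a loss term.

```python
# Hypothetical joint objective: task loss plus weighted governance
# penalties, using the lambda values from the config table above.
# Privacy is NOT a loss term; in EAGF it enters via DP-SGD.
def total_loss(task_loss, fairness_gap, shap_density,
               lambda_rp=0.2, lambda_c=0.05):
    """fairness_gap: FPR disparity across protected groups (lower = fairer).
    shap_density: how spread-out SHAP attributions are (lower = clearer)."""
    return task_loss + lambda_rp * fairness_gap + lambda_c * shap_density

# Made-up values for illustration
print(total_loss(task_loss=0.42, fairness_gap=0.23, shap_density=0.6))
```

The small lambdas reflect the Pareto trade-off: governance penalties steer training without overwhelming the primary anomaly-detection objective.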
- Fixed seeds: 42-46 (5 independent runs)
- Deterministic pipeline: reproducible to ±0.005 variance (NumPy/PyTorch seeds)
- No hidden preprocessing: all transformations logged via `audit_logger.py`
- Hyperparameter justification: see `configs/reiot_real.yaml`
All experiments must satisfy:
- FPR Parity: EAGF ≥ Baseline + 0.02 (fairness improvement)
- Privacy: EAGF ≥ Baseline (no regression)
- Accuracy drop ≤ 2% (stability requirement)
- Trust Index: EAGF > Baseline (overall improvement)
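These gates can be checked mechanically. Below is a sketch, assuming the metrics arrive as plain dicts and reading the 2% accuracy gate as absolute percentage points (an interpretation, not a confirmed detail); the sample values are the reported means from the results table.

```python
def check_gates(baseline: dict, eagf: dict) -> list:
    """Return the list of failed validation gates (empty = all pass)."""
    failures = []
    if not eagf["fpr_parity"] >= baseline["fpr_parity"] + 0.02:
        failures.append("fairness improvement")
    if not eagf["privacy"] >= baseline["privacy"]:
        failures.append("privacy regression")
    # Accuracy gate read as <= 0.02 absolute drop (assumption).
    if not baseline["accuracy"] - eagf["accuracy"] <= 0.02:
        failures.append("accuracy drop > 2%")
    if not eagf["trust_index"] > baseline["trust_index"]:
        failures.append("trust index")
    return failures

# Reported mean metrics from the results table
baseline = {"fpr_parity": 0.4931, "privacy": 0.2475,
            "accuracy": 0.6481, "trust_index": 0.3581}
eagf = {"fpr_parity": 0.7709, "privacy": 0.2482,
        "accuracy": 0.6650, "trust_index": 0.6062}
print(check_gates(baseline, eagf))  # → [] (all gates pass)
```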
| File | Purpose |
|---|---|
| docs/metric_definitions.md | Detailed fairness, privacy, clarity, accountability metrics |
| docs/regulatory_mapping.md | GDPR, CCPA, ISO alignment |
| docs/reproducibility.md | Step-by-step reproducibility guide |
- Fairness: Hardt et al. (2016), Moritz et al. (2020)
- Privacy: Abadi et al. (2016) DP-SGD, Kairouz et al. (2021) DP survey
- Explainability: Lundberg & Lee (2017) SHAP
- IoT Security: Ferrag et al. (2022) Edge-IIoTset
MIT License. See LICENSE for full terms.
Contributions welcome! See CONTRIBUTING.md for guidelines.
For reproducibility help, issues, or questions:
- Check diagnostics: see `results/final_report.txt` for detailed error logs
- Verify dataset: ensure `data/real_iot/edge_iiot.csv` exists (~78 MB)
- Check environment: `python -m pip show scikit-learn fairlearn numpy pandas`
- Open an issue: include OS, Python version, the full error trace, and reproduction steps
Last Updated: March 2026 | Python 3.9+ | PyTorch 2.0+