IoUT Interrogator Framework is a trust-aware IoUT anomaly inference pipeline with leakage-safe evaluation, class-imbalance controls, and deterministic multi-seed reporting on both synthetic and real network telemetry.
- Highlights
- Visual Overview
- Datasets
- Experimental Setup
- Final Results
- Reproducibility (Strict, Copy-Paste)
- Installation
- Expected Outputs
- Configuration
- Repository Structure
- Citation
- License
- Leakage-safe protocol: splitting, scaling, and threshold calibration use only the train and validation sets; the test set is never used for fitting or tuning.
- Real-data robustness: UNSW-NB15 class-imbalance handling with weighted loss, weighted sampling, and balanced-recall thresholding.
- Reproducible statistics: 20-seed evaluation with mean and standard deviation reporting.
- Reviewer-ready outputs: final summary tables, split checks, confusion matrix, and publication-style report artifacts.
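
The weighted-sampling idea above can be illustrated with a minimal, standard-library sketch. It assumes that "inverse-frequency exponent" means each sample is weighted by `(1 / class_count) ** exponent`; the function name `inverse_frequency_weights`, the exponent values, and the toy labels are hypothetical and are not taken from the repository.

```python
from collections import Counter

def inverse_frequency_weights(labels, exponent=1.0):
    """Per-sample weights proportional to (1 / class_count) ** exponent.

    exponent=1.0 gives every class the same total sampling mass;
    smaller exponents soften the rebalancing. (Assumed interpretation.)
    """
    counts = Counter(labels)
    class_weight = {cls: (1.0 / n) ** exponent for cls, n in counts.items()}
    total = sum(class_weight[y] for y in labels)
    # Normalize to sum to 1; samplers generally only need the ratios.
    return [class_weight[y] / total for y in labels]

labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # illustrative 80/20 imbalance
full = inverse_frequency_weights(labels, exponent=1.0)
soft = inverse_frequency_weights(labels, exponent=0.5)

# Share of sampling probability that lands on the minority class.
minority_mass_full = sum(w for w, y in zip(full, labels) if y == 1)
minority_mass_soft = sum(w for w, y in zip(soft, labels) if y == 1)
print(round(minority_mass_full, 4), round(minority_mass_soft, 4))
```

Weights of this shape could be handed to `torch.utils.data.WeightedRandomSampler(weights, num_samples, replacement=True)`; whether the pipeline builds its sampler this way is an assumption.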
flowchart LR
A[Raw Data] --> B[Leakage-Safe Split\nTrain / Val / Test]
B --> C[Train-Only Normalization]
C --> D[Model Training\nWeighted Loss + Weighted Sampler]
D --> E[Validation Threshold Sweep\nBalanced Recall Objective]
E --> F[Test Evaluation]
F --> G[20-Seed Aggregation\nFinal Metrics + Reports]
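
The first two stages of the flow (split, then train-only normalization) can be sketched as follows. This is a standard-library illustration under stated assumptions, not the repository's code: it uses a plain seeded random split (the source does not say whether the split is stratified), and `split_indices`, `fit_scaler`, `apply_scaler`, and the toy data are hypothetical.

```python
import random
import statistics

def split_indices(n, seed, train_frac=0.70, val_frac=0.15):
    """Shuffle indices with a fixed seed and cut them into train/val/test."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

def fit_scaler(rows):
    """Compute per-column mean and std from the given rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]  # guard constant columns
    return means, stds

def apply_scaler(rows, means, stds):
    """Standardize rows with statistics that were fitted elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Toy two-feature dataset, for illustration only.
rng = random.Random(0)
data = [[rng.gauss(5.0, 2.0), rng.gauss(-1.0, 0.5)] for _ in range(200)]

train_idx, val_idx, test_idx = split_indices(len(data), seed=42)

# Statistics come from the training rows only, then are reused unchanged.
means, stds = fit_scaler([data[i] for i in train_idx])
train_x = apply_scaler([data[i] for i in train_idx], means, stds)
val_x = apply_scaler([data[i] for i in val_idx], means, stds)
test_x = apply_scaler([data[i] for i in test_idx], means, stds)
```

Fitting the scaler before splitting, or on all rows, would let test-set statistics influence training, which is the leakage the protocol is meant to rule out.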
- Synthetic dataset purpose: controlled benchmarking across architectures and baselines.
- Synthetic pipeline target: multi-model 20-seed robustness summary.
The UNSW-NB15 dataset is not included in this repository due to licensing restrictions.
Download it from the official source: https://research.unsw.edu.au/projects/unsw-nb15-dataset
After downloading, place the files in:
data/raw/unsw_nb15/
Expected files:
- UNSW_NB15_training-set.csv
- UNSW_NB15_testing-set.csv
- Seeds: 42-61 (20 runs), fixed across all experiments for reproducibility
- Splits: train 70%, validation 15%, test 15%
- Threshold tuning: validation-only sweep over 0.45 to 0.75 using balanced recall
- Imbalance controls:
  - alpha-scaled BCEWithLogitsLoss (alpha = 0.7)
  - weighted sampler with inverse-frequency exponent
- Key constraints enforced:
  - no test-time tuning
  - no leakage
  - no synthetic oversampling on real data
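
The validation-only threshold sweep described above can be sketched as follows. This is an illustration under assumptions: "balanced recall" is taken to mean the unweighted mean of per-class recall, the 0.01 step size is not stated in the source, and the function names and toy validation scores are hypothetical.

```python
def balanced_recall(y_true, y_pred):
    """Mean of recall on class 1 and recall on class 0.

    Assumes both classes are present in y_true.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)

def sweep_threshold(y_true, scores, lo=0.45, hi=0.75, step=0.01):
    """Return the first threshold in [lo, hi] that maximizes balanced recall."""
    n_steps = round((hi - lo) / step)
    best_threshold, best_score = lo, -1.0
    for k in range(n_steps + 1):
        threshold = round(lo + k * step, 10)  # avoid float drift in the grid
        preds = [1 if s >= threshold else 0 for s in scores]
        score = balanced_recall(y_true, preds)
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Toy validation labels and predicted probabilities, for illustration only.
val_y = [0, 0, 0, 0, 1, 1, 1, 1]
val_scores = [0.20, 0.40, 0.50, 0.62, 0.58, 0.70, 0.80, 0.90]

threshold, score = sweep_threshold(val_y, val_scores)
print(threshold, score)
```

The chosen threshold would then be frozen and applied once to the test set, which keeps the "no test-time tuning" constraint intact.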
Only final artifacts are reported below.
Source: results/synthetic_final/summary.csv
| Model | F1 (mean +/- std) | ROC-AUC (mean +/- std) | PR-AUC (mean +/- std) | Balanced Accuracy (mean +/- std) |
|---|---|---|---|---|
| hybrid_temporal | 0.7851 +/- 0.0594 | 0.9637 +/- 0.0089 | 0.8851 +/- 0.0199 | 0.8514 +/- 0.0437 |
| random_forest | 0.7081 +/- 0.0404 | 0.9003 +/- 0.0124 | 0.8088 +/- 0.0222 | 0.7870 +/- 0.0257 |
| logistic_regression | 0.6667 +/- 0.0000 | 0.8638 +/- 0.0000 | 0.7572 +/- 0.0000 | 0.7758 +/- 0.0000 |
| lstm | 0.6444 +/- 0.0403 | 0.8199 +/- 0.0422 | 0.6973 +/- 0.0345 | 0.7513 +/- 0.0216 |
Source: results/unsw_final_balanced/summary.csv
| Metric | Mean +/- Std |
|---|---|
| F1 | 0.8910 +/- 0.0026 |
| ROC-AUC | 0.9251 +/- 0.0199 |
| PR-AUC | 0.9254 +/- 0.0298 |
| Balanced Accuracy | 0.8397 +/- 0.0041 |
| Recall (Class 0) | 0.6961 +/- 0.0075 |
| Recall (Class 1) | 0.9834 +/- 0.0022 |
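
The `mean +/- std` format used in the tables above can be reproduced from per-seed metrics with a short sketch. The per-seed values below are synthetic placeholders generated for illustration, not results from this project, and the choice of sample standard deviation (n - 1 denominator) is an assumption; the helper names are hypothetical.

```python
import statistics

def parse_seed_range(spec):
    """Expand an inclusive range such as '42-61' into a list of seeds."""
    start, end = (int(part) for part in spec.split("-"))
    return list(range(start, end + 1))

def summarize(values, sample=True):
    """Format values as 'mean +/- std' with four decimals."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values) if sample else statistics.pstdev(values)
    return f"{mean:.4f} +/- {std:.4f}"

seeds = parse_seed_range("42-61")

# Placeholder per-seed F1 scores (NOT real results), built deterministically.
f1_by_seed = {seed: 0.89 + 0.001 * ((seed % 5) - 2) for seed in seeds}

summary = summarize(list(f1_by_seed.values()))
print(len(seeds), summary)
```

As a quick consistency check on the UNSW-NB15 table, the mean of the two reported class recalls, (0.6961 + 0.9834) / 2, is approximately 0.8397, which matches the reported balanced accuracy.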
- OS: Linux, macOS, or Windows (PowerShell supported)
- Python: 3.10+
- Optional: CUDA-enabled GPU (training also works on CPU)
git clone https://github.com/aliakarma/IoUT-Interrogator-Framework.git
cd IoUT-Interrogator-Framework
python -m venv .venv

Windows PowerShell:
.\.venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r requirements.txt

Linux/macOS:
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Place UNSW-NB15 CSV files in:
data/raw/unsw_nb15/Training and Testing Sets/
python scripts/run_multi_seed_experiments.py \
  --dataset synthetic \
  --seeds 42-61

python run_unsw_publication_pipeline.py --seeds 42-61

Quick verification (one-line):
python run_unsw_publication_pipeline.py --quick-test

python -c "import pandas as pd; print(pd.read_csv('results/synthetic_final/summary.csv'))"
python -c "import pandas as pd; print(pd.read_csv('results/unsw_final_balanced/summary.csv'))"
python -c "import json; print(json.load(open('results/unsw_final_balanced/validation_checks.json')))"

Dependency Notes
- Core stack: PyTorch, NumPy, pandas, scikit-learn, SciPy, matplotlib.
- Install from: requirements.txt
- For CUDA, install the CUDA-compatible PyTorch build for your platform, then run the same commands above.
results/
├── synthetic_final/
└── unsw_final_balanced/
Primary entry points:
- run_pipeline.py
- run_unsw_publication_pipeline.py

- Primary config file: configs/default.yaml
- Main configurable groups:
  - data: dataset source/path, split strategy, loader settings
  - model: architecture type and dimensions
  - training: epochs, learning rate, loss settings, seed
  - evaluation: threshold, tuning metric, confusion-matrix export
IoUT-Interrogator-Framework/
├── configs/ # Experiment and model configuration files
├── data/ # Data loaders, adapters, and dataset docs
├── docs/ # Methodology, changelog, and reproducibility notes
├── scripts/ # Reproducible experiment entry points
├── results/
│ ├── synthetic_final/ # Final synthetic benchmark outputs
│ └── unsw_final_balanced/ # Final real-data (UNSW) outputs
├── models/ # Model architecture implementations
├── training/ # Training loop and optimization logic
├── evaluation/ # Metrics, threshold tuning, evaluation flow
├── simulation/ # Simulation utilities and configs
├── blockchain/ # Optional blockchain integration components
├── tests/ # Automated validation tests
├── run_pipeline.py # Main pipeline entry point
└── run_unsw_publication_pipeline.py # Real-data publication pipeline entry
Use CITATION.cff when available, or the placeholder below:
@misc{iout_interrogator_framework,
title = {IoUT Interrogator Framework: Trust-Aware IoUT Anomaly Inference},
author = {Akarma, Ali and contributors},
year = {2026},
howpublished = {GitHub repository},
note = {Reproducible 20-seed synthetic and UNSW-NB15 evaluations}
}

This project is released under the MIT License. See LICENSE.