Experimental Study of Label-Dependency-Exploiting Perturbations (CVPR 2024)
Multi-label image classifiers assign multiple labels to a single image and, in doing so, implicitly learn label co-occurrence patterns from the training data. A model trained on MS-COCO, for example, discovers that person and bicycle frequently appear together, while airplane and dining table rarely do. These learned semantic dependencies improve classification accuracy, but they also open a new attack surface: an adversary who understands the label dependency graph can craft perturbations that exploit inter-label correlations to flip predictions far more effectively than label-agnostic attacks.
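The co-occurrence statistics described above are easy to estimate directly from the annotation matrix. The sketch below uses a tiny hypothetical 4-label subset (the label names and counts are illustrative, not drawn from MS-COCO) to show the quantity a semantic attack can exploit: the conditional probability of one label given another.

```python
import numpy as np

# Toy multi-label annotations: rows = images, columns = labels.
# Hypothetical 4-label subset for illustration only.
labels = ["person", "bicycle", "airplane", "dining_table"]
Y = np.array([
    [1, 1, 0, 0],   # person + bicycle
    [1, 1, 0, 0],
    [1, 0, 0, 1],   # person + dining_table
    [0, 0, 1, 0],   # airplane alone
    [1, 1, 0, 0],
], dtype=float)

# Raw co-occurrence counts: C[i, j] = number of images containing both i and j.
C = Y.T @ Y

# Conditional co-occurrence P(j | i): how likely label j is, given label i.
# The diagonal of C holds per-label counts; clip avoids division by zero.
P = C / np.clip(C.diagonal()[:, None], 1, None)

print(P[labels.index("person"), labels.index("bicycle")])        # 0.75
print(P[labels.index("airplane"), labels.index("dining_table")]) # 0.0
```

A strongly asymmetric `P` (e.g. bicycle almost always implies person, but not the reverse) is exactly the kind of structure a dependency-aware perturbation can target.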
This repository reproduces and extends the study from Semantic-Aware Multi-Label Adversarial Attacks (Mahmood et al., CVPR 2024), which demonstrated that semantic-aware perturbations consistently outperform standard FGSM and PGD attacks on multi-label benchmarks.
The study addresses four research questions:

- **How do semantic label correlations affect attack success?** Do strongly correlated label pairs suffer disproportionately larger mAP drops under adversarial perturbation?
- **What is the effectiveness gap between single-label and multi-label attacks?** How much additional damage does exploiting label co-occurrence yield compared with treating each label independently?
- **Can label dependency graphs be leveraged defensively?** If the model's vulnerability follows the structure of its learned label graph, can we use graph-aware regularization to harden it?
- **Which label pairs are most vulnerable to semantic attacks?** Is vulnerability predictable from co-occurrence statistics alone, or do model-specific feature entanglements also play a role?
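To make the third question concrete, one possible form of a graph-aware regularizer (a sketch of the general idea, not the method evaluated in this study) penalizes logit disagreement between labels that strongly co-occur, using the Laplacian of a hypothetical co-occurrence graph:

```python
import numpy as np

# Hypothetical symmetric label co-occurrence adjacency for 4 labels;
# a strong edge between labels 0 and 1 (e.g. person-bicycle).
A = np.array([
    [0.0, 0.9, 0.0, 0.1],
    [0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0],
])

def graph_regularizer(logits, A):
    """Penalize logit disagreement between co-occurring labels:
    z^T L z = 1/2 * sum_ij A[i,j] * (z_i - z_j)^2, with L = D - A."""
    L = np.diag(A.sum(axis=1)) - A  # graph Laplacian
    return float(logits @ L @ logits)

# Logits that split a strongly linked pair are penalized heavily...
print(graph_regularizer(np.array([3.0, -3.0, 0.0, 0.0]), A))
# ...while logits that keep the pair together are barely penalized.
print(graph_regularizer(np.array([3.0, 3.0, 0.0, 0.0]), A))
```

Adding such a term to the training loss would push the model to treat correlated labels consistently, which is one way a learned label graph could be turned from an attack surface into a defense.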
Key findings:

- Semantic-aware attacks reduce mAP 6--8 points more than PGD at the same epsilon budget (epsilon = 8/255), confirming the advantage of exploiting label dependencies.
- Strongly co-occurring label pairs are 2.3x more vulnerable to targeted flipping than weakly correlated pairs, with the person--bicycle and chair--dining-table pairs being the most exploitable on VOC 2012.
- Adversarial examples crafted with semantic awareness transfer better across architectures (ResNet-101 to TResNet-L transfer rate of 61 % vs. 43 % for PGD), suggesting that the exploited correlations are dataset-level rather than model-level.
- Input-transformation defenses are less effective against semantic attacks: JPEG compression recovers only 3.1 mAP points against semantic perturbations compared with 7.8 points against PGD.
- Graph-aware adversarial training narrows the gap: augmenting training with semantic adversarial examples raises post-attack mAP from 24.8 to 34.2 on ResNet-101 / VOC, recovering roughly 15 % of the lost performance.
Experimental setup:

| Component | Options |
|---|---|
| Models | ResNet-101, TResNet-L, ML-Decoder |
| Datasets | PASCAL VOC 2012, MS-COCO 2014 |
| Attack types | FGSM, PGD (20 steps), Semantic-Aware (Mahmood) |
| Epsilon | 4/255, 8/255, 16/255 |
| Metrics | mAP, per-label precision/recall, attack success rate |
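The PGD baseline in the setup above can be illustrated with a minimal multi-label variant. This is a toy sketch against a linear classifier with per-label sigmoid + binary cross-entropy, not the repository's implementation; the semantic-aware attack would additionally reweight the per-label gradient using the co-occurrence graph.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_multilabel(x0, y, W, eps=8/255, alpha=2/255, steps=20):
    """Untargeted L_inf PGD against a toy linear multi-label model z = W @ x
    with per-label sigmoid + BCE loss (a sketch of the label-agnostic baseline)."""
    x = x0.copy()
    for _ in range(steps):
        p = sigmoid(W @ x)
        grad = W.T @ (p - y)            # d(BCE)/dx for the linear model
        x = x + alpha * np.sign(grad)   # ascend the loss
        x = np.clip(x, x0 - eps, x0 + eps)  # project into the eps-ball
        x = np.clip(x, 0.0, 1.0)            # stay a valid image
    return x

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))    # 3 labels, 16-dimensional "image"
x0 = rng.uniform(size=16)
y = np.array([1.0, 0.0, 1.0])   # ground-truth label vector
x_adv = pgd_multilabel(x0, y, W)
# The projection guarantees the perturbation stays within the epsilon budget:
print(np.max(np.abs(x_adv - x0)) <= 8/255 + 1e-9)
```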
mAP under each attack (epsilon = 8/255):

| Model | Dataset | Clean mAP | FGSM | PGD | Semantic-Aware |
|---|---|---|---|---|---|
| ResNet-101 | VOC | 89.4 | 52.1 | 31.2 | 24.8 |
| TResNet-L | VOC | 91.2 | 55.6 | 34.8 | 27.1 |
| ResNet-101 | COCO | 78.3 | 45.2 | 28.4 | 21.6 |
| TResNet-L | COCO | 80.1 | 48.7 | 30.9 | 23.4 |
| ML-Decoder | VOC | 92.0 | 57.3 | 36.5 | 29.0 |
| ML-Decoder | COCO | 82.4 | 50.1 | 32.7 | 25.3 |
Semantic-aware attack transfer rates (%), source model to target model:

| Source \ Target | ResNet-101 | TResNet-L | ML-Decoder |
|---|---|---|---|
| ResNet-101 | -- | 61.2 | 54.8 |
| TResNet-L | 58.7 | -- | 57.3 |
| ML-Decoder | 52.1 | 55.6 | -- |
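One common definition of the transfer rate reported above (whether the study conditions on source-model success is an assumption here) is: among adversarial examples that fool the source model, the fraction that also fool the target model.

```python
import numpy as np

def transfer_rate(success_source, success_target):
    """Fraction of source-model attack successes that also succeed
    on the target model (one common definition of transfer rate)."""
    success_source = np.asarray(success_source, dtype=bool)
    success_target = np.asarray(success_target, dtype=bool)
    fooled = success_source.sum()
    if fooled == 0:
        return 0.0
    return float((success_source & success_target).sum() / fooled)

# Hypothetical per-example outcomes for 8 adversarial images.
src = [1, 1, 1, 1, 0, 1, 0, 1]   # fooled the source model?
tgt = [1, 0, 1, 1, 0, 0, 0, 1]   # fooled the target model?
print(transfer_rate(src, tgt))   # 4 of 6 source successes transfer
```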
Quick start:

```bash
# Clone
git clone https://github.com/etarubinga/semantic-adversarial-attacks-study.git
cd semantic-adversarial-attacks-study

# Environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Run label co-occurrence evaluation
python evaluation/label_cooccurrence.py \
    --dataset voc \
    --annotation_path data/VOCdevkit/VOC2012/Annotations \
    --output_dir results/figures

# Run attack evaluation (epsilon 0.031 corresponds to 8/255)
python evaluation/attack_effectiveness.py \
    --model resnet101 \
    --attack_type semantic \
    --epsilon 0.031 \
    --dataset voc

# Run transferability study
python evaluation/transferability_study.py \
    --source_model resnet101 \
    --target_models tresnet_l ml_decoder \
    --dataset voc

# Run defense evaluation
python evaluation/defense_evaluation.py \
    --model resnet101 \
    --dataset voc \
    --defense_type adversarial_training
```

Repository layout:

```
semantic-adversarial-attacks-study/
├── README.md
├── LICENSE
├── requirements.txt
├── .gitignore
├── configs/
│   ├── voc_multilabel.yaml
│   └── coco_multilabel.yaml
├── evaluation/
│   ├── label_cooccurrence.py
│   ├── attack_effectiveness.py
│   ├── transferability_study.py
│   └── defense_evaluation.py
├── visualizations/
│   ├── label_dependency_graph.py
│   ├── perturbation_visualization.py
│   ├── attack_success_heatmap.py
│   └── confidence_shift_plots.py
├── notebooks/
│   └── attack_exploration.py
├── docs/
│   └── STUDY_NOTES.md
└── results/
    ├── tables/
    └── figures/
```
This study is based on the work of Mahmood et al.:

> *Semantic-Aware Multi-Label Adversarial Attacks*. Mahmood et al., CVPR 2024. Official code: `SemanticMLLAttacks`.
MIT License -- see LICENSE for details.