lucasdavid/PNOC

P-NOC: Adversarial CAM Generation for Weakly Supervised Semantic Segmentation

JVCI arXiv License

Introduction

This repository contains the official implementation of the paper "P-NOC: Adversarial Training of CAM Generating Networks for Robust Weakly Supervised Semantic Segmentation Priors".

Diagram for the proposed training method P-NOC.

In summary, P-NOC is trained by alternately optimizing two objectives:

$$\begin{align} \mathcal{L}_f &= \mathbb{E}_{(x,y)\sim\mathcal{D},r\sim y}[\mathcal{L}_\text{P} + \lambda_\text{cse}\ell_\text{cls}(p^\text{oc}, y\setminus\{r\})] \\\ \mathcal{L}_\text{noc} &= \mathbb{E}_{(x,y)\sim\mathcal{D},r\sim y}[\lambda_\text{noc}\ell_\text{cls}(p^\text{noc}, y)] \end{align}$$

where $\mathcal{L}_\text{P}$ is the Puzzle-CAM regularization and $p^\text{noc} = oc(x \circ (1 - \psi(A^r) > \delta_\text{noc}))$.
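The masking term $x \circ (1 - \psi(A^r) > \delta_\text{noc})$ can be illustrated with a minimal sketch. It assumes that $\psi$ is a sigmoid and uses a hypothetical threshold of 0.4; neither is stated in this README. The helper names `noc_mask` and `apply_mask` are illustrative and do not come from the repository. Plain Python lists keep the example dependency-free; real code would operate on tensors.

```python
import math


def sigmoid(value):
    """Logistic function, used here as an ASSUMED choice for psi."""
    return 1.0 / (1.0 + math.exp(-value))


def noc_mask(cam, delta_noc):
    """Return a binary mask that is 1 where (1 - psi(A^r)) > delta_noc, else 0."""
    return [
        [1.0 if (1.0 - sigmoid(a)) > delta_noc else 0.0 for a in row]
        for row in cam
    ]


def apply_mask(image, mask):
    """Element-wise product of a single-channel image with the mask."""
    return [
        [pixel * keep for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Toy 2x2 CAM for the sampled class r: high activations mark the object.
cam = [[4.0, -3.0], [0.5, -6.0]]
image = [[0.9, 0.2], [0.7, 0.4]]

mask = noc_mask(cam, delta_noc=0.4)  # 0.4 is a hypothetical threshold
masked = apply_mask(image, mask)

print(mask)    # [[0.0, 1.0], [0.0, 1.0]]
print(masked)  # [[0.0, 0.2], [0.0, 0.4]]
```

Pixels with a strong activation for class $r$ are erased, and the masked image is what the ordinary classifier $oc$ receives to produce $p^\text{noc}$.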

Results

Pascal VOC 2012 (test)

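| Method | bg | a.plane | bike | bird | boat | bottle | bus | car | cat | chair | cow | d.table | dog | horse | m.bike | person | p.plant | sheep | sofa | train | tv | Overall |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| P-OC | 91.6 | 86.7 | 38.3 | 89.3 | 61.1 | 74.8 | 92.0 | 86.6 | 89.9 | 20.5 | 85.8 | 57.0 | 90.2 | 83.5 | 83.4 | 80.8 | 68.0 | 87.0 | 47.1 | 62.8 | 43.1 | 72.4 |
| P-NOC | 91.7 | 87.9 | 38.1 | 80.9 | 66.1 | 69.8 | 93.8 | 86.4 | 93.2 | 37.4 | 83.6 | 60.9 | 92.3 | 84.7 | 83.8 | 80.5 | 62.3 | 81.9 | 53.1 | 77.7 | 36.7 | 73.5 |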
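The Overall column appears to be the unweighted mean of the 21 per-class IoU values (background plus 20 object classes). The short check below recomputes it from the numbers reported for Pascal VOC 2012 (test); the function name `mean_iou` is illustrative and not part of the repository.

```python
# Per-class IoU (%) copied from the Pascal VOC 2012 (test) results: bg + 20 classes.
P_OC = [91.6, 86.7, 38.3, 89.3, 61.1, 74.8, 92.0, 86.6, 89.9, 20.5, 85.8,
        57.0, 90.2, 83.5, 83.4, 80.8, 68.0, 87.0, 47.1, 62.8, 43.1]
P_NOC = [91.7, 87.9, 38.1, 80.9, 66.1, 69.8, 93.8, 86.4, 93.2, 37.4, 83.6,
         60.9, 92.3, 84.7, 83.8, 80.5, 62.3, 81.9, 53.1, 77.7, 36.7]


def mean_iou(per_class):
    """Unweighted mean of the per-class IoU values."""
    return sum(per_class) / len(per_class)


for name, values in (("P-OC", P_OC), ("P-NOC", P_NOC)):
    print(f"{name}: {mean_iou(values):.1f}")
# P-OC: 72.4
# P-NOC: 73.5
```

Both recomputed means match the reported Overall scores after rounding to one decimal place.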

MS COCO 2014 (val)

Method bg person bicycle car motorcycle airplane bus train truck boat traffic light fire hydrant stop sign parking meter bench bird cat dog horse sheep cow elephant bear zebra giraffe backpack umbrella handbag tie suitcase frisbee skis snowboard sports ball kite baseball bat baseball glove skateboard surfboard tennis racket bottle wine glass cup fork knife spoon bowl banana apple sandwich orange broccoli carrot hot dog pizza donut cake chair couch potted plant bed dining table toilet tv laptop mouse remote keyboard cell phone microwave oven toaster sink refrigerator book clock vase scissors teddy bear hair drier toothbrush Overall
P-NOC 81.8 55.1 55.3 47.4 70.3 56.3 76.8 68.4 54.6 49.0 46.6 77.4 74.4 71.5 40.4 62.3 76.5 76.1 68.1 75.3 78.5 80.6 85.0 80.7 73.6 28.0 63.3 14.4 15.5 54.1 50.4 8.2 42.7 54.5 46.3 19.1 14.2 26.5 34.9 20.0 40.0 42.7 36.2 23.2 27.8 17.3 16.6 62.9 53.3 46.4 62.1 41.1 28.4 55.1 62.7 66.4 54.3 25.2 34.3 25.4 44.5 13.7 65.1 40.7 55.9 23.2 30.0 60.1 65.5 46.4 36.2 36.5 34.4 27.7 37.9 25.3 35.8 54.1 71.8 29.1 37.3 47.7

Setup

Check the SETUP.md file for information regarding the setup of the Pascal VOC 2012 and MS COCO 2014 datasets.

Experiments

The scripts used for training P-NOC are available in the runners folder. They are generally run in the following order:

./runners/0-setup.sh
./runners/1-priors.sh
./runners/2-saliency.sh
./runners/3-rw.sh
./runners/4-segmentation.sh
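A small driver can run the stages above in sequence and stop at the first failure, which avoids starting a later stage on missing artifacts. This is only a sketch: it assumes each script can be invoked directly with `bash` and needs no extra arguments or environment variables, which this README does not state. The function `run_stages` and its `dry_run` flag are hypothetical helpers, not part of the repository.

```python
import subprocess

# Stage scripts, in the order listed in this README.
STAGES = [
    "./runners/0-setup.sh",
    "./runners/1-priors.sh",
    "./runners/2-saliency.sh",
    "./runners/3-rw.sh",
    "./runners/4-segmentation.sh",
]


def run_stages(stages, dry_run=False):
    """Run each stage in order; return the list of stages that were processed.

    With dry_run=True nothing is executed, which is useful for checking the order.
    """
    processed = []
    for script in stages:
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later stages never start after a failed one.
            subprocess.run(["bash", script], check=True)
        processed.append(script)
    return processed


# Dry run: only lists the stages without executing them.
for script in run_stages(STAGES, dry_run=True):
    print(script)
```

Calling `run_stages(STAGES)` without `dry_run` would execute the scripts for real, so it should be run from the repository root once the datasets are set up.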

Artifacts and Intermediate Results

Pascal VOC 2012

# Method Description Train set dCRF mIoU Links
CAMs
1 vanilla+ra+ls priors trainaug - 53.7% weights CAMs wdb/train wdb/eval
2 P-OC (OC+ra) priors trainaug - 61.5% weights CAMs wdb/train wdb/eval
3 P-OC+ls (OC+ra) priors trainaug - 61.9% weights CAMs wdb/train wdb/eval
4 P-NOC (OC+ra+ls) priors trainaug - 62.9% weights CAMs wdb/train wdb/eval
5 P-NOC+ls (OC+ra+ls) priors trainaug - 63.7% weights CAMs wdb/train wdb/eval
Saliency
6 C²AM-H (P-NOC+ls #5) saliency trainaug 67.9% weights saliency wdb/train wdb/eval
7 PoolNet (C²AM-H #6) saliency trainaug - 70.8% weights saliency wdb/train wdb/eval
Random Walk
8 AffinityNet (#5, #7) affinity trainaug - masks
9 AffinityNet (#5, #8) pseudo masks trainaug 75.5% weights masks wdb/train wdb/eval
Segmentation
10 DeepLabV3+ (Supervised) segmentation trainaug 80.6% weights masks wdb/train wdb/eval
11 DeepLabV3+ (P-OC #2) segmentation trainaug 71.4% weights masks wdb/train wdb/eval
12 DeepLabV3+ +ls (P-NOC+ls #7) segmentation trainaug 73.8% weights masks wdb/train wdb/eval

MS COCO 2014

# Method Description Train set dCRF mIoU (train) Link
CAMs
1 vanilla+ra priors train - - weights CAMs
2 vanilla+ra+ls priors train - 33.7% weights CAMs wdb/train wdb/eval
3 P-OC (OC+ra #1) priors train - 38.5% weights CAMs wdb/train wdb/eval
4 P-OC+ls (OC+ra+ls #2) priors train - 37.3% weights CAMs wdb/train wdb/eval
5 P-NOC (OC+ra #1) priors train - 40.7% weights CAMs wdb/train wdb/eval
6 P-NOC+ls (OC: RS269+ra) priors train - 38.2% weights CAMs wdb/train wdb/eval
Saliency
6 C²AM-H (P-NOC #5) saliency trainaug 70.5% weights saliency wdb/train wdb/eval
7 PoolNet (C²AM-H #7) saliency trainaug - 71.3% weights saliency wdb/train wdb/eval
Random Walk
8 AffinityNet (#5, #7) affinity train - masks
9 AffinityNet (#5, #7, #8) pseudo masks train 47.7% weights masks wdb/train wdb/eval
Segmentation
2 DeepLabV3+ (P-NOC #2) segmentation train - 44.6% weights masks wdb/train wdb/eval

Citation

If our work was helpful to you, please cite it as:

@article{david2024104187pnoc,
title = {P-NOC: Adversarial training of CAM generating networks for robust weakly supervised semantic segmentation priors},
journal = {Journal of Visual Communication and Image Representation},
volume = {102},
pages = {104187},
year = {2024},
issn = {1047-3203},
doi = {https://doi.org/10.1016/j.jvcir.2024.104187},
author = {Lucas David and Helio Pedrini and Zanoni Dias}
}

Acknowledgements

Much of the code here was borrowed from the jiwoon-ahn/psa, KAIST-vilab/OC-CSE, shjo-april/PuzzleCAM, and CVI-SZU/CCAM repositories. We thank the authors for their considerable contributions and efforts.
