Sherry1945/GCP_OPT

Towards A Deeper Understanding of Global Covariance Pooling in Deep Learning: An Optimization Perspective

This is the code for the TPAMI 2023 paper "Towards A Deeper Understanding of Global Covariance Pooling in Deep Learning: An Optimization Perspective", created by Qilong Wang, Zhaolin Zhang*, Mingze Gao*. The paper integrates and substantially extends our preliminary works published in CVPR 2020 ("What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective") and NeurIPS 2022 ("DropCov: A Simple yet Effective Method for Improving Deep Architectures").

Citation

@ARTICLE{wang2023towards,
  author={Wang, Qilong and Zhang, Zhaolin and Gao, Mingze and Xie, Jiangtao and Zhu, Pengfei and Li, Peihua and Zuo, Wangmeng and Hu, Qinghua},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={Towards a Deeper Understanding of Global Covariance Pooling in Deep Learning: An Optimization Perspective}, 
  year={2023},
  volume={45},
  number={12},
  pages={15802-15819},
}

Abstract

Global covariance pooling (GCP), as an effective alternative to global average pooling, has shown good capacity to improve deep convolutional neural networks (CNNs) in a variety of vision tasks. Despite its promising performance, how GCP (especially its post-normalization) works in deep learning remains an open problem. In this paper, we make an effort towards understanding the effect of GCP on deep learning from an optimization perspective. Specifically, we first analyze the behavior of GCP with matrix power normalization on the optimization loss and gradient computation of deep architectures. Our findings show that GCP can improve the Lipschitzness of the optimization loss and achieve flatter local minima, while improving gradient predictiveness and functioning as a special pre-conditioner on gradients. Then, we explore the effect of post-normalization on GCP from the model optimization perspective, which encourages us to propose a simple yet effective normalization, namely DropCov. Based on the above findings, we point out several merits of deep GCP that have not been recognized previously or fully explored, including faster convergence, stronger model robustness and better generalization across tasks. Extensive experimental results using both CNNs and vision transformers on diversified vision tasks provide strong support for our findings while verifying the effectiveness of our method.
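As a rough illustration of what "GCP with matrix power normalization" computes, the sketch below pools a feature map into a covariance matrix and raises it to the power 0.5 through an eigendecomposition. It is a NumPy toy under stated assumptions, not the code in this repository: the function names, the `(d, N)` feature layout and the choice `alpha = 0.5` are illustrative.

```python
import numpy as np


def covariance_pooling(features: np.ndarray) -> np.ndarray:
    """Pool a (d, N) feature matrix (d channels, N positions) into a (d, d) covariance."""
    n = features.shape[1]
    centered = features - features.mean(axis=1, keepdims=True)
    return centered @ centered.T / n


def matrix_power_normalize(cov: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Raise a symmetric PSD matrix to the power alpha via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Clip tiny negative eigenvalues caused by round-off before taking the power.
    eigvals = np.clip(eigvals, 0.0, None)
    return (eigvecs * eigvals**alpha) @ eigvecs.T


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 49))  # hypothetical input: 8 channels on a 7x7 feature map
cov = covariance_pooling(x)
z = matrix_power_normalize(cov, alpha=0.5)
```

With `alpha = 0.5` the output is the matrix square root of the covariance, so `z @ z` recovers `cov` up to round-off. The eigendecomposition is used here only because it is the shortest way to write the operation down.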

References

DropCov: A Simple yet Effective Method for Improving Deep Architectures. Qilong Wang, Mingze Gao, Zhaolin Zhang, Jiangtao Xie, Peihua Li, Qinghua Hu. Advances in Neural Information Processing Systems (NeurIPS), 2022.

What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective. Qilong Wang, Li Zhang, Banggu Wu, Dongwei Ren, Peihua Li, Wangmeng Zuo, Qinghua Hu. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

Introduction

In this paper, we attempt to understand what deep CNNs gain from GCP from the viewpoint of optimization. Specifically, we explore the effect of GCP on deep CNNs in terms of the Lipschitzness of the optimization loss and the predictiveness of gradients, and show that GCP makes the optimization landscape smoother and the gradients more predictive. Furthermore, we discuss the connection between GCP and second-order optimization for deep CNNs. More importantly, these findings account for several merits of covariance pooling for training deep CNNs that have not been recognized previously or fully explored: significant acceleration of network convergence (i.e., networks trained with GCP can support rapid decay of learning rates, achieving favorable performance while significantly reducing the number of training epochs), stronger robustness to distorted examples generated by image corruptions and perturbations, and good generalization to different vision tasks, e.g., object detection and instance segmentation.
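One common way to probe the Lipschitzness of the loss and the predictiveness of gradients is to move along the current gradient direction over a range of step sizes and record how much the loss and the gradient change. The sketch below does this on toy quadratic losses. It is an assumption about how such a probe can be written, not the measurement code used in the paper; `probe_along_gradient`, `make_quadratic` and the two toy losses are hypothetical.

```python
import numpy as np


def probe_along_gradient(loss_fn, grad_fn, w, step_sizes):
    """Return (loss variation, largest gradient change) along the descent direction at w."""
    g = grad_fn(w)
    losses, grad_changes = [], []
    for eta in step_sizes:
        w_step = w - eta * g
        losses.append(loss_fn(w_step))
        # L2 distance between the gradient at w and the gradient after the step.
        grad_changes.append(np.linalg.norm(g - grad_fn(w_step)))
    return max(losses) - min(losses), max(grad_changes)


def make_quadratic(curvatures):
    """Toy loss 0.5 * w^T A w with A = diag(curvatures), plus its gradient A w."""
    a = np.diag(curvatures)
    return (lambda w: 0.5 * w @ a @ w), (lambda w: a @ w)


w0 = np.ones(2)
steps = np.linspace(0.0, 0.02, 5)
sharp = probe_along_gradient(*make_quadratic([1.0, 100.0]), w0, steps)
smooth = probe_along_gradient(*make_quadratic([1.0, 1.0]), w0, steps)
```

A smaller loss variation suggests a smoother landscape along the descent direction, and a smaller gradient change suggests more predictive gradients. In this toy setup the well-conditioned quadratic (`smooth`) scores lower on both measures than the ill-conditioned one (`sharp`).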

This work first analyzes the effect of post-normalization from the perspective of training GCP networks. In particular, we show for the first time that effective post-normalization strikes a good trade-off between representation decorrelation and information preservation for GCP, which are crucial for alleviating over-fitting and increasing the representation ability of deep GCP networks, respectively. Based on this finding, we improve existing post-normalization methods with small modifications, which further supports our observation. This finding also motivates a novel pre-normalization method for GCP, namely DropCov, which applies adaptive channel dropout to the features right before GCP, aiming to reach the trade-off between representation decorrelation and information preservation more efficiently. DropCov has only linear complexity, O(d), and adds no cost at inference.
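The idea of dropping whole channels right before covariance pooling can be sketched as follows. This is a simplified NumPy illustration, not DropCov itself: the adaptive rule that sets the drop rate is not described in this README, so a fixed `drop_prob` stands in for it, and all names here are hypothetical.

```python
import numpy as np


def channel_dropout(features, drop_prob, rng, training=True):
    """Zero whole channels of a (d, N) feature matrix during training; identity at inference."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must lie in [0, 1)")
    if not training or drop_prob == 0.0:
        return features
    # One Bernoulli draw per channel, so the mask has only d entries.
    keep = rng.random(features.shape[0]) >= drop_prob
    # Rescale kept channels so the expected activation is unchanged.
    return features * keep[:, None] / (1.0 - drop_prob)


def covariance_pooling(features):
    """Pool a (d, N) feature matrix into a (d, d) covariance matrix."""
    centered = features - features.mean(axis=1, keepdims=True)
    return centered @ centered.T / features.shape[1]


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 49))  # hypothetical input: 8 channels, 49 positions
x_train = channel_dropout(x, drop_prob=0.5, rng=rng, training=True)
train_cov = covariance_pooling(x_train)
eval_cov = covariance_pooling(channel_dropout(x, drop_prob=0.5, rng=rng, training=False))
```

A dropped channel contributes an all-zero row and column to the training-time covariance, which removes its correlations with every other channel. At inference the features pass through unchanged, which is the sense in which this kind of pre-normalization adds no inference cost.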

Results

Main Results on ImageNet with Pretrained Models

| Works | Paper | MindSpore | PyTorch |
| --- | --- | --- | --- |
| What Deep CNNs Benefit from Global Covariance Pooling: An Optimization Perspective | Link | Link | Link |
| DropCov: A Simple yet Effective Method for Improving Deep Architectures | Link | Link | Link |

Results with MindSpore

| Method | Top-1 Acc. (%) | Params. (M) | FLOPs (G) | Checkpoint |
| --- | --- | --- | --- | --- |
| ResNet-18 | 70.53 | 11.7 | 1.81 | |
| ResNet-18+Fast-MPN (Ours) | 74.64 | 19.6 | 3.11 | Download |
| ResNet-18+Drop-COV (Ours) | 73.8 | 19.3 | 3.11 | Download |
| ResNet-34 | 73.68 | 21.8 | 3.66 | |
| ResNet-34+Fast-MPN (Ours) | 76.27 | 29.7 | 5.56 | Download |
| ResNet-34+Drop-COV (Ours) | 76.13 | 29.6 | 5.56 | Download |
| ResNet-50 | 76.07 | 25.6 | 3.86 | |
| ResNet-50+Fast-MPN (Ours) | 77.71 | 32.3 | 6.19 | Download |
| ResNet-50+Drop-COV (Ours) | 77.77 | 32.0 | 6.19 | Download |

Results with PyTorch

| Method | Top-1 Acc. (%) | Params. (M) | FLOPs (G) | Checkpoint |
| --- | --- | --- | --- | --- |
| ResNet-34 | 74.19 | 21.8 | 3.66 | |
| ResNet-34+Fast-MPN (Ours) | 76.80 | 29.7 | 5.56 | Download |
| ResNet-34+Drop-COV (Ours) | 76.81 | 29.6 | 5.56 | Download |
| ResNet-50 | 76.02 | 25.6 | 3.86 | |
| ResNet-50+Fast-MPN (Ours) | 78.56 | 32.3 | 6.19 | Download |
| ResNet-50+Drop-COV (Ours) | 78.19 | 32.0 | 6.19 | Download |
| ResNet-101 | 77.67 | 44.6 | 7.57 | |
| ResNet-101+Fast-MPN (Ours) | 79.47 | 51.3 | 9.90 | Download |
| ResNet-101+Drop-COV (Ours) | 79.51 | 51.0 | 9.90 | Download |
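
To read the PyTorch results as gains over each baseline, the short script below subtracts the baseline row from the GCP rows. The numbers are copied from the table. The dictionary layout and the function name are illustrative only.

```python
# (top-1 accuracy in %, GFLOPs), copied from the PyTorch results table.
pytorch_results = {
    "ResNet-34": {"baseline": (74.19, 3.66), "Fast-MPN": (76.80, 5.56), "Drop-COV": (76.81, 5.56)},
    "ResNet-50": {"baseline": (76.02, 3.86), "Fast-MPN": (78.56, 6.19), "Drop-COV": (78.19, 6.19)},
    "ResNet-101": {"baseline": (77.67, 7.57), "Fast-MPN": (79.47, 9.90), "Drop-COV": (79.51, 9.90)},
}


def gains_over_baseline(results):
    """Map each backbone and method to (top-1 gain in points, extra GFLOPs)."""
    summary = {}
    for backbone, rows in results.items():
        base_acc, base_flops = rows["baseline"]
        summary[backbone] = {
            method: (round(acc - base_acc, 2), round(flops - base_flops, 2))
            for method, (acc, flops) in rows.items()
            if method != "baseline"
        }
    return summary


summary = gains_over_baseline(pytorch_results)
for backbone, methods in summary.items():
    for method, (acc_gain, extra_flops) in methods.items():
        print(f"{backbone}+{method}: +{acc_gain:.2f} top-1, +{extra_flops:.2f} GFLOPs")
```

On these numbers the top-1 gain ranges from about 1.8 points (ResNet-101) to about 2.6 points (ResNet-34), at an added cost of 1.90 to 2.33 GFLOPs.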
