Delving deep into spatial pooling for squeeze-and-excitation networks


Full-text links:

NSTL
Elsevier

Abstract: Squeeze-and-Excitation (SE) blocks have demonstrated significant accuracy gains for state-of-the-art deep architectures by re-weighting channel-wise feature responses. The SE block is an architectural unit that integrates two operations: a squeeze operation that employs global average pooling to aggregate spatial convolutional features into a channel feature, and an excitation operation that learns instance-specific channel weights from the squeezed feature to re-weight each channel. In this paper, we revisit the squeeze operation in SE blocks and shed light on why and how to embed rich (both global and local) information into the excitation module at minimal extra cost. In particular, we introduce a simple but effective two-stage spatial pooling process: rich descriptor extraction and information fusion. The rich descriptor extraction step aims to obtain a set of diverse (i.e., global and especially local) deep descriptors that contain more informative cues than global average pooling. Absorbing more of the information delivered by these descriptors via a fusion step can then help the excitation operation return more accurate re-weighting scores in a data-driven manner. We validate the effectiveness of our method by extensive experiments on ImageNet for image classification and on MS-COCO for object detection and instance segmentation. In these experiments, our method achieves consistent improvements over SENets on all tasks, in some cases by a large margin. (c) 2021 Published by Elsevier Ltd.
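The squeeze and excitation operations described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the abstract does not say which descriptors are extracted or how they are fused, so `rich_descriptors` (a global average plus a grid of local averages) and the linear fusion weights `fusion_w` are hypothetical stand-ins, and the weight matrices `w1` and `w2` are random rather than learned.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def excite(s, w1, w2):
    """Excitation: bottleneck FC + ReLU, then FC + sigmoid -> channel weights."""
    hidden = np.maximum(w1 @ s, 0.0)   # shape (C // r,)
    return sigmoid(w2 @ hidden)        # shape (C,)


def se_block(x, w1, w2):
    """Standard SE block on one feature map x of shape (C, H, W)."""
    s = x.mean(axis=(1, 2))            # squeeze: global average pooling -> (C,)
    a = excite(s, w1, w2)
    return x * a[:, None, None], a     # re-weight each channel


def rich_descriptors(x, grid=2):
    """ASSUMPTION: one global average plus grid*grid local averages per channel."""
    _, h, w = x.shape
    descs = [x.mean(axis=(1, 2))]
    for rows in np.array_split(np.arange(h), grid):
        for cols in np.array_split(np.arange(w), grid):
            descs.append(x[:, rows][:, :, cols].mean(axis=(1, 2)))
    return np.stack(descs, axis=1)     # shape (C, 1 + grid * grid)


def se_block_rich(x, fusion_w, w1, w2, grid=2):
    """ASSUMPTION: fuse descriptors with a weighted sum, then excite as usual."""
    s = rich_descriptors(x, grid) @ fusion_w   # fusion -> (C,)
    a = excite(s, w1, w2)
    return x * a[:, None, None], a


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))     # C=8 channels, 4x4 spatial map
w1 = rng.standard_normal((2, 8))       # reduction ratio r=4 (illustrative)
w2 = rng.standard_normal((8, 2))
fusion_w = np.full(5, 0.2)             # uniform weights over 1 global + 4 local

y, a = se_block(x, w1, w2)
y_rich, a_rich = se_block_rich(x, fusion_w, w1, w2)
```

With uniform fusion weights and equal-size local regions, the fused descriptor reduces to the global average, so the two variants produce the same channel weights here. Any gain from the richer descriptors would have to come from fusion weights that are learned rather than fixed.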

Keywords:

Convolutional neural networks; Squeeze-and-excitation; Spatial pooling; Base model; IMAGE; CLASSIFICATION; ATTENTION

Authors:

Jin, Xin; Xie, Yanping; Wei, Xiu-Shen; Zhao, Bo-Rui; Chen, Zhao-Min; Tan, Xiaoyang


Affiliations:

Megvii Technol

Nanjing Univ Sci & Technol

Nanjing Univ Aeronaut & Astronaut

Publication year:

2022

DOI:

10.1016/j.patcog.2021.108159

Pattern Recognition

EI; SCI

ISSN: 0031-3203

Year, volume (issue): 2022, 121

Citations: 25
References: 54