Dynamic Stride Convolution and Its Inter-Layer Interpretable Method

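Chinese abstract (translated): Image processing methods based on convolutional neural networks set the convolution stride to a fixed value that is independent of the input image, so important and unimportant regions of the input receive equal convolution resources, which leads to unreasonable resource allocation and network redundancy. To address this problem, this paper proposes dynamic stride convolution (DSC), which learns a set of input-dependent offsets to modify the stride of the convolution kernel and adaptively allocates more computation to regions of interest. In addition, the learned offsets are used to visualize the convolution distribution, yielding an inter-layer interpretable analysis method that generates intuitive interpretability maps at very low computational cost and helps researchers analyze the attention distribution between convolutional layers. To further optimize the allocation of convolution resources, a new loss function is designed that effectively improves the performance of DSC and enables editing of resource locations; combined with the inter-layer interpretable analysis method, this resource editing can be visualized. DSC is embedded into different tasks such as object detection and image segmentation, and experimental results show that the mAP (mean Average Precision) of different networks on the COCO dataset increases by more than 2%, demonstrating the effectiveness of DSC.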

English title: Dynamic Stride Convolution and Its Inter-Layer Interpretable Method

English abstract: Image processing schemes based on convolutional neural networks set the convolution stride to a fixed value independent of the input image. Allocating equal convolution resources to both important and unimportant areas of the input image leads to unreasonable resource allocation and network redundancy. To address this problem, we propose dynamic stride convolution (DSC), which modifies the strides of the convolution kernel by learning a set of offsets related to the input data and adaptively allocates more computation to regions of interest. Furthermore, an inter-layer interpretable method is proposed to visualize the convolution distribution using the learned offsets; it generates intuitive interpretable diagrams at very low computational cost and helps researchers analyze the attention distribution between convolutional layers. To further optimize convolutional resource allocation, a new loss function is designed to effectively improve the performance of DSC and enable the editing of resource locations, and the inter-layer interpretable analysis method is used to visualize this resource editing. DSC is embedded into different tasks such as object detection and image segmentation, and experimental results show that the mAP of different networks on the COCO dataset generally increases by more than two percent, which demonstrates the effectiveness of the DSC method.
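The core idea in the abstract, shifting where a convolution samples by adding input-dependent offsets to a fixed stride, can be illustrated with a small 1-D sketch. This is not the authors' implementation: the abstract does not say how the offsets are predicted or applied, so the function name `dynamic_stride_conv1d`, the use of linear interpolation for fractional positions, the edge clamping, and the hand-picked offsets are all assumptions made for illustration.

```python
import numpy as np

def dynamic_stride_conv1d(signal, kernel, base_stride, offsets):
    """Illustrative 1-D convolution whose sampling centres are shifted by offsets.

    Hypothetical helper, not the paper's method: in DSC the offsets are
    learned from the input, whereas here they are passed in by hand.
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    n = len(signal)

    # Centres of an ordinary fixed-stride convolution: 0, s, 2s, ...
    base = np.arange(0, n, base_stride, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    if offsets.shape != base.shape:
        raise ValueError("need exactly one offset per output position")

    # Shift every centre by its offset and keep it inside the signal.
    centres = np.clip(base + offsets, 0, n - 1)

    # Positions of each kernel tap around each centre, clamped at the edges.
    half = len(kernel) // 2
    taps = np.arange(-half, half + 1)
    pos = np.clip(centres[:, None] + taps[None, :], 0, n - 1)

    # Fractional positions are read with linear interpolation (an assumption).
    patches = np.interp(pos.ravel(), np.arange(n), signal).reshape(pos.shape)
    return patches @ kernel, centres

signal = np.arange(8.0)
identity = np.array([0.0, 1.0, 0.0])

# Zero offsets reproduce an ordinary stride-2 convolution.
fixed_out, fixed_centres = dynamic_stride_conv1d(signal, identity, 2, np.zeros(4))

# Non-zero offsets move the sampling centres, here toward the middle.
dyn_out, dyn_centres = dynamic_stride_conv1d(signal, identity, 2, [0.5, 0.0, 0.0, -1.0])
```

The returned `centres` array is the kind of quantity an interpretability view could plot: where the centres cluster, more convolution is being spent. In this toy setup that view costs nothing extra, since the positions are already computed during the forward pass.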

English keywords:

computer vision; convolution kernel; dynamic stride convolution; inter-layer interpretable analysis

Authors:

Zhang Shufang (张淑芳), Guo Zilin (郭子林), Ding Wenxin (丁文鑫), Luo Xizhe (罗曦哲), Guo Jichang (郭继昌)

Affiliation:

School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China

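Keywords (translated from the Chinese):

computer vision; convolution kernel; dynamic stride convolution; inter-layer interpretable analysis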

Funding:

National Natural Science Foundation of China

Grant number:

62171315

Publication year:

2024

DOI:

10.12263/DZXB.20231039

Acta Electronica Sinica (电子学报)

Chinese Institute of Electronics (中国电子学会)

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.237

ISSN: 0372-2112

Year, volume (issue): 2024, 52(10)