A Lightweight Pyramid Convolution
QIN Binbin 1, SUN Jinyang 2
Author Information
- 1. School of Information Technology, Zhejiang Institute of Economics and Trade, Hangzhou 310018, Zhejiang, China
- 2. School of Design and Art, Zhejiang Industry Polytechnic College, Shaoxing 312000, Zhejiang, China
Abstract
Pyramid convolution (PyConv) is a recently proposed pyramid-shaped multi-layer structure that extracts multi-scale feature information and has been applied to a variety of computer vision tasks; however, it suffers from high redundancy and a large number of parameters. This article therefore proposes a lightweight pyramid convolution, light_Pyconv, which uses convolutional decomposition and group convolution to reduce convolutional redundancy, and introduces residual units, channel shuffling, and an attention mechanism into the design to maintain network accuracy and accelerate the extraction of effective features. On the VGG13 network, the number of parameters decreases from 1.96M to 0.56M, while accuracy on the CIFAR-10 and CIFAR-100 datasets drops by only 0.87% and 0.04%, respectively; on the ResNet18 network, the number of parameters decreases from 9.22M to 7.72M, while accuracy on the two datasets drops by only 0.24% and 0.76%, respectively. While reducing model size, light_Pyconv still outperforms the original network structure in convergence speed and accuracy stability.
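Two of the ingredients the abstract names, group convolution and channel shuffling, account for most of the parameter savings. The sketch below (ours, not the paper's implementation) illustrates both with plain numpy: a parameter-count formula showing how grouping cuts a convolution's weights by the group factor, and a ShuffleNet-style channel shuffle that re-interleaves channels so subsequent grouped convolutions still mix information across groups.

```python
import numpy as np

def group_conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with `groups` groups (bias omitted).
    groups=1 is a standard convolution; groups=c_in with c_out=c_in is a
    depthwise convolution. Grouping divides the parameter count by `groups`."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k

def channel_shuffle(x, groups):
    """Interleave channels across groups so the next grouped convolution
    sees channels from every group. x has shape (N, C, H, W)."""
    n, c, h, w = x.shape
    assert c % groups == 0
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)  # swap the group and per-group channel axes
    return x.reshape(n, c, h, w)

# A 3x3 convolution from 64 to 64 channels: grouping by 4 cuts parameters 4x.
print(group_conv_params(64, 64, 3))            # 36864
print(group_conv_params(64, 64, 3, groups=4))  # 9216
```

With groups=2 on four channels [0, 1, 2, 3], the shuffle produces [0, 2, 1, 3]: each half of the channel axis now contains one channel from each original group, which is what keeps accuracy up when grouped convolutions are stacked.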
Keywords
pyramid convolution / lightweight network / multi-scale feature / convolutional neural network / depthwise convolution / channel attention
Publication Year
2024