Objective Convolutional neural networks have achieved breakthroughs in computer vision, speech recognition, and other fields. However, with the continuous pursuit of neural network models with excellent performance, the structure of these models has become increasingly complex, mainly reflected in the width and number of their layers. Accordingly, the size and computing resource requirements of these models keep expanding, and such enormous resource consumption restricts them to server platforms with abundant computing power and other resources. As deep learning networks gradually move toward application-end devices, many network models cannot be deployed on resource-constrained embedded devices, such as smartphones, low-end mainboards, and edge devices. To resolve the contradiction between the computing resource requirements of network models and the limited resources of embedded devices, existing complex models should be compressed. Building on extant model pruning methods, this article proposes a two-stage filter pruning method that incorporates cosine spatial correlation (CSCTFP), which improves pruning performance by exploiting the spatial correlation between filters. CSCTFP also relies on this spatial correlation to identify the filter bank that contributes the most to the network, thereby avoiding the suboptimal pruning results caused by the assumption that "if the measurement index is small, the measurement object is not important".

Method Existing model pruning methods are mainly divided into two types. The first type, called unstructured pruning, uses the weight parameters of a filter as the minimum pruning unit. However, this method leads to unstructured sparsity of the filters: the pruned network structure cannot use existing software and hardware to achieve acceleration and instead requires a dedicated accelerator to speed up unstructured sparse matrix computation. The second type, called structured
pruning, takes the whole filter as the smallest pruning unit. This method yields a structured, sparse network that can readily use existing software and hardware for acceleration. Existing filter pruning methods mainly adopt the assumption that "if the measurement index is small, the measurement object is not important" as the evaluation criterion for filters, e.g., using the kernel norm of a filter as its importance index. Alternatively, the "similarity is redundancy" assumption can serve as a criterion for evaluating filter redundancy, e.g., using the distance between filters as a measure of redundancy. Both assumptions require prerequisite conditions to be met and do not always hold in actual scenarios. CSCTFP addresses these shortcomings as follows. First, in the pre-pruning stage, instead of deleting small-norm filters, CSCTFP identifies the filter represented by the maximum norm value, referred to as the key filter in this article. Second, in the pruning stage, a set of filters that are highly correlated with the key filter is preserved by computing the cosine distance. Measuring the correlation between filters in these two stages avoids poor pruning results when the above two assumptions do not hold.

Result Experiments were conducted on various network structures, including visual geometry group (VGG) 16, residual neural network (ResNet) 56, and MobileNet V1, to verify that the proposed method adapts to different types of network models with sequential, residual, and depthwise separable structures. The experimental results on the CIFAR10 and CIFAR100 datasets were compared with those of previous methods. On the CIFAR10 dataset, the parameter count and floating-point operations (FLOPs) of VGG16 were compressed by 72.9% and 73.5%, respectively, while model accuracy improved by 0.1%. Compared with the HRank pruning method, CSCTFP can compress more floating-point operations
and reduce accuracy loss (the accuracy of the HRank method decreased by 0.62%). For the efficient residual network ResNet56, CSCTFP can compress 53.81% of FLOPs with an accuracy increase of 0.33%, and its accuracy loss is much lower than those obtained by SFP, FPGM, and NSPPR. The efficient depthwise separable network MobileNet V1 can also be effectively compressed: CSCTFP compresses 46.23% of FLOPs and 46.89% of parameters while improving accuracy by 0.11%. CSCTFP thus demonstrates a better compression effect than DCP, which reduces accuracy by 0.3% while compressing only 42.86% of FLOPs and 30.07% of parameters. CSCTFP also achieves good compression performance on more complex datasets, such as CIFAR100. For VGG16, CSCTFP compresses more FLOPs (33.35%) with a much lower accuracy loss than Variational and DeepPruningEs. For ResNet56, CSCTFP compresses 43.02% of parameters and 40.36% of FLOPs and achieves an accuracy improvement of 0.48%, whereas the comparison methods OICSR and NSPPR compress fewer FLOPs and suffer higher accuracy loss. In addition, CSCTFP is applicable not only to image classification but also to object detection tasks: the pruned lightweight face detection model RetinaFace performs well on the easy and medium validation sets of the WiderFace dataset. CSCTFP was further compared against the assumptions "if the measurement index is small, the measurement object is not important" and "similarity is redundancy" and consistently shows accuracy improvements at different pruning ratios.

Conclusion CSCTFP takes into account the uncertainty of the assumptions "if the measurement index is small, the object being measured is not important" and "similarity is redundancy", thus avoiding suboptimal results in the pruned model when these two assumptions fail. CSCTFP further improves the accuracy and compression rate of pruning by searching for key filters and using the spatial correlation between filters. A large number of
experiments have confirmed the effectiveness of CSCTFP and its advantages over other extant methods. The iterative pruning method used in this article can compress the network model finely, but further research is needed to reduce the time cost and to avoid manually setting the pruning ratio.
Keywords: deep learning; neural network; model compression; cosine distance; filter pruning
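The two-stage selection described in the Method section can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function name, the `keep_ratio` parameter, the per-layer flattening convention, and the ranking-by-distance rule are all assumptions made for the example.

```python
import numpy as np

def select_filters(filters, keep_ratio=0.5):
    """Illustrative two-stage selection: find the key filter by maximum norm,
    then keep the filters closest to it in cosine distance."""
    # Stage 1 (pre-pruning): instead of deleting small-norm filters, locate
    # the key filter, i.e., the one with the largest L2 norm.
    norms = np.linalg.norm(filters, axis=1)
    key = int(np.argmax(norms))

    # Stage 2 (pruning): compute the cosine distance of every filter to the
    # key filter; the group most correlated with the key filter is preserved.
    unit = filters / (norms[:, None] + 1e-12)      # row-wise unit vectors
    cos_dist = 1.0 - unit @ unit[key]              # 0 = identical direction
    n_keep = max(1, int(round(keep_ratio * len(filters))))
    keep = np.sort(np.argsort(cos_dist)[:n_keep])  # smallest distance first
    return keep

# Toy usage: 6 random "filters", each flattened from a 3 x 3 x 2 kernel.
rng = np.random.default_rng(0)
w = rng.standard_normal((6, 18))
kept = select_filters(w, keep_ratio=0.5)  # indices of the 3 retained filters
```

Because the key filter's cosine distance to itself is zero, it is always among the retained indices. In an actual pruning pipeline, such a selection would be applied per convolutional layer and followed by fine-tuning to recover accuracy.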