Objective Convolutional neural networks have achieved breakthroughs in computer vision, speech recognition, and other fields. However, with the continuous pursuit of neural network models with excellent performance, the structure of these models has become increasingly complex, mainly reflected in the width and number of their layers. Accordingly, the size and computing resource requirements of these models keep expanding, and such enormous resource consumption restricts them to server platforms with abundant computing power and other resources. As deep learning networks gradually move toward application-end devices, many network models cannot be deployed on resource-constrained embedded devices, such as smartphones, low-end mainboards, and edge devices. To resolve the contradiction between the computing resource requirements of network models and the limited resources of embedded devices, existing complex models should be compressed. Building on extant model pruning methods, this article proposes a two-stage filter pruning method that incorporates cosine spatial correlation (CSCTFP), which improves pruning performance by exploiting the spatial correlation between filters. CSCTFP also relies on this spatial correlation to identify the filter bank that contributes the most to the network, thereby avoiding the suboptimal pruning results caused by the assumption that "if the measurement index is small, the measurement object is not important".

Method Existing model pruning methods are mainly divided into two types. The first type, called unstructured pruning, uses the weight parameters of a filter as the minimum pruning unit. However, this method leads to unstructured sparsity of the filters: the pruned network structure cannot use existing software and hardware to achieve acceleration and instead requires a dedicated accelerator to speed up unstructured sparse matrix computation. The second type, called structured
pruning, takes the whole filter as the smallest pruning unit. This method yields a structured, sparse network that can readily use existing software and hardware for acceleration. Existing filter pruning methods mainly adopt the assumption that "if the measurement index is small, the measurement object is not important" as the evaluation criterion for filters, e.g., using the kernel norm of a filter as its importance index. Alternatively, the "similarity is redundancy" assumption can serve as a criterion for evaluating filter redundancy, e.g., using the distance between filters as a measure of redundancy. Both assumptions require prerequisite conditions to be met and do not always hold in actual scenarios. CSCTFP addresses these shortcomings as follows. First, in the pre-pruning stage, instead of deleting small-norm filters, CSCTFP identifies the filter represented by the maximum norm value, referred to as the key filter in this article. Second, in the pruning stage, a set of filters that are highly correlated with the key filter is preserved by computing the cosine distance. Measuring the correlation between filters in these two stages avoids poor pruning results when the above two assumptions do not hold.

Result Experiments were conducted on various network structures, including visual geometry group (VGG) 16, residual neural network (ResNet) 56, and MobileNet V1, to verify that the proposed method adapts to different types of network models with sequential, residual, and depthwise separable structures. The experimental results on the CIFAR10 and CIFAR100 datasets were compared with those of previous methods. On the CIFAR10 dataset, the parameter count and floating-point operations (FLOPs) of VGG16 were compressed by 72.9% and 73.5%, respectively, while model accuracy improved by 0.1%. Compared with the HRank pruning method, CSCTFP can compress more floating-point operations
and reduce accuracy loss (the accuracy of the HRank method decreased by 0.62%). For the efficient residual network ResNet56, CSCTFP can compress 53.81% of FLOPs with an accuracy increase of 0.33%, and its accuracy loss is much lower than those obtained by SFP, FPGM, and NSPPR. The efficient depthwise separable network MobileNet V1 can also be effectively compressed: CSCTFP compresses 46.23% of FLOPs and 46.89% of parameters while improving accuracy by 0.11%. CSCTFP thus demonstrates a better compression effect than DCP, which reduces accuracy by 0.3% while compressing only 42.86% of FLOPs and 30.07% of parameters. CSCTFP also achieves good compression performance on more complex datasets, such as CIFAR100. For VGG16, CSCTFP compresses more FLOPs (33.35%) with a much lower accuracy loss than Variational and DeepPruningEs. For ResNet56, CSCTFP compresses 43.02% of parameters and 40.36% of FLOPs and achieves an accuracy improvement of 0.48%, whereas the comparison methods OICSR and NSPPR compress fewer FLOPs and suffer higher accuracy loss. In addition, CSCTFP is applicable not only to image classification but also to object detection tasks: the pruned lightweight face detection model RetinaFace performs well on the easy and medium validation sets of the WiderFace dataset. CSCTFP was further compared against the assumptions "if the measurement index is small, the measurement object is not important" and "similarity is redundancy" and consistently shows accuracy improvements at different pruning ratios.

Conclusion CSCTFP takes into account the uncertainty of the assumptions "if the measurement index is small, the object being measured is not important" and "similarity is redundancy", thus avoiding suboptimal results in the pruned model when these two assumptions fail. CSCTFP further improves the accuracy and compression rate of pruning by searching for key filters and using the spatial correlation between filters. A large number of
experiments have confirmed the effectiveness of CSCTFP and its advantages over other extant methods. The iterative pruning method used in this article can compress the network model finely, but further research is needed to reduce the time cost and to avoid manually setting the pruning ratio.
Keywords: deep learning; neural network; model compression; cosine distance; filter pruning
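The two-stage selection described in the Method section can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function name, the `keep_ratio` parameter, the per-layer flattening convention, and the ranking-by-distance rule are all assumptions made for the example.

```python
import numpy as np

def select_filters(filters, keep_ratio=0.5):
    """Illustrative two-stage selection: find the key filter by maximum norm,
    then keep the filters closest to it in cosine distance."""
    # Stage 1 (pre-pruning): instead of deleting small-norm filters, locate
    # the key filter, i.e., the one with the largest L2 norm.
    norms = np.linalg.norm(filters, axis=1)
    key = int(np.argmax(norms))

    # Stage 2 (pruning): compute the cosine distance of every filter to the
    # key filter; the group most correlated with the key filter is preserved.
    unit = filters / (norms[:, None] + 1e-12)      # row-wise unit vectors
    cos_dist = 1.0 - unit @ unit[key]              # 0 = identical direction
    n_keep = max(1, int(round(keep_ratio * len(filters))))
    keep = np.sort(np.argsort(cos_dist)[:n_keep])  # smallest distance first
    return keep

# Toy usage: 6 random "filters", each flattened from a 3 x 3 x 2 kernel.
rng = np.random.default_rng(0)
w = rng.standard_normal((6, 18))
kept = select_filters(w, keep_ratio=0.5)  # indices of the 3 retained filters
```

Because the key filter's cosine distance to itself is zero, it is always among the retained indices. In an actual pruning pipeline, such a selection would be applied per convolutional layer and followed by fine-tuning to recover accuracy.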