Lightweighting Methods for Neural Network Models: A Review
In recent years, owing to their strong feature extraction capability, neural network models have been applied ever more widely across industries and have achieved good results. However, with growing data volumes and the pursuit of higher accuracy, the parameter counts and architectural complexity of neural network models have increased dramatically, inflating computation, storage, and other resource overheads and making deployment in resource-constrained scenarios extremely challenging. How to lightweight a model without degrading its performance, and thereby reduce training and deployment costs, has therefore become a current research hotspot. This paper summarizes and analyzes typical model lightweighting methods from two perspectives, complex model compression and lightweight model design, in order to clarify the development of model compression technology. Complex model compression techniques are surveyed in five categories: model pruning, model quantization, low-rank decomposition, knowledge distillation, and hybrid approaches. Lightweight model design is organized into three categories: spatial convolution design, shifted convolution design, and neural architecture search.