Design of Lightweight Model with Improved MetaFormer
To improve the long-range feature extraction capability of traditional CNNs, this paper presents a lightweight CNN model, ViTNet, which embeds depthwise separable convolution into an improved MetaFormer architecture and adds a channel shuffle operation. This fusion strategy retains the flexibility and scalability of ViTs while enhancing the image feature extraction capability of the CNN model. Furthermore, its lightweight design enables ViTNet to operate efficiently on devices with restricted computing resources. Experimental results on CIFAR-10 show that ViTNet-1.0× is more competitive than MobileNetV2, with accuracy improved by 1.8% and latency reduced by 32%.
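As an illustration of the channel shuffle operation mentioned above, the following is a minimal NumPy sketch of the ShuffleNet-style formulation. It is not taken from the ViTNet implementation: the function name `channel_shuffle`, the `groups` parameter, and the NCHW tensor layout are assumptions made for this example, and the abstract does not state how many groups ViTNet uses or where in the block the shuffle is applied.

```python
import numpy as np


def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave channels across groups (hypothetical helper, NCHW layout assumed)."""
    n, c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    # Split the channel axis: (N, C, H, W) -> (N, groups, C // groups, H, W)
    x = x.reshape(n, groups, c // groups, h, w)
    # Swap the group axis and the per-group channel axis
    x = x.transpose(0, 2, 1, 3, 4)
    # Merge back to (N, C, H, W); channels from different groups are now interleaved
    return x.reshape(n, c, h, w)


# Six channels labelled 0..5, split into two groups: [0, 1, 2] and [3, 4, 5]
x = np.arange(6).reshape(1, 6, 1, 1)
print(channel_shuffle(x, groups=2).ravel().tolist())  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive pair of channels contains one channel from each original group, so a subsequent grouped or depthwise convolution sees information from both groups. The operation is a fixed permutation and adds no learnable parameters.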