Hourglass attention and progressive hybrid Transformer for image classification
Transformer is widely used in image classification tasks, but on small datasets it suffers from limited training data and a large number of model parameters, which lead to low classification accuracy and slow convergence. To address this, a progressive hybrid Transformer model with hourglass attention is proposed. First, global feature relationships are modeled by an hourglass self-attention with down-up sampling, in which up-sampling supplements the information lost by the down-sampling operation, while learnable temperature parameters and a negative diagonal mask sharpen the attention score distribution to avoid the over-smoothing caused by stacking many layers. Second, progressive down-sampling modules are designed to obtain fine-grained multi-scale feature maps, which effectively capture low-dimensional feature information. Finally, a hybrid architecture is adopted: the designed hourglass attention is used in the top stages, a pooling layer replaces the attention module in the bottom stages, and layer normalization with depthwise convolution is introduced to strengthen the network's locality. The proposed method is evaluated on the T-ImageNet, CIFAR-10, CIFAR-100, and SVHN datasets, reaching a classification accuracy of 97.42% with 3.41 GFLOPs of computation and 25M parameters. The experimental results show that, compared with the baseline algorithms, the proposed method significantly improves classification accuracy while substantially reducing computation and parameters, improving the performance of Transformer models on small datasets.
image classification for small datasets; Transformer; hourglass attention; multi-scale features; hybrid architecture
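
The abstract's hourglass self-attention combines three mechanisms: down-up sampling of the token grid, a learnable temperature, and a negative diagonal mask. The PyTorch sketch below illustrates one plausible reading of how these pieces fit together; the class name, the average-pooling and bilinear-interpolation choices, the pooling ratio, and the residual placement are all assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch of an hourglass self-attention block, assuming: average
# pooling for down-sampling, bilinear interpolation for up-sampling, and a
# residual from the un-pooled input. The paper's actual design may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HourglassAttention(nn.Module):
    def __init__(self, dim, num_heads=8, pool_ratio=2):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.pool_ratio = pool_ratio
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # Learnable per-head temperature that sharpens the score distribution.
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))

    def forward(self, x, H, W):
        B, N, C = x.shape  # N == H * W tokens on a 2-D grid
        # Down-sample the token grid so attention runs on fewer tokens.
        x2d = x.transpose(1, 2).reshape(B, C, H, W)
        x_small = F.avg_pool2d(x2d, self.pool_ratio)
        h, w = x_small.shape[-2:]
        tokens = x_small.flatten(2).transpose(1, 2)  # (B, h*w, C)

        qkv = self.qkv(tokens).reshape(B, h * w, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each (B, heads, h*w, head_dim)

        # Temperature-scaled scores; -inf on the diagonal stops each token
        # from attending to itself, one known remedy for over-smoothing.
        attn = (q @ k.transpose(-2, -1)) / (self.head_dim ** 0.5)
        attn = attn / self.temperature
        diag = torch.eye(h * w, device=x.device, dtype=torch.bool)
        attn = attn.masked_fill(diag, float('-inf')).softmax(dim=-1)

        out = (attn @ v).transpose(1, 2).reshape(B, h * w, C)
        out = self.proj(out)

        # Up-sample back to the full grid to restore the spatial detail the
        # pooling discarded, plus a residual from the un-pooled input
        # (one plausible reading of "up-sampling supplements lost information").
        out2d = out.transpose(1, 2).reshape(B, C, h, w)
        out2d = F.interpolate(out2d, size=(H, W), mode='bilinear',
                              align_corners=False)
        return (out2d + x2d).flatten(2).transpose(1, 2)  # (B, N, C)

# Usage on a 16x16 token grid with 64-dimensional tokens:
attn = HourglassAttention(dim=64, num_heads=4, pool_ratio=2)
y = attn(torch.randn(2, 16 * 16, 64), H=16, W=16)  # -> (2, 256, 64)
```

In this reading, the attention cost drops from O(N^2) to O((N / r^2)^2) for pooling ratio r, because queries, keys, and values all live on the pooled grid, which is consistent with the abstract's reported reduction in computation.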