Self-attention Mechanism Adaptive Prompt Learning for CNNs and Transformers
As large-scale pre-trained models for generalized visual data are studied in depth, adapting them to specific downstream tasks remains difficult. Training only a classification head leaves the model heavily dependent on the pre-trained backbone and yields mediocre results; fully fine-tuning the pre-trained model is impractical because of its enormous parameter count; and visual prompt learning methods such as VPT are ineffective when the image datasets are highly diverse, since a single generalized prompt per dataset struggles to shift toward the original pre-training data distribution. To address these challenges, this paper proposes a new prompt learning method: task-specific self-attention prompt blocks are added in the input space, and a very small number of parameters is introduced under enhanced inter-channel competition to adaptively adjust the pre-trained model, ultimately applying generalized visual feature information to a specific visual task. Experiments using representative CNN and Transformer networks as base models on selected datasets such as CIFAR and Tiny ImageNet show that the proposed method improves average accuracy by 0.55% and 1.86%, respectively, compared with common fine-tuning methods.
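To make the core idea concrete, the following is a minimal NumPy sketch of a prompt block of the kind the abstract describes: a small self-attention module with its own learnable queries generates prompt tokens from the input embeddings and prepends them, while the pre-trained backbone would stay frozen. All names (`SelfAttentionPromptBlock`, the channel-wise softmax used to mimic "inter-channel competition") are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SelfAttentionPromptBlock:
    """Hypothetical sketch: generate task-specific prompt tokens
    from the input-space embeddings via a lightweight single-head
    attention, then prepend them to the token sequence."""

    def __init__(self, dim, n_prompts, seed=0):
        rng = np.random.default_rng(seed)
        scale = dim ** -0.5
        # the only trainable parameters: prompt queries + two projections
        self.queries = rng.normal(0.0, scale, (n_prompts, dim))
        self.w_k = rng.normal(0.0, scale, (dim, dim))
        self.w_v = rng.normal(0.0, scale, (dim, dim))

    def __call__(self, tokens):
        # tokens: (seq_len, dim) patch embeddings from the frozen backbone's stem
        k = tokens @ self.w_k
        v = tokens @ self.w_v
        attn = softmax(self.queries @ k.T / np.sqrt(k.shape[-1]))
        prompts = attn @ v                          # (n_prompts, dim)
        # channel-wise softmax gate: an assumed stand-in for the paper's
        # "enhanced inter-channel competition"
        prompts = prompts * softmax(prompts, axis=-1)
        return np.concatenate([prompts, tokens], axis=0)

tokens = np.random.default_rng(1).normal(size=(196, 64))  # e.g. 14x14 patches
block = SelfAttentionPromptBlock(dim=64, n_prompts=8)
out = block(tokens)
print(out.shape)  # (204, 64): 8 prompt tokens prepended to 196 patch tokens
```

Only the prompt block's parameters would be updated during downstream training, which is what keeps the adapted parameter count very small relative to full fine-tuning.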
fine-tuning of models; data diversity; prompt learning; self-attention mechanism prompt blocks; adaptive tuning