Self-attention Mechanism Adaptive Prompt Learning for CNNs and Transformers
As large-scale pre-trained models for generalized visual data are studied in depth, adapting them to specific downstream tasks remains difficult. Training only a classification head leaves the model heavily dependent on the pre-trained backbone and yields mediocre results; fully fine-tuning the pre-trained model is impractical because of its enormous parameter count; and visual prompt learning methods such as VPT are ineffective when the image datasets are highly diverse, since a single generalized prompt per dataset struggles to shift toward the original pre-training data distribution. To address these challenges, this paper proposes a new prompt learning method: task-specific self-attention prompt blocks are added in the input space, and a very small number of parameters is introduced under enhanced inter-channel competition to adaptively adjust the pre-trained model, ultimately applying generalized visual feature information to a specific visual task. Experiments using representative CNN and Transformer networks as base models on selected datasets such as CIFAR and Tiny ImageNet show that the proposed method improves average accuracy by 0.55% and 1.86%, respectively, compared with common fine-tuning methods.
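To make the core idea concrete, the following is a minimal NumPy sketch of a prompt block of the kind the abstract describes: a small self-attention module with its own learnable queries generates prompt tokens from the input embeddings and prepends them, while the pre-trained backbone would stay frozen. All names (`SelfAttentionPromptBlock`, the channel-wise softmax used to mimic "inter-channel competition") are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SelfAttentionPromptBlock:
    """Hypothetical sketch: generate task-specific prompt tokens
    from the input-space embeddings via a lightweight single-head
    attention, then prepend them to the token sequence."""

    def __init__(self, dim, n_prompts, seed=0):
        rng = np.random.default_rng(seed)
        scale = dim ** -0.5
        # the only trainable parameters: prompt queries + two projections
        self.queries = rng.normal(0.0, scale, (n_prompts, dim))
        self.w_k = rng.normal(0.0, scale, (dim, dim))
        self.w_v = rng.normal(0.0, scale, (dim, dim))

    def __call__(self, tokens):
        # tokens: (seq_len, dim) patch embeddings from the frozen backbone's stem
        k = tokens @ self.w_k
        v = tokens @ self.w_v
        attn = softmax(self.queries @ k.T / np.sqrt(k.shape[-1]))
        prompts = attn @ v                          # (n_prompts, dim)
        # channel-wise softmax gate: an assumed stand-in for the paper's
        # "enhanced inter-channel competition"
        prompts = prompts * softmax(prompts, axis=-1)
        return np.concatenate([prompts, tokens], axis=0)

tokens = np.random.default_rng(1).normal(size=(196, 64))  # e.g. 14x14 patches
block = SelfAttentionPromptBlock(dim=64, n_prompts=8)
out = block(tokens)
print(out.shape)  # (204, 64): 8 prompt tokens prepended to 196 patch tokens
```

Only the prompt block's parameters would be updated during downstream training, which is what keeps the adapted parameter count very small relative to full fine-tuning.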
fine-tuning of models; data diversity; prompt learning; self-attention mechanism prompt blocks; adaptive tuning