
Optimization of Soft Prompt Vectors Based on LSTM and Position Enhancement

Soft prompt learning is an emerging method for applying pretrained language models. However, the vectors it produces may lack sequential structure, which weakens the model's ability to encode information at specific positions and degrades performance. To address this, we examine the sequential structure of soft prompt vectors and its effect on model performance, and find that soft prompt vectors are order-sensitive across language model types, model sizes, downstream task types, and prompt lengths. In response, we propose a soft prompt sorting network based on LSTM and position enhancement. First, an improved LSTM network performs soft prompt sorting and tuning; a prompt selection gate is added at each gate to capture sequence information and generate well-ordered soft prompt vectors. Second, a position enhancement module is proposed for the sorting process, combining absolute and relative position information to optimize the order. Tests on the GLUE dataset show that the proposed method improves performance by an average of 3.1% over the baseline.
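The abstract does not specify how the position enhancement module combines the two kinds of position information. As a rough illustration only, the sketch below shows one common way to represent each kind for a prompt of a given length: a sinusoidal table for absolute positions and a signed offset matrix for relative positions. The function names, the sinusoidal form, and the toy prompt are assumptions made for this example and are not taken from the paper.

```python
import math


def absolute_encoding(length, dim):
    """Sinusoidal absolute position table (an assumed, generic choice)."""
    table = []
    for pos in range(length):
        row = []
        for i in range(dim):
            # Each pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def relative_offsets(length):
    """Signed distance j - i between every pair of prompt positions."""
    return [[j - i for j in range(length)] for i in range(length)]


def add_absolute(prompt, encoding):
    """Add the absolute position table to the prompt vectors element-wise."""
    return [[v + e for v, e in zip(vec, enc)]
            for vec, enc in zip(prompt, encoding)]


# Hypothetical soft prompt: 3 identical vectors of dimension 4.
prompt = [[0.0] * 4 for _ in range(3)]
enhanced = add_absolute(prompt, absolute_encoding(3, 4))
offsets = relative_offsets(3)
```

In this toy setup the three prompt vectors start out identical, so nothing distinguishes one ordering from another; after the absolute table is added, each position carries a distinct vector, and the offset matrix records how far apart any two positions are.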

soft prompt vector; sequential structure; order sensitivity; position encoding; long short-term memory

Liu Zhendong, Cheng Chunling, Liu Qian


School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China


National Natural Science Foundation of China (61972201)

2024

Computer Technology and Development
Shaanxi Computer Society


CSTPCD
Impact factor: 0.621
ISSN: 1673-629X
Year, volume (issue): 2024, 34(10)