Research on an Improved Transformer Method Based on Differentiated Learning

Neural machine translation models, with the Transformer as the leading example, are a current research focus in machine translation. Multi-head attention is a key component of the Transformer: it strengthens the model's ability to extract different kinds of information and improves its generalization. However, some of the self-attention heads in the multi-head attention mechanism become ineffective. To address this problem, this paper proposes an improved Transformer method based on differentiated learning, which applies a novel differentiated learning procedure during training to raise the effectiveness of the self-attention heads. Experimental results on several machine translation tasks show that, compared with the original Transformer, the Transformer improved with differentiated learning achieves higher BLEU scores.
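For readers unfamiliar with the mechanism the abstract builds on, below is a minimal pure-Python sketch of multi-head scaled dot-product attention, together with a disagreement-style penalty illustrating how training could push heads toward differentiated behavior. The penalty is purely an assumption for illustration — the abstract does not disclose the paper's actual differentiated learning method — and learned projections, masking, and batching are omitted.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention for one head; Q, K, V are lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        out.append([sum(wi * vi[j] for wi, vi in zip(w, V)) for j in range(len(V[0]))])
    return out

def multi_head(Q, K, V, n_heads=2):
    """Split the feature dimension into n_heads sub-spaces, attend in each,
    and concatenate the per-head outputs (learned projections omitted)."""
    d = len(Q[0]) // n_heads
    heads = []
    for h in range(n_heads):
        sl = slice(h * d, (h + 1) * d)
        heads.append(attention([q[sl] for q in Q],
                               [k[sl] for k in K],
                               [v[sl] for v in V]))
    concat = [sum((head[i] for head in heads), []) for i in range(len(Q))]
    return heads, concat

def head_disagreement(head_outputs):
    """Mean pairwise cosine similarity of flattened head outputs. Adding this
    value to the training loss penalizes heads that behave identically -- a
    HYPOTHETICAL stand-in for the paper's differentiated learning method."""
    flat = [[x for row in h for x in row] for h in head_outputs]
    def cos(u, v):
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return sum(a * b for a, b in zip(u, v)) / (nu * nv)
    n = len(flat)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cos(flat[i], flat[j]) for i, j in pairs) / len(pairs)

# Tiny self-attention demo: two tokens, feature size 4, two heads of size 2.
x = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
heads, y = multi_head(x, x, x, n_heads=2)
```

The sketch shows the failure mode the abstract targets: nothing in plain multi-head attention prevents two heads from converging to the same behavior, which is what a diversity term like `head_disagreement` would discourage during training.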

machine translation; Transformer; multi-head attention; differentiated learning

Ding Yi (丁义)


Dezhou University, Dezhou, Shandong 253023, China


2024

Software (《软件》)
Chinese Institute of Electronics; Tianjin Electronics Society

Impact factor: 1.51
ISSN: 1003-6970
Year, Volume (Issue): 2024, 45(7)