Research on an Improved Transformer Method Based on Differentiated Learning
The neural machine translation model represented by the Transformer is a current research hotspot in the field of machine translation. The multi-head attention mechanism is an important component of the Transformer; its function is to enhance the model's ability to extract different kinds of information and to improve the model's generalization. However, some of the self-attention heads in the multi-head attention mechanism become ineffective. To address this problem, this paper proposes a Transformer improvement method based on differentiated learning, which improves the effectiveness of the self-attention heads by applying a novel differentiated learning method during Transformer training. Experimental results on several machine translation tasks show that, compared with the original Transformer, the improved Transformer based on the differentiated learning method achieves higher BLEU scores.
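To make the role of the multi-head attention mechanism referenced above concrete, the following is a minimal sketch of multi-head self-attention in plain NumPy. It is an illustration only, not the paper's method: the dimensions, random projection matrices, and function names are assumptions chosen for clarity, and a real Transformer would learn the per-head projections during training.

```python
# Minimal multi-head self-attention sketch (illustrative, not the paper's method).
# Each head projects the input into its own subspace, so different heads can
# attend to different information; the head outputs are concatenated at the end.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, rng):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        # Each head gets its own query/key/value projections (random here;
        # learned parameters in an actual model).
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
                      for _ in range(3))
        Q, K, V = x @ Wq, x @ Wk, x @ Wv
        scores = softmax(Q @ K.T / np.sqrt(d_head))  # scaled dot-product attention
        heads.append(scores @ V)
    return np.concatenate(heads, axis=-1)            # concatenate head outputs

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 64))                     # 5 tokens, d_model = 64 (assumed)
out = multi_head_self_attention(x, num_heads=8, rng=rng)
print(out.shape)                                     # (5, 64)
```

In this sketch, a head whose projections collapse to near-useless outputs would contribute little to the concatenated result; the differentiated learning method proposed in the paper aims to keep the heads effective during training.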