
Robust Machine Translation Method Based on Amplifying Hidden Representation Differences

Contrastive learning is the mainstream approach in current research on robust machine translation. It typically injects noise at the input token layer or the embedding layer to enlarge the sample pool and enrich sample styles. However, after noisy samples pass through the encoder, their difference from clean samples in the hidden representations is weakened, which limits the performance of contrastive learning. This paper adds Gaussian noise directly to the encoder hidden representations, preserving the difference between noisy and clean samples at that level. On the decoder side, the noisy-sample loss and a KL-divergence loss are trained jointly; minimizing the KL divergence pushes the target probability distribution of the noisy samples toward that of the clean samples. On the IWSLT2014 De-En task, compared with the strong baselines R-Drop and SimCut, the method gains 0.9 BLEU on the clean test set and 0.82 BLEU and 0.63 BLEU, respectively, on the noisy test set, significantly improving translation quality and robustness to noisy input. Applied to speech translation (Speech-to-Text), it gains 1.3 BLEU and 3.0 BLEU over the strong baseline ConST on the MuST-C test set and the CoVoST 2 multi-speaker test set, respectively, and 1.8 BLEU and 1.5 BLEU over a multi-task learning (MTL) baseline, again significantly improving translation quality.
Robust Machine Translation Based on Amplifying Hidden Representation Differences
Contrastive learning, as the mainstream method in robust machine translation, usually adds noise to the input token layer or the embedding layer to expand the sample pool and enrich sample styles. However, after being processed by the encoder, the difference between noisy samples and clean samples in the hidden representations is weakened, which limits the performance of contrastive learning. In this paper, we maintain the dissimilarity between noisy samples and clean samples by directly adding Gaussian noise to the encoder's hidden representations. On the decoder side, the noisy-sample loss and a KL-divergence loss are trained jointly, so that the target probability distribution of the noisy samples approaches that of the clean samples. On the IWSLT2014 De-En task, the proposed method achieves a 0.9 BLEU improvement on the clean test set over R-Drop and SimCut, and improvements of 0.82 BLEU and 0.63 BLEU, respectively, on the noisy test set. Applied to the speech-to-text (ST) task, the proposed method brings a 1.3 BLEU improvement on the MuST-C test set and a 3.0 BLEU improvement on the CoVoST 2 multi-speaker test set over the ConST system, and improvements of 1.8 BLEU and 1.5 BLEU, respectively, over a multi-task learning (MTL) baseline.
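The abstract describes two ingredients: Gaussian noise added directly to the encoder hidden representations, and a joint objective combining the noisy-sample cross-entropy loss with a KL term that pulls the noisy output distribution toward the clean one. A minimal PyTorch sketch of that objective is shown below; `decode_fn`, `sigma`, and `kl_weight` are hypothetical names, and the actual paper's loss weighting and noise schedule may differ.

```python
import torch
import torch.nn.functional as F


def noisy_kl_loss(encoder_hidden, decode_fn, targets, sigma=0.1, kl_weight=1.0):
    """Sketch of the joint objective described in the abstract.

    encoder_hidden: (batch, src_len, dim) encoder hidden representations
    decode_fn:      maps hidden states to decoder logits (batch, tgt_len, vocab)
                    -- a stand-in for the full decoder, assumed here
    targets:        (batch, tgt_len) gold target token ids
    """
    # Clean branch: decode from the original hidden representations.
    clean_logits = decode_fn(encoder_hidden)

    # Noisy branch: add Gaussian noise directly to the hidden representations,
    # keeping noisy and clean samples distinct at the hidden-state level.
    noisy_hidden = encoder_hidden + sigma * torch.randn_like(encoder_hidden)
    noisy_logits = decode_fn(noisy_hidden)

    # Cross-entropy on the noisy branch (the "noisy-sample loss").
    ce = F.cross_entropy(noisy_logits.transpose(1, 2), targets)

    # KL divergence: push the noisy output distribution toward the clean one
    # (clean branch detached so it acts as the reference distribution).
    kl = F.kl_div(
        F.log_softmax(noisy_logits, dim=-1),
        F.log_softmax(clean_logits, dim=-1).detach(),
        log_target=True,
        reduction="batchmean",
    )
    return ce + kl_weight * kl
```

In a real system `decode_fn` would be the Transformer decoder conditioned on the target prefix; a simple linear projection suffices to exercise the loss.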

neural machine translation; robust machine translation; contrastive learning; speech translation

Xue Zhengshan, Shi Tingxun, Xiong Deyi, Wang Hao


College of Intelligence and Computing, Tianjin University, Tianjin 300350, China

OPPO Research Institute, Beijing OPPO Communications Co., Ltd., Beijing 100026, China


2024

Journal of Chinese Information Processing
Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences


Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)
Impact factor: 0.8
ISSN:1003-0077
Year, Volume (Issue): 2024, 38(12)