Domain-Adversarial Transfer Learning for Low-Resource Machine Translation (基于域对抗迁移学习的低资源机器翻译)

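Chinese abstract (translated): When the out-of-domain and in-domain data represent different languages, the differences between the languages make it hard for out-of-domain knowledge to adapt to the in-domain. A domain-adversarial transfer learning method is therefore proposed to improve the machine translation model. Using adversarial learning, a domain discriminator is added to make predictions on the out-of-domain and in-domain semantic features, and the encoder is optimized by minimizing the predicted values for the out-of-domain and in-domain semantic features. When the predicted values for the semantic features of the two domains are close, the model has learned a mapping function that maps in-domain data to the out-of-domain. In experiments, the method shows a certain generalization ability on translation tasks such as Mongolian-Chinese and Uyghur-Chinese.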

English title: Domain-adversarial Transfer Learning for Low-resource Neural Machine Translation

English abstract: When the out-of-domain and in-domain data are in different languages, the differences between the languages make it difficult to adapt out-of-domain knowledge to the in-domain. This paper proposes a domain-adversarial transfer learning method to improve the neural machine translation model. Under the adversarial learning framework, a domain discriminator predicts whether semantic features come from the out-of-domain or the in-domain, and the encoder is optimized by minimizing the prediction values of the semantic features. When the predicted values of the semantic features in the two domains are similar, the model has learned a mapping function that can map in-domain data into the out-of-domain. Experiments show that the method has a certain generalization ability on Mongolian-Chinese and Uyghur-Chinese translation tasks.
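The adversarial setup summarized in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the feature vectors are made up, a learned shift vector (`encode_in_domain`) is a hypothetical stand-in for the in-domain encoder, the discriminator is a plain logistic regression, and the encoder objective `-log D(enc(x))` is an assumed GAN-style choice, since the abstract does not state the exact loss.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def discriminate(w, b, h):
    """Probability that feature vector h is out-of-domain (label 1)."""
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)

def encode_in_domain(shift, x):
    """Hypothetical stand-in for the in-domain encoder: a learned shift."""
    return [xi + si for xi, si in zip(x, shift)]

def prediction_gap(w, b, shift, out_feats, in_feats):
    """Absolute difference between the mean predictions for the two domains."""
    p_out = sum(discriminate(w, b, h) for h in out_feats) / len(out_feats)
    p_in = sum(discriminate(w, b, encode_in_domain(shift, x))
               for x in in_feats) / len(in_feats)
    return abs(p_out - p_in)

def discriminator_step(w, b, shift, out_feats, in_feats, lr=0.1):
    """One gradient step on binary cross-entropy (1 = out-of-domain, 0 = in-domain)."""
    samples = [(h, 1.0) for h in out_feats]
    samples += [(encode_in_domain(shift, x), 0.0) for x in in_feats]
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for h, y in samples:
        err = discriminate(w, b, h) - y  # derivative of BCE w.r.t. the logit
        grad_w = [g + err * hi for g, hi in zip(grad_w, h)]
        grad_b += err
    n = len(samples)
    new_w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
    return new_w, b - lr * grad_b / n

def encoder_step(w, b, shift, in_feats, lr=0.1):
    """One gradient step on -log D(enc(x)) with the discriminator held fixed."""
    grad = [0.0] * len(shift)
    for x in in_feats:
        p = discriminate(w, b, encode_in_domain(shift, x))
        # d(-log p)/d(shift) = -(1 - p) * w
        grad = [g - (1.0 - p) * wi for g, wi in zip(grad, w)]
    return [s - lr * g / len(in_feats) for s, g in zip(shift, grad)]

# Made-up 2-D "semantic features" for the two domains.
out_feats = [[2.0, 1.0], [1.5, 2.0], [2.5, 1.5]]
in_feats = [[-2.0, -1.0], [-1.5, -2.0], [-2.5, -1.5]]

w, b, shift = [0.0, 0.0], 0.0, [0.0, 0.0]
for _ in range(20):
    # Alternate: the discriminator learns to separate, the encoder learns to confuse.
    w, b = discriminator_step(w, b, shift, out_feats, in_feats)
    shift = encoder_step(w, b, shift, in_feats)

print("prediction gap:", prediction_gap(w, b, shift, out_feats, in_feats))
```

With the discriminator held fixed, each encoder step moves the in-domain features toward the region the discriminator labels out-of-domain, which narrows the gap between the two domains' predicted values. Alternating updates of this kind can oscillate, so the sketch makes no claim about convergence.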

English keywords:

domain adaption; machine translation; multi-language; adversarial learning

Authors:

Chang Xin (常鑫), Hou Hongxu (侯宏旭), Wunier (乌尼尔), Jia Xiaoning (贾晓宁), Li Haoran (李浩然)

Affiliation:

College of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia 010021

Keywords:

adversarial; machine translation; multilingual; adversarial learning

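Funding:

Inner Mongolia Autonomous Region Science and Technology Achievement Transformation Special Project; Inner Mongolia Natural Science Foundation; Inner Mongolia Natural Science Foundation

Project numbers:

2019CG028; 2018MS06005; 14020202-0114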

Publication year:

2024

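Journal of Chinese Information Processing (中文信息学报)

Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences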

CSTPCD; CHSSCD; Peking University Core Journals (北大核心)

Impact factor: 0.8

ISSN: 1003-0077

Year, volume (issue): 2024, 38(6)