Chinese-Vietnamese neural machine translation model incorporating entity translation

In low-resource Chinese-Vietnamese translation, accurately translating the entity words in a sentence is a major difficulty. Entity words occur infrequently in the training corpus, so the model cannot learn the mapping between bilingual entity words. To address these problems, a Chinese-Vietnamese neural machine translation model incorporating entity translation is constructed. First, translations of the entity words in the source sentence are obtained in advance from a Chinese-Vietnamese bilingual entity dictionary. Second, these translations are concatenated to the end of the source sentence as the model input, and "constraint prompt information" is introduced at the encoder to enhance the representation. Finally, a pointer network mechanism is integrated into the decoder so that the model can copy words from the source-side sentence into its output. Experimental results show that, compared with the cross-lingual model XLM-R (Cross-lingual Language Model-RoBERTa), the model improves the BiLingual Evaluation Understudy (BLEU) score by 1.37 in the Chinese-to-Vietnamese direction and by 0.21 in the Vietnamese-to-Chinese direction. Compared with Transformer, it shortens training time by 3.19% and 3.50% in the Chinese-to-Vietnamese and Vietnamese-to-Chinese directions respectively, effectively improving the overall performance of entity word translation in sentences.
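
The first two steps and the copy mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the separator token, the form of the "constraint prompt information", or the exact pointer formulation. The `<sep>` marker, the function names `augment_source` and `pointer_mix`, and the use of the standard pointer-generator mixture are all assumptions made here for illustration.

```python
from typing import Dict, List

# Hypothetical separator between the source sentence and the appended
# entity translations; the abstract does not name the actual marker.
SEP = "<sep>"


def augment_source(tokens: List[str], entity_dict: Dict[str, str],
                   sep: str = SEP) -> List[str]:
    """Append dictionary translations of entity tokens to the end of the source."""
    out = list(tokens)
    for tok in tokens:
        if tok in entity_dict:
            out.append(sep)
            out.extend(entity_dict[tok].split())
    return out


def pointer_mix(p_gen: float, vocab_dist: Dict[str, float],
                attention: List[float], src_tokens: List[str]) -> Dict[str, float]:
    """Mix generation and copying (assumed pointer-generator form):
    P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on source positions holding w.
    """
    final = {w: p_gen * p for w, p in vocab_dist.items()}
    for tok, weight in zip(src_tokens, attention):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * weight
    return final


# Illustrative data only: a one-entry dictionary and hand-picked probabilities.
entity_dict = {"河内": "Hà Nội"}
augmented = augment_source(["我", "去", "河内"], entity_dict)

# Attention concentrated on the appended translation lets the decoder copy it.
mixed = pointer_mix(
    p_gen=0.5,
    vocab_dist={"tôi": 0.6, "đi": 0.4},
    attention=[0.0, 0.0, 0.0, 0.0, 0.5, 0.5],
    src_tokens=augmented,
)
```

Because the dictionary translation is already present in the input, the copy term can assign probability to a rare entity translation even when the generation distribution gives it none, which is the motivation the abstract states for combining concatenation with a pointer network.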

Keywords: Chinese-Vietnamese neural machine translation; entity translation; bilingual dictionary; pointer network; low-resource

GAO Shengxiang, HOU Zhe, YU Zhengtao, LAI Hua

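Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504

Yunnan Key Laboratory of Artificial Intelligence (Kunming University of Science and Technology), Kunming 650504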

2025

Journal of Computer Applications
Chengdu Institute of Computer Applications, Chinese Academy of Sciences

Peking University Core Journal
Impact factor: 0.892
ISSN: 1001-9081
Year, volume (issue): 2025, 45(1)