Chinese-Urdu neural machine translation interacting POS sequence prediction in Urdu language

Many research teams have studied machine translation for the low-resource languages of South and Southeast Asia in depth. Urdu, however, despite being the official language of Pakistan, has scarce data resources and differs greatly from Chinese, so research targeted at Chinese-Urdu machine translation remains rare. To address this, the paper proposes a Transformer-based Chinese-Urdu neural machine translation model that incorporates Urdu part-of-speech (POS) sequence prediction. First, a Transformer predicts the POS sequence of the target-language (Urdu) sentence. Then the predictions of the translation model and of the POS sequence model are combined into a joint prediction of the final translation, which integrates linguistic knowledge into the translation model. Experiments on an existing small-scale Chinese-Urdu dataset show that the proposed method improves the BLEU score by 0.13 over the baseline model, which the authors describe as a fairly clear improvement.
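The abstract does not say how the two predictions are combined, so the following is only a minimal sketch of one plausible reading: a log-linear rescoring of each decoding step. Everything in it is an assumption made for illustration, not the authors' implementation: the function name `joint_step`, the `word_to_pos` lookup (a hypothetical lexicon mapping each vocabulary item to a single POS tag), and the interpolation `weight`.

```python
import math


def log_softmax(scores):
    """Convert raw scores into normalised log-probabilities (numerically stable)."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def joint_step(word_logits, pos_logits, word_to_pos, weight=0.5):
    """Rescore one decoding step with a POS prediction (illustrative assumption).

    word_logits: translation-model scores over the target vocabulary
    pos_logits:  POS-model scores over the tag set for the same position
    word_to_pos: word_to_pos[i] is the tag index of vocabulary item i
                 (hypothetical one-tag-per-word lexicon)
    weight:      assumed hyperparameter; 0 disables the POS signal
    """
    word_lp = log_softmax(word_logits)
    pos_lp = log_softmax(pos_logits)
    # Add the weighted log-probability of each word's tag to the word's own score.
    joint = [lp + weight * pos_lp[word_to_pos[i]] for i, lp in enumerate(word_lp)]
    # Renormalise so the result is again a distribution over the vocabulary.
    return log_softmax(joint)
```

With `weight=0` the function returns the translation model's own distribution, so the POS signal acts purely as an additive bias towards words whose tag the POS model expects at that position.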

Keywords: Transformer; neural machine translation; Urdu; part-of-speech sequence

Chen Huanhuan (陈欢欢), Wang Jian (王剑), Muhammad Naeem Ul Hassan

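Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China

Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500, Yunnan, China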

Funding: National Natural Science Foundation of China (62166022, 62266028)

2024

Computer Engineering & Science
College of Computer, National University of Defense Technology

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Year, volume (issue): 2024, 46(3)