Looking for traces of machine translation: A study of the syntactic features of neural machine-translated texts

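Neural machine translation is maturing, yet its output still carries "traces of machine translation", such as translations that are hard to understand or language that is not idiomatic. Existing research has not made clear how these traces manifest linguistically, and little is known in particular about the deep syntactic features of machine-translated texts. This study builds its own dependency treebanks of human-translated and neural machine-translated texts and uses measures such as dependency distance and dependency direction to compare the syntactic features of human and machine translations in the English-to-Chinese direction. The study finds that neural machine translation systems do not adequately control the syntactic complexity of long sentences, as shown by unidiomatic translations of passive structures and prepositional phrases; these structures and phrases may make the translation harder to understand. The use of dependency relations such as adverbial, right adjunct, and preposition-object in the machine-translated texts retains the English preference for nominal structures, which also leads to differences in word order distribution between the human and machine translations. These "traces of machine translation", captured at the syntactic level, offer a useful reference for translation quality assessment and post-editing.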
Uncovering machine translationese: On syntactic properties of neural machine-translated texts
Despite advances in neural machine translation (NMT), the persistence of "machine translationese," characterised by unsatisfactory intelligibility and idiomaticity, remains a challenge. Existing studies have not clarified what "machine translationese" is, and little is known about its deep syntactic properties. Based on self-built dependency treebanks consisting of machine- and human-translated texts in the English-to-Chinese direction, we compared the syntactic properties of these texts in terms of dependency distance and dependency direction. The findings indicate that NMT is significantly deficient in controlling the syntactic complexity of long sentences, as evidenced by the improper use of passive structures and prepositional phrases, both of which may contribute to unintelligibility. Additionally, a preference for nominal structures, which characterise English, is evident in machine translation through the use of adverbial, right adjunct, and prepositional-object relations. This leads to differences in the word order distribution between the human- and machine-translated texts. These examples of machine translationese, captured at the syntactic level, shed light on machine translation quality assessment and post-editing.
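The two measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the study's exact definitions, so the code below assumes a commonly used formulation: dependency distance is the absolute difference between the linear positions of a dependent and its head, the root is excluded, and a dependency counts as head-final when the head follows its dependent. The function name `dependency_stats` and the input format (one 1-based head index per token, 0 for the root) are hypothetical choices made for this illustration, not taken from the paper.

```python
from typing import List, Tuple


def dependency_stats(heads: List[int]) -> Tuple[float, float]:
    """Return (mean dependency distance, share of head-final dependencies).

    heads[i] is the 1-based position of the head of token i + 1;
    0 marks the root, which is excluded from both measures (an assumption).
    """
    distances = []
    head_final = 0
    for position, head in enumerate(heads, start=1):
        if head == 0:
            continue  # the root has no head, so it contributes no dependency
        distances.append(abs(head - position))
        if head > position:
            head_final += 1  # the head comes after its dependent
    if not distances:
        return 0.0, 0.0
    mean_distance = sum(distances) / len(distances)
    return mean_distance, head_final / len(distances)


# Hypothetical three-token sentence whose second token is the root.
mdd, head_final_share = dependency_stats([2, 0, 2])
print(mdd, head_final_share)
```

Under these assumptions, comparing the averages of both values over a human-translated and a machine-translated treebank would give a rough picture of the kind of contrast the abstract describes.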

Shen Mengfei (沈梦菲), Huang Wei (黄伟)

Beijing Language and Culture University

machine translation, syntactic features, dependency grammar, translation quality

2024

Foreign Language Teaching and Research (外语教学与研究)
Beijing Foreign Studies University

CSTPCD, CSSCI, CHSSCD, PKU Core (北大核心)
Impact factor: 3.149
ISSN: 1000-0429
Year, volume (issue): 2024, 56(3)