Journal of Japanese Language Study and Research (日语学习与研究), 2024, Issue 6: 63-73.

A Study on the Linguistic Characteristics of Chinese-Japanese Machine Translation from the Perspective of Corpus Stylistics: A Case Study of the Japanese Translation of The True Story of Ah Q

Mao Wenwei¹, Zhu Haiying¹

Author information

  • 1. School of Japanese Culture and Economics, Shanghai International Studies University

Abstract

Taking the human translation of The True Story of Ah Q as a reference, this paper applies corpus stylistics methods to Chinese-to-Japanese machine-translated texts and characterizes them in terms of vocabulary composition, style, wording and sentence construction, and inter-sentence cohesion. The lexical richness analysis finds that GPT-3.5 uses the most monotonous vocabulary and differs significantly from the human translation, whereas the Google and GPT-4 translations do not differ noticeably from it. Machine translation does not, however, supply supplementary translation on its own initiative, which hinders accurate conveyance of the rich social, cultural, and customary information embedded in the source text. The word-origin analysis finds that the machine translations make greater use of Sino-Japanese words and loanwords, so their expression is closer to written language and less vivid and natural than the human translation. Cluster analysis shows that machine and human translations differ significantly in stylistic features. The machine translations use more verbs and fewer modifiers and have low MVR (Modifier-Verb Ratio) values; their expression is less precise, their tone is flatter and less varied, and they carry less emotional color, which is particularly detrimental when translating works such as novels and plays that must convey subtle meaning and emotion. As for inter-sentence cohesion, the conjunctions used by both Google and ChatGPT are less varied and less accurate than those of the human translators, and they show a stronger tendency toward normalization, a translation universal. As a result, some of the translated text fails to restore the logical relationships between adjacent sentences in the source text, which impairs the fidelity and readability of the translation. These are the issues that post-editing needs to focus on.
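The abstract names two quantitative measures, lexical richness and MVR, without stating how they are computed. The following is a minimal sketch only, not the authors' procedure. It assumes the definition of MVR commonly used in Japanese quantitative stylistics (modifiers per 100 verbs, where modifiers are adjectives, adjectival nouns, adverbs, and adnominals) and uses the type-token ratio as one simple stand-in for lexical richness. The part-of-speech labels, the function names, and the sample tokens are all hypothetical; a real study would take tags from a morphological analyzer.

```python
from collections import Counter

# Coarse POS labels assumed for illustration only; a real tagger's
# tag set would have to be mapped onto these categories.
MODIFIER_TAGS = {"adjective", "adjectival_noun", "adverb", "adnominal"}
VERB_TAG = "verb"


def mvr(tagged_tokens):
    """Modifiers per 100 verbs for a list of (surface, pos) pairs."""
    counts = Counter(pos for _, pos in tagged_tokens)
    verbs = counts[VERB_TAG]
    if verbs == 0:
        raise ValueError("MVR is undefined for a text with no verbs")
    modifiers = sum(counts[tag] for tag in MODIFIER_TAGS)
    return 100.0 * modifiers / verbs


def type_token_ratio(tagged_tokens):
    """Distinct surface forms divided by total tokens."""
    surfaces = [surface for surface, _ in tagged_tokens]
    if not surfaces:
        raise ValueError("type-token ratio is undefined for an empty text")
    return len(set(surfaces)) / len(surfaces)


# Hypothetical pre-tagged sample: 3 modifiers, 3 verbs, 7 tokens, 6 types.
sample = [
    ("静か", "adjectival_noun"),
    ("に", "particle"),
    ("歩く", "verb"),
    ("とても", "adverb"),
    ("速い", "adjective"),
    ("走る", "verb"),
    ("歩く", "verb"),
]

if __name__ == "__main__":
    print(mvr(sample))
    print(type_token_ratio(sample))
```

Under this definition, a text with many verbs and few modifiers yields a low MVR, which is the pattern the abstract reports for the machine translations. The raw type-token ratio depends on text length, so comparisons across translations of different lengths would need a length-normalized measure.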

Key words

machine translation / corpus stylistics / generative AI / linguistic characteristics / post-editing

Publication year

2024
Journal of Japanese Language Study and Research (日语学习与研究)
University of International Business and Economics

CHSSCD
Impact factor: 0.341
ISSN: 1002-4395