A Study on Machine Translation of Ancient Chinese Books from the Perspective of Temporal Characteristics

China's surviving classics were written in different eras, and both their linguistic style and their content bear the mark of their time. Taking machine translation from ancient Chinese to modern Chinese as its starting point, this article explores the temporal characteristics of classical texts and their influence on the machine translation of ancient Chinese classics, and proposes training translation models separately for different historical periods in order to improve the quality of ancient-text translation. Using A Complete Translation of Twenty-Four Histories (《二十四史全译》) as the research corpus, the study divides the corpus into three periods: remote ancient, middle ancient, and near ancient. From a computational humanities perspective, it applies statistical methods to compare word frequency, parts of speech, and dependency relations across texts from the different periods. After data augmentation, the corpus of each period is used to train several machine translation models, whose translation performance is then compared. The study finds that classical texts differ in their temporal characteristics and that these differences significantly affect machine translation performance; training machine translation models separately for texts from different periods improves the accuracy and fluency of ancient-text translation.

computational humanities; Twenty-Four Histories; historical characteristics of classics; data augmentation; machine translation

Wu Mengcheng (吴梦成), Lin Litao (林立涛), Hu Die (胡蝶), Liu Chang (刘畅), Huang Shuiqing (黄水清), Meng Kai (孟凯), Wang Dongbo (王东波)

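College of Information Management, Nanjing Agricultural University

School of Information Management, Nanjing University

School of Marxism, Nanjing Agricultural University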

2024

Library Tribune (图书馆论坛)
Sun Yat-sen Library of Guangdong Province

CSTPCD, CSSCI, CHSSCD, PKU Core (北大核心)
Impact factor: 1.864
ISSN: 1002-1167
Year, volume (issue): 2024, 44(10)