
A Comparative Analysis of the Performance of Four Mainstream Corpus Alignment Tools

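Using an experimental approach, this study compares four corpus alignment tools on four types of Chinese-English bilingual text: science and technology, tourism, literature, and politics. In terms of basic technical indicators, Matecat Aligner, ABBYY Aligner 2.0, and Tmxmall Aligner have the advantage. In terms of non-basic technical indicators, Matecat Aligner and Déjà Vu X3 Alignment stand out in sentence segmentation accuracy, while ABBYY Aligner 2.0 outperforms the other tools in alignment accuracy; ABBYY Aligner 2.0 and Matecat Aligner also provide an error-correction function. Closer analysis shows that alignment quality varies with text type, and that different corpus alignment tools suit different kinds of text.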
COMPARISON AND EVALUATION OF THE PERFORMANCE OF FOUR MAINSTREAM BILINGUAL CORPUS ALIGNMENT TOOLS
This paper describes an experiment to compare and evaluate the performance of four bilingual corpus alignment tools, using English/Chinese texts on science and technology, tourism, literature, and politics as samples. The results show that Matecat Aligner and Tmxmall Aligner performed better on basic technical indicators such as the size of the aligned files, the types of alignment, and the supported text languages and file formats. On non-basic technical indicators, Matecat Aligner and Déjà Vu X3 Alignment were more prominent in segmentation accuracy, and ABBYY Aligner 2.0 outperformed the other tools in alignment quality. ABBYY Aligner 2.0 and Matecat Aligner also offered an error-correction feature: after an earlier misaligned pair of source and target segments, they correctly align the segments that follow. Closer analysis shows that alignment quality differed across text types, and that different corpus alignment tools were suited to aligning different kinds of text.

bilingual corpus alignment tools; comparison of their performance; evaluation

Wang Qin (王琴), Wang Yuchun (王宇春)


School of Foreign Languages, 山西工商学院, Taiyuan, Shanxi 030036

corpus alignment tools; performance comparison; evaluation

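Higher Education Research Planning Project of the China Association of Higher Education (2023); Teaching Reform and Innovation Project of 山西工商学院 (2023)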

23XJH0410; JG202350

2024

Journal of Nanyang Institute of Technology (南阳理工学院学报)
Nanyang Institute of Technology (南阳理工学院)


CHSSCD
Impact factor: 0.178
ISSN:1674-5132
Year, volume (issue): 2024, 16(1)