A Survey of Machine Transliteration Research (机器音译研究综述)

Machine transliteration is the process of automatically converting text from one language into another on the basis of phonetic similarity. It is a subtask of machine translation that focuses on conveying phonetic information. A transliteration shows how a source word is pronounced in another language, which makes the source language easier to understand for people unfamiliar with it and helps remove language and spelling barriers. Machine transliteration plays an important role in natural language applications such as multilingual text processing, corpus alignment, and information extraction. This paper describes the challenges that machine transliteration currently faces, analyzes and categorizes the main transliteration methods, summarizes the available transliteration datasets, and lists the commonly used evaluation metrics. It closes by discussing open problems in the field and the outlook for future transliteration research. The paper is intended as a quick introductory guide for newcomers to the field and as a reference for other researchers.

transliteration; survey; corpus; evaluation metrics

李卓、王志娟、赵小兵

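School of Information Engineering, Minzu University of China, Beijing 100081; National Language Resources Monitoring and Research Center for Minority Languages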

Chinese National Conference on Computational Linguistics

Nanchang (CN)

The 21st Chinese National Conference on Computational Linguistics

317-332

2022