Machine transliteration, the automatic conversion of text from one language into another on the basis of phonetic similarity, is a subtask of machine translation that focuses on carrying over phonetic information. A transliterated word conveys the pronunciation of the source word in the target language, which makes the text more accessible to readers unfamiliar with the source language and helps remove language and spelling barriers. Machine transliteration plays an important role in natural language applications such as multilingual text processing, corpus alignment, and information extraction. This paper describes the challenges facing current machine transliteration tasks; analyzes, categorizes, and organizes the main transliteration methods; summarizes the available transliteration datasets; and lists the commonly used evaluation metrics for transliteration quality. It also discusses open problems and the outlook for future work on transliteration. The paper is intended as a quick introductory guide for newcomers to the field and as a reference for other researchers.
Keywords: transliteration; survey; corpus; evaluation metrics
Li Zhuo (李卓), Wang Zhijuan (王志娟), Zhao Xiaobing (赵小兵)
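School of Information Engineering, Minzu University of China, Beijing 100081; National Language Resource Monitoring and Research Center for Minority Languages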
Chinese National Conference on Computational Linguistics
Nanchang(CN)
The 21st Chinese National Conference on Computational Linguistics