Recognition and Translation for Chinese Names in Uighur Language

李佳正¹, 刘凯¹, 麦热哈巴·艾力², 吕雅娟¹, 刘群¹, 吐尔根·依布拉音³

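Author affiliations

1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; School of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China
3. School of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China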

Abstract

Name translation for minority languages is still in its infancy. This paper presents a method for recognizing and translating Chinese personal names in Uighur text. Building on traditional dictionary- and rule-based approaches, the method uses Uighur and Chinese language models to recognize the names and translate them. To handle the word-formation and spelling characteristics of names in Uighur, we add a preprocessing module that recognizes noun affixes and supplement the rules that map Uighur letters to Chinese pinyin, which improves both precision and recall. On a test set of 1,000 randomly selected Uighur sentences containing Chinese names, recognition reaches a precision of 75.2% and a recall of 91.5%.
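The three ingredients named in the abstract (noun-affix preprocessing, letter-to-pinyin mapping rules, and language-model scoring) can be illustrated with a minimal Python sketch. It is not the authors' system: the suffix list, the rewrite rules, and the syllable probabilities below are hypothetical toy values chosen for illustration, a unigram syllable table stands in for the Uighur and Chinese language models, and the input is assumed to be Uighur in Latin transliteration.

```python
import math

# Hypothetical noun suffixes in Latin transliteration (illustrative, not the paper's list).
NOUN_SUFFIXES = ("ning", "din", "gha", "ni", "da")

# Hypothetical rewrite rules: Uighur spelling of a syllable onset -> candidate pinyin onsets.
LETTER_TO_PINYIN = {"j": ("zh", "j"), "sh": ("sh", "x"), "ch": ("ch", "q")}

# Toy stand-in for a language model: log-probabilities of pinyin syllables in names.
SYLLABLE_LOGPROB = {
    "zhang": math.log(0.30), "wang": math.log(0.30), "li": math.log(0.20),
    "ming": math.log(0.10), "xi": math.log(0.05), "shi": math.log(0.05),
}

def strip_noun_suffix(token):
    """Return candidate stems: the token itself plus the token minus one suffix."""
    stems = [token]
    for suffix in NOUN_SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix):
            stems.append(token[: -len(suffix)])
    return stems

def pinyin_variants(chunk):
    """Expand one Uighur-spelled chunk into its possible pinyin spellings."""
    variants = {chunk}
    for source, targets in LETTER_TO_PINYIN.items():
        if chunk.startswith(source):
            variants.update(target + chunk[len(source):] for target in targets)
    return variants

def segment(stem):
    """Best (score, syllables) split of stem into known pinyin syllables, or None."""
    if not stem:
        return (0.0, [])
    best = None
    for end in range(1, len(stem) + 1):
        rest = segment(stem[end:])
        if rest is None:
            continue
        for syllable in pinyin_variants(stem[:end]):
            if syllable in SYLLABLE_LOGPROB:
                score = SYLLABLE_LOGPROB[syllable] + rest[0]
                if best is None or score > best[0]:
                    best = (score, [syllable] + rest[1])
    return best

def recognize_name(token):
    """Return the best pinyin reading of a token, or None if it is not name-like."""
    results = [segment(stem) for stem in strip_noun_suffix(token.lower())]
    results = [result for result in results if result is not None]
    if not results:
        return None
    return max(results, key=lambda result: result[0])[1]

if __name__ == "__main__":
    for word in ("jangmingning", "wangni", "kitab"):
        print(word, "->", recognize_name(word))
```

In this sketch a token is accepted as a name only when some stem, with or without a suffix removed, can be split entirely into known pinyin syllables. Trying both the suffixed and unsuffixed forms avoids committing to a strip that destroys a genuine syllable, and the score picks among competing readings.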

Key words

language model/noun affixes/spelling rules/recognition and translation of names

Funding

Key Program of the National Natural Science Foundation of China (60736014)

National Natural Science Foundation of China (60873167)

Publication year

2011

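Journal of Chinese Information Processing (中文信息学报)

Chinese Information Processing Society of China; Institute of Software, Chinese Academy of Sciences

Indexed in: CSTPCD, CSCD, CHSSCD, Peking University Core Journals (北大核心)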

Impact factor: 0.8

ISSN: 1003-0077

Cited by: 13

References: 4
