Lao Entity Recognition Based on Cross-language Learning
Classical named entity recognition relies on supervised machine learning, which is difficult to apply to low-resource languages such as Lao because it depends on annotated data. After analysing the structural features of Chinese and Lao, this paper proposes a named entity recognition method for Lao based on cross-language learning over a large corpus of Chinese-Lao parallel sentences. The method first uses an open-source named entity recognition tool to annotate the Chinese sentences. It then uses cross-language representations and similarity calculation to project the annotations from the Chinese side onto the Lao side. The final named entity recognition model for Lao is trained on character vectors combined with part-of-speech and syllable features. Experiments show that the proposed method reaches an F1 score of 74.29% for Lao named entity recognition.
Lao; named entity recognition; weakly supervised learning; cross-language word vector
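
The annotation projection step described in the abstract can be illustrated with a minimal sketch. It assumes that Chinese and Lao tokens have already been embedded in a shared cross-lingual vector space and that cosine similarity is the similarity measure; the abstract does not specify either detail. The function name `project_labels`, the `threshold` parameter, and the toy vectors are hypothetical and are not taken from the paper.

```python
import numpy as np

def project_labels(src_vecs, src_labels, tgt_vecs, threshold=0.5):
    """Copy entity labels from source tokens to their most similar target tokens.

    Assumption: src_vecs and tgt_vecs live in the same cross-lingual space.
    A target token receives the label of its nearest source token only when
    the cosine similarity reaches `threshold`; otherwise it is tagged "O".
    """
    src = np.asarray(src_vecs, dtype=float)
    tgt = np.asarray(tgt_vecs, dtype=float)
    # Normalise rows so that a dot product equals cosine similarity.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sim = tgt @ src.T            # shape: (n_target, n_source)
    best = sim.argmax(axis=1)    # nearest source token for each target token
    labels = []
    for j, i in enumerate(best):
        labels.append(src_labels[i] if sim[j, i] >= threshold else "O")
    return labels

# Toy data (hypothetical): three annotated source tokens, four target tokens.
src_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
src_labels = ["PER", "O", "LOC"]
tgt_vecs = [[0.9, 0.1], [0.1, 0.9], [1.0, 1.05], [-1.0, -1.0]]

projected = project_labels(src_vecs, src_labels, tgt_vecs)
print(projected)
```

The threshold keeps target tokens with no close source counterpart (the last toy vector) from inheriting a label. A real pipeline would likely also need to handle multi-token entities and one-to-many alignments, which this sketch leaves out.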