Chinese Named Entity Recognition Based on Multi-level Features of Chinese Characters and Local Features of Text
To improve Chinese named entity recognition (NER), this paper introduces additional Chinese character features to compensate for what word vectors lack in character form and pronunciation, together with more prior knowledge to enrich the semantic features. It also designs a local feature extractor so that the model considers both global and local features, improving its robustness and generalization in complex contexts. Experimental results show that the proposed method improves the F1 value by 1.61, 0.37, 0.98 and 0.98 on the Weibo, OntoNotes5.0, Boson and People Daily datasets, respectively, which demonstrates the importance and universality of Chinese character features and shows that local features of text help improve model performance. In addition, the influence of eight different Chinese character coding methods on model performance is explored. The results show that, compared with a single pinyin character, the initials and finals of Chinese characters carry more pronunciation information, and that features such as tone and polyphonic characters also benefit model performance. Finally, the model is tested on a variety of text examples, and the results show the effectiveness of the proposed work.
character features; pinyin features; local features of text; named entity recognition
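As an illustration of the pronunciation features discussed in the abstract, the following minimal sketch splits a tone-numbered pinyin syllable into its initial, final and tone. It is not the encoding used in the paper: the function name `split_pinyin`, the tone-number input format (e.g. "zhong1"), and the choice to treat "y" and "w" as initials are all assumptions made here for illustration.

```python
from typing import Tuple

# Hypothetical helper for illustration only; not the paper's coding method.
# Two-letter initials are listed first so that "zh" is matched before "z".
# Assumption: "y" and "w" are treated as initials, as in some romanization tools.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_pinyin(syllable: str) -> Tuple[str, str, int]:
    """Return (initial, final, tone) for a tone-numbered pinyin syllable.

    A tone of 0 means the syllable carried no tone digit (neutral or unmarked).
    """
    syllable = syllable.strip().lower()
    tone = 0
    if syllable and syllable[-1].isdigit():
        tone = int(syllable[-1])
        syllable = syllable[:-1]
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):], tone
    # Zero-initial syllable such as "an": the whole syllable is the final.
    return "", syllable, tone


if __name__ == "__main__":
    # A polyphonic character has several readings; each one decomposes separately.
    readings = {"行": ["xing2", "hang2"]}
    for char, syllables in readings.items():
        print(char, [split_pinyin(s) for s in syllables])
```

Under these assumptions, a character is described by three separate symbols rather than one opaque pinyin string, which is one plausible way to read the abstract's statement that initials and finals carry more pronunciation information than a single pinyin character.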