TCM Named Entity Recognition Model Combining BERT Model and Lexical Enhancement
Research on named entity recognition (NER) for traditional Chinese medicine (TCM) is scarce. Most existing work targets Chinese medical records and performs poorly on TCM case texts. To address the dense named entities and fuzzy entity boundaries characteristic of TCM cases, this paper proposes LEBERT-BiLSTM-CRF, a TCM named entity recognition method that combines lexical enhancement with a pre-trained model. The method is optimized from the perspective of fusing lexical enhancement with the pre-trained model: vocabulary information is fed into the BERT model for feature learning, which helps delimit word boundaries and distinguish word categories, and thereby improves the accuracy of named entity recognition in TCM cases. Experiments on the TCM case data set constructed in this paper, which covers ten entity types, show that the LEBERT-BiLSTM-CRF model achieves an overall precision, recall, and F1 of 88.69%, 87.4%, and 88.1%, respectively, higher than common named entity recognition models such as BERT-CRF and LEBERT-CRF.
Natural language processing; Chinese medicine case; Vocabulary enhancement; BERT; BiLSTM-CRF
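
The abstract reports precision, recall, and F1 for entity recognition. As a minimal sketch of how such entity-level scores are commonly computed, assuming BIO-style tags and exact-match scoring (the paper does not state its tagging scheme or averaging method here, and the tag names `HERB` and `SYM` below are hypothetical), one might write:

```python
from typing import List, Set, Tuple

# An entity is (type, start index, end index exclusive).
Entity = Tuple[str, int, int]


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect entity spans from a BIO tag sequence (assumed scheme)."""
    entities: Set[Entity] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if not inside:
            if etype is not None:
                entities.add((etype, start, i))
            start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return entities


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1."""
    gold_entities = extract_entities(gold)
    pred_entities = extract_entities(pred)
    true_positives = len(gold_entities & pred_entities)
    precision = true_positives / len(pred_entities) if pred_entities else 0.0
    recall = true_positives / len(gold_entities) if gold_entities else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical tags: the second predicted entity has the wrong right boundary.
gold_tags = ["B-HERB", "I-HERB", "O", "B-SYM", "I-SYM"]
pred_tags = ["B-HERB", "I-HERB", "O", "B-SYM", "O"]
print(precision_recall_f1(gold_tags, pred_tags))
```

Under exact matching, a predicted entity with a wrong boundary counts as both a false positive and a missed entity, which is why fuzzy boundaries weigh heavily on these scores.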