Chinese Medical Named Entity Recognition Based on RBAC Model
Chinese medical named entity recognition aims to extract structured entities from unstructured text. Current mainstream approaches rely on large amounts of training data. To address the shortage of training data for Chinese medical named entity recognition, this article proposes a RoBERTa-BiGRU-Attention-CRF (RBAC) model based on joint word segmentation, together with a novel data augmentation method for named entity recognition based on semantic search. Specifically, the pretrained model and a Bidirectional Gated Recurrent Unit (BiGRU) are first used to extract a deep bidirectional semantic representation of the text, and this representation is then fed to both a word segmentation module and a named entity recognition module. The word segmentation module uses a conditional random field (CRF) to obtain word segmentation information. The named entity recognition module uses a BiGRU and multi-head attention to obtain a mixed semantic representation, which is then passed to a CRF to obtain the tag sequence for named entity recognition. Experimental results on the CCKS2019 Chinese electronic medical record dataset show that the method reaches an F1 score of 90.5% when only a small amount of training data is available, demonstrating its effectiveness.
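The final step described above, in which a CRF turns per-character scores into a tag sequence, can be illustrated with a minimal sketch of Viterbi decoding, the standard inference procedure for a linear-chain CRF. The abstract does not state which decoding procedure is used, so this is an assumption; the function name `viterbi_decode`, the three-tag B/I/O scheme, and all scores below are hypothetical values chosen for illustration and are not taken from the article.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t (hypothetical encoder output)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    n_tags = len(tags)
    # score[j] is the best score of any path that ends in tag j at the current position.
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_tags):
            # Best previous tag i for arriving at tag j.
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return [tags[i] for i in path]


# Illustrative (made-up) scores for a three-character input.
TAGS = ["O", "B", "I"]
TRANSITIONS = [
    [0.0, 0.0, -100.0],  # from O: moving to I is heavily penalised
    [0.0, 0.0, 0.0],     # from B
    [0.0, 0.0, 0.0],     # from I
]
EMISSIONS = [
    [1.0, 0.8, 0.0],
    [0.0, 0.2, 1.0],
    [1.0, 0.0, 0.2],
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))  # ['B', 'I', 'O']
```

Choosing the best tag independently at each position would give O, I, O here, which is not a valid BIO sequence. Because the transition scores penalise moving from O to I, decoding over the whole sequence selects B, I, O instead, which is the usual motivation for placing a CRF on top of the encoder.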