Named entity recognition based on span and category enhancement for Chinese news
In the news domain, named entity recognition is complicated by complex syntactic structures and long entity names. These make entity boundaries hard to determine and cause sequence labeling methods to break off partway when predicting long entities. To address these challenges, a model named SpaCE (span and category enhancement for Chinese news named entity recognition) was proposed. The model was built on the pre-trained bidirectional encoder representations from Transformers (BERT) model and was enhanced with span prediction and category descriptions to improve recognition performance. During the encoding of news text, category descriptions were incorporated to enrich semantic knowledge, and a span-based decoding method was adopted to address interruptions in predicting long entities. Furthermore, word boundary information was introduced through precise labeling, and the entity matching strategy was optimized, effectively reducing the non-entity matches caused by span decoding. Compared to baseline models, SpaCE demonstrated improved performance on three datasets. In addition, SpaCE exhibited strong named entity recognition capabilities on disordered texts, indicating its robustness.
news named entity recognition; BERT; span; category enhancement; word boundary information
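
The abstract describes span-based decoding combined with word boundary information to suppress non-entity matches. The sketch below illustrates that general idea only and is not the SpaCE implementation. It assumes per-character start and end probabilities for a single entity category, a nearest-end matching rule, and a word segmentation supplied by the caller. The function names, the threshold, and the maximum span length are all hypothetical.

```python
from typing import List, Set, Tuple


def word_boundaries(words: List[str]) -> Tuple[Set[int], Set[int]]:
    """Return the character indices where segmented words start and end."""
    starts, ends, pos = set(), set(), 0
    for word in words:
        starts.add(pos)
        pos += len(word)
        ends.add(pos - 1)
    return starts, ends


def decode_spans(
    start_probs: List[float],
    end_probs: List[float],
    words: List[str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
    max_len: int = 30,       # hypothetical maximum entity length in characters
) -> List[Tuple[int, int]]:
    """Pair each predicted start with the nearest predicted end (assumed rule),
    keeping only spans that line up with word boundaries."""
    starts, ends = word_boundaries(words)
    spans = []
    for i, p_start in enumerate(start_probs):
        if p_start < threshold:
            continue
        # Search forward for the nearest end position within max_len characters.
        for j in range(i, min(i + max_len, len(end_probs))):
            if end_probs[j] >= threshold:
                # Discard candidates that begin or end in the middle of a word.
                if i in starts and j in ends:
                    spans.append((i, j))
                break
    return spans


# Toy input with made-up scores for the six characters of "新华社北京电".
words = ["新华社", "北京", "电"]
start_probs = [0.9, 0.1, 0.1, 0.8, 0.6, 0.1]
end_probs = [0.1, 0.1, 0.9, 0.1, 0.8, 0.1]
spans = decode_spans(start_probs, end_probs, words)
```

In this toy input the candidate span starting at character 4 is dropped because it begins in the middle of the word "北京", which is the kind of non-entity match that boundary information is meant to filter out. Because each entity is emitted as a single (start, end) pair, a long entity cannot be broken off partway the way a tag sequence can.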