Chinese named entity recognition based on enhancing lexicon knowledge integration utilizing character context information
Chinese named entity recognition (NER) is a challenging task because the Chinese language lacks explicit delimiters, which leads to the absence of word boundary information. Existing mainstream models address this issue by introducing a lexicon for Chinese NER, which provides word boundary information. However, the word information contained in the lexicon is fused into the character representations according to the matching relation between characters and words, without considering the impact of sentence information on word selection. This results in the introduction of irrelevant words that are unrelated to the sentence semantics, leading the model to perceive word boundary information incorrectly. To reduce the impact of irrelevant words on entity recognition results, this paper proposes a novel Chinese NER method, called ELKI, which integrates lexicon knowledge with character-context representations that capture sentence semantic information, thereby improving the accuracy of word boundary perception. Specifically, a novel relation-aware character-word cross-attention network is designed to mine, from the lexicon, word representations that are related to the sentence semantics. Then, a gated fusion network is constructed to dynamically fuse the lexicon knowledge representation of each character with its context representation. The proposed model is evaluated on three benchmark datasets, Resume, MSRA and OntoNotes, and it outperforms the baseline models.
Chinese named entity recognition; Cross-attention network; Gated fusion network; Information extraction
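
To illustrate the gated fusion step described in the abstract, the following is a minimal sketch rather than the ELKI implementation. The abstract does not give the gate formula, so the sketch assumes a common form: a sigmoid gate computed from the concatenated context and lexicon vectors, followed by an element-wise convex combination. The function name `gated_fusion`, the parameters `weight` and `bias`, and the random inputs are all hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(context, lexicon, weight, bias):
    """Fuse one character's context vector with its lexicon vector.

    context, lexicon: lists of d floats
    weight: d rows of 2*d floats (hypothetical gate parameters)
    bias: list of d floats (hypothetical gate parameters)
    """
    joint = context + lexicon  # list concatenation, length 2*d
    fused = []
    for j in range(len(context)):
        # Assumed gate form: sigmoid of a linear map of [context; lexicon].
        score = sum(w * x for w, x in zip(weight[j], joint)) + bias[j]
        gate = sigmoid(score)
        # A gate near 1 keeps the context value, near 0 the lexicon value.
        fused.append(gate * context[j] + (1.0 - gate) * lexicon[j])
    return fused


# Toy inputs standing in for learned representations of a single character.
random.seed(0)
d = 4
context = [random.gauss(0, 1) for _ in range(d)]
lexicon = [random.gauss(0, 1) for _ in range(d)]
weight = [[random.gauss(0, 0.1) for _ in range(2 * d)] for _ in range(d)]
bias = [0.0] * d

fused = gated_fusion(context, lexicon, weight, bias)
```

Under this assumed form, each fused component lies between the corresponding context and lexicon components, so the gate decides per dimension how much lexicon knowledge is admitted for a given character.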