A Chinese named entity recognition model based on multi-feature fusion embedding

To address the variation in Chinese character glyphs and the ambiguity of Chinese word boundaries, a Chinese named entity recognition model based on multi-feature fusion embedding is proposed. In addition to extracting semantic features, the model captures glyph features with a convolutional neural network and a multi-head self-attention mechanism, and obtains word features by looking up a word vector embedding table. A bidirectional long short-term memory network then learns long-distance contextual representations, and a conditional random field learns the constraints among the labels of a sentence sequence to produce the final entity labels. The F1 scores on the Resume, Weibo, and People Daily datasets reach 96.66%, 70.84%, and 96.15%, respectively, which demonstrates that the proposed model effectively improves performance on the Chinese named entity recognition task.
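
The role of the conditional random field, enforcing constraints among sequence labels, can be illustrated with a minimal sketch of constrained Viterbi decoding. This is not the authors' implementation: the tag set, the `allowed` rule, and the emission scores are hypothetical, and the hard-coded BIO rule stands in for transition scores that a real CRF would learn from data.

```python
import math

# Hypothetical BIO tag set for a single entity type (an assumption, not from the paper).
TAGS = ["O", "B-PER", "I-PER"]
NEG_INF = -math.inf


def allowed(prev_tag, tag):
    """BIO constraint: I-X may only follow B-X or I-X of the same type."""
    if tag.startswith("I-"):
        return prev_tag in ("B-" + tag[2:], "I-" + tag[2:])
    return True


def viterbi(emissions):
    """Return the highest-scoring tag sequence that respects the constraint.

    emissions: list of per-character dicts mapping tag -> score, standing in
    for the scores an encoder such as a BiLSTM would produce.
    """
    # A sequence may not start with an I- tag.
    score = {t: (NEG_INF if t.startswith("I-") else emissions[0][t]) for t in TAGS}
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = {}, {}
        for t in TAGS:
            # Forbidden transitions get a score of -inf so they are never chosen.
            candidates = [(score[p] if allowed(p, t) else NEG_INF, p) for p in TAGS]
            best, prev = max(candidates, key=lambda c: c[0])
            new_score[t] = best + emit[t]
            pointers[t] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    tag = max(TAGS, key=lambda t: score[t])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


# Made-up scores: picking the best tag per character would give the
# invalid sequence O, I-PER, O.
emissions = [
    {"O": 1.0, "B-PER": 0.8, "I-PER": 0.0},
    {"O": 0.2, "B-PER": 0.1, "I-PER": 1.0},
    {"O": 1.0, "B-PER": 0.0, "I-PER": 0.1},
]
print(viterbi(emissions))  # ['B-PER', 'I-PER', 'O']
```

In this toy input, decoding each character independently would start an entity with `I-PER`; decoding the sequence jointly under the constraint recovers a well-formed entity span instead.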

named entity recognition; feature fusion; multi-head self-attention mechanism

LIU Xiaohua (刘晓华), XU Ruzhi (徐茹枝), YANG Chengyue (杨成月)

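School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China

Big Data Center of State Grid Corporation of China, Beijing 100052, China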

National Natural Science Foundation of China (Grant No. 61972148)

2024

Computer Engineering & Science
College of Computer, National University of Defense Technology

CSTPCD; Peking University Core Journals
Impact factor: 0.787
ISSN: 1007-130X
Year, volume (issue): 2024, 46(8)