Mathematical Named Entity Recognition Combining ChineseBERT and a Multi-Feature Network

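Abstract (translated from the Chinese): To address the limited feature-extraction ability of basic deep learning models and the inaccurate semantic representation of their word vectors, a mathematical named entity recognition model combining ChineseBERT and a multi-feature network is proposed. ChineseBERT dynamically adjusts each vector representation according to the context of the current word, which improves the accuracy of the word vectors' semantic representation. The multi-feature network uses an improved convolutional network and a bidirectional simple recurrent unit to capture local character features and global sequence features simultaneously; a soft attention mechanism identifies the key features that most influence entity recognition, and a conditional random field outputs the recognition results. In experiments on a real mathematical dataset, the model reaches an F1 score of 97.67%, higher than recent well-performing deep learning models, and the simple recurrent unit trains more efficiently, which demonstrates the effectiveness of the model.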
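The F1 score reported in the abstract can be illustrated with a small sketch. It assumes entity-level, exact-match scoring over BIO tags, which the abstract does not specify; the tag names `CON` and `THM` and the function names are hypothetical and chosen only for illustration.

```python
def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence."""
    entities = set()
    start, etype = None, None
    # A trailing "O" sentinel flushes a span that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold_tags, pred_tags):
    """Exact-match entity-level F1 (assumed metric, not confirmed by the source)."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical tags: the second predicted entity is cut short, so it does not match.
gold = ["B-CON", "I-CON", "O", "B-THM", "I-THM"]
pred = ["B-CON", "I-CON", "O", "B-THM", "O"]
score = entity_f1(gold, pred)
```

Under exact matching, a predicted span with the wrong boundary counts as both a false positive and a false negative, which is why a single truncated entity lowers precision and recall together.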

English title: Mathematical named entity recognition based on the combination of ChineseBERT and multi-feature network

English abstract: To address the problems of insufficient feature extraction ability and inaccurate semantic expression of word vectors in basic deep learning models, a mathematical named entity recognition model combining ChineseBERT and a multi-feature network is proposed. ChineseBERT uses the context of the current word to dynamically adjust the vector representation and improve the accuracy of the word vector's semantic representation. The multi-feature network captures the local and global sequence features of characters simultaneously through an improved convolutional network and a bidirectional simple recurrent unit. A soft attention mechanism identifies the key features that have a large impact on entity recognition, and the recognition results are output by a conditional random field. Experiments on a real mathematical dataset show that the F1 score of the model reaches 97.67%, which is higher than that of recent well-performing deep learning models. The simple recurrent unit also trains more efficiently, which demonstrates the effectiveness of the model.
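The soft attention step mentioned in the abstract can be sketched as a softmax re-weighting of per-character feature vectors. This is a minimal illustration only: the dot-product scoring, the `query` vector, and the function names are assumptions, since the abstract does not give the paper's scoring function or how the weights are used downstream.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, query):
    """Weight each character's feature vector by its relevance to `query`.

    `features` is a list of equal-length vectors, one per character.
    `query` is a hypothetical learned vector; here it is just a fixed list.
    """
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    weighted = [[w * f for f in vec] for w, vec in zip(weights, features)]
    return weights, weighted


# Three characters with 2-dimensional toy features.
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
query = [1.0, 1.0]
weights, weighted = soft_attention(features, query)
```

Because the weights are a softmax, they are positive and sum to one, so every position contributes something while the highest-scoring positions dominate. In the model described in the abstract, the re-weighted features would then be passed to the conditional random field layer.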

English keywords:

named entity recognition; ChineseBERT; multi-feature network; multi-scale convolution; soft attention

Author:

Bai Jianxia (白建侠)

Affiliation:

Department of Mathematics Teaching, Tianjin Ren'ai College, Tianjin 301636

Keywords:

named entity recognition; ChineseBERT; multi-feature network; multi-scale convolution; soft attention

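Funding:

Tianjin Municipal Education Commission Scientific Research Program; Tianjin Ren'ai College University-Level Scientific Research Project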

Project numbers:

2021KJ083; XX18002

Publication year:

2024

DOI:

10.13274/j.cnki.hdzj.2024.08.024

Information Technology (信息技术)

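Heilongjiang Information Technology Society; China Center for Information Industry Development; Electronic Information Center of the Ministry of Information Industry of China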

CSTPCD

Impact factor: 0.413

ISSN: 1009-2552

Year, volume (issue): 2024, (8)