Mathematical Named Entity Recognition Combining ChineseBERT and a Multi-Feature Network
Bai Jianxia (白建侠)1
Author information
- 1. Department of Mathematics Teaching, Tianjin Ren'ai College, Tianjin 301636
Abstract
To address the insufficient feature extraction ability of basic deep learning models and the inaccurate semantic representation of their word vectors, this paper proposes a mathematical named entity recognition model that combines ChineseBERT with a multi-feature network. ChineseBERT dynamically adjusts each vector representation according to the context of the current word, which improves the semantic accuracy of the word vectors. The multi-feature network uses an improved convolutional network and a bidirectional simple recurrent unit to capture local character features and global sequence features simultaneously. A soft attention mechanism then identifies the key features that most influence entity recognition, and a conditional random field outputs the recognition result. Experiments on a real mathematical dataset show that the model reaches an F1 score of 97.67%, higher than recent well-performing deep learning models, and that the simple recurrent unit trains more efficiently, which supports the effectiveness of the model.
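To make the reported F1 score concrete, the following is a minimal sketch of entity-level F1 over BIO-tagged sequences. It assumes the common exact-match convention (an entity counts as correct only when its type and both boundaries match); the abstract does not state which convention the paper uses. The tag names `CON` and `THM` and the helper names `extract_spans` and `f1_score` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    An I- tag that does not continue an open entity of the same type
    is ignored (strict reading of the BIO scheme; an assumption).
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((label, start, i))  # close the open entity
        if tag.startswith("B-"):
            start, label = i, tag[2:]  # open a new entity
        else:
            start, label = None, None
    return spans


def f1_score(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact-match entity spans."""
    tp = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)
        n_pred += len(pred_spans)
        n_gold += len(gold_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: the second predicted entity has a wrong right boundary.
gold = [["B-CON", "I-CON", "O", "B-THM", "I-THM"]]
pred = [["B-CON", "I-CON", "O", "B-THM", "O"]]
print(f1_score(gold, pred))
```

In the toy example one of two predicted entities matches one of two gold entities exactly, so precision, recall, and F1 are all 0.5. A partially correct boundary earns no credit under this convention.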
Keywords
named entity recognition/ChineseBERT/multi-feature network/multi-scale convolution/soft attention
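Funding
Scientific Research Program of the Tianjin Municipal Education Commission (2021KJ083)
Tianjin Ren'ai College University-Level Scientific Research Project (XX18002)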
Publication year
2024