Microcomputer Applications (微型电脑应用), 2024, Vol. 40, Issue 2: 106-109.

Text Classification Model of Power Equipment Defects Based on ChineseBERT and Multi-Feature Cooperative Network

Li Ying (李瑛)¹, Geng Junwei (耿军伟)¹, Zhao Liuxue (赵留学)², Chen Bo (陈波)¹

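Author information

  • 1. State Grid Beijing Electric Power Company, Beijing 100031; Beijing Electric Power Economic and Technological Research Institute Co., Ltd., Beijing 100055
  • 2. State Grid Beijing Electric Power Company, Beijing 100031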

Abstract

To address the incomplete feature extraction and inaccurate word-vector semantics of traditional models, this paper proposes a text classification model for power equipment defects that combines ChineseBERT with a multi-feature collaborative network. ChineseBERT, a model optimized for Chinese characters, extracts the text vector representations, improving the accuracy of word-vector semantics. The multi-feature collaborative network captures both local and contextual semantic features of the defect text. A soft attention mechanism improves the model's ability to focus on key features. Experiments on a real power equipment defect text dataset show that the model outperforms recent well-performing deep learning models, reaching an F1 score of 96.82%, which demonstrates its effectiveness.
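The abstract names a soft attention mechanism but does not give its formula. As a rough illustration only, the sketch below shows one common formulation: each feature vector is scored against a query vector, the scores are normalized with softmax, and the features are summed with those weights. The function names, the dot-product scoring, and the `query` parameter are assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention_pool(features, query):
    """Pool a sequence of feature vectors into a single vector.

    features: list of T vectors (each a list of d floats), e.g. per-token features.
    query:    length-d vector; a hypothetical stand-in for a learned parameter.
    Returns (pooled_vector, attention_weights).
    """
    # Score each position by its dot product with the query (assumed scoring rule).
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    # Normalize the scores so the weights are positive and sum to 1.
    weights = softmax(scores)
    # Weighted sum of the feature vectors, one output value per dimension.
    dim = len(features[0])
    pooled = [sum(w * f[j] for w, f in zip(weights, features)) for j in range(dim)]
    return pooled, weights
```

With a zero query every position receives the same weight, so the result reduces to mean pooling; a query aligned with one feature vector shifts nearly all of the weight onto that position, which is the "focus on key features" behavior the abstract describes.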

Keywords

text classification / ChineseBERT / multi-feature collaboration / soft attention


Funding

Science and Technology Project of State Grid Beijing Electric Power Company (520234220001)

Publication year

2024
Microcomputer Applications (微型电脑应用)
Shanghai Society of Microcomputer Applications (上海市微型电脑应用学会)

CSTPCD
Impact factor: 0.359
ISSN: 1007-757X
References: 12