BERT-Based Short Text Classification Model and Its Application in Fault Diagnosis of Railway CIR Equipment

In equipment fault diagnosis, text data such as operating instructions and maintenance records are highly valuable, and fully mining and using them can significantly improve diagnostic efficiency. Existing studies commonly mine such text with semantic feature extraction and unsupervised clustering to assist fault location, but these methods usually cannot explain the cause of a fault or justify the repair solution they recommend, so the resulting repair solutions are hard to understand. Building on the mature pre-trained language model BERT (bidirectional encoder representations from transformers), this paper proposes a fault location method that combines a BERT-based short text classification model with a knowledge graph, in order to fully exploit the knowledge and regularities contained in the text data of railway CIR equipment. The method first determines the faulty module from the functional hierarchy of the CIR equipment, then uses BERT-based text classification to obtain a preliminary fault location, and finally uses the knowledge graph to identify the fault cause and related information in support of diagnosis. Repair solutions drawn from the diagnostic knowledge accumulated in the knowledge graph are easy for maintenance personnel to understand, and help with knowledge management and engineering efficiency. For text classification, experiments on fault maintenance ledger records of railway CIR equipment show that the BERT-based short text classification model performs considerably better than traditional classification models. For fault diagnosis, the proposed method combining text classification and a knowledge graph supports rapid fault diagnosis by relatively inexperienced maintenance personnel and is of practical significance.
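
The three stages described in the abstract (module selection from the functional hierarchy, text classification, knowledge graph lookup) can be illustrated with a minimal sketch. This is not the authors' implementation: the module names, fault labels, keywords, and graph triples below are all hypothetical, and a trivial keyword matcher stands in for the fine-tuned BERT classifier, which the paper does not specify at this level of detail.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical functional hierarchy: module -> candidate fault labels.
MODULE_FAULTS: Dict[str, List[str]] = {
    "display_unit": ["screen_blank", "key_unresponsive"],
    "radio_unit": ["no_transmit", "weak_signal"],
}

# Hypothetical keywords used only by the stand-in classifier below.
KEYWORDS: Dict[str, List[str]] = {
    "screen_blank": ["blank", "dark"],
    "key_unresponsive": ["key", "button"],
    "no_transmit": ["transmit", "send"],
    "weak_signal": ["weak", "noise"],
}

# Hypothetical knowledge graph stored as (head, relation) -> tails.
KG: Dict[Tuple[str, str], List[str]] = {
    ("screen_blank", "caused_by"): ["loose power cable"],
    ("screen_blank", "repaired_by"): ["reseat power cable"],
    ("weak_signal", "caused_by"): ["damaged antenna feeder"],
    ("weak_signal", "repaired_by"): ["replace antenna feeder"],
}


def keyword_classifier(text: str, labels: List[str]) -> str:
    """Stand-in for a fine-tuned BERT classifier (an assumption of this sketch).

    Returns the label whose keywords occur most often in the record.
    """
    words = text.lower().split()
    return max(labels, key=lambda lab: sum(w in words for w in KEYWORDS[lab]))


def diagnose(
    module: str,
    record: str,
    classify: Callable[[str, List[str]], str] = keyword_classifier,
) -> Dict[str, object]:
    # Stage 1: restrict candidate faults using the functional hierarchy.
    labels = MODULE_FAULTS[module]
    # Stage 2: preliminary fault location by short text classification.
    fault = classify(record, labels)
    # Stage 3: look up causes and repair actions in the knowledge graph.
    return {
        "module": module,
        "fault": fault,
        "causes": KG.get((fault, "caused_by"), []),
        "repairs": KG.get((fault, "repaired_by"), []),
    }


result = diagnose("display_unit", "screen stays blank after power on")
```

Because `classify` is passed in as a parameter, a real model could replace the keyword stub without changing the other two stages.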

Text classification; BERT; CIR equipment; fault diagnosis

Zhang Yilin (张奕林), Ye Hanrui (叶含瑞), Zhang Lingling (张玲玲), Xue Yiming (薛倚明)

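School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190

Ministry of Education Philosophy and Social Science Laboratory of Digital Economy Monitoring, Forecasting, Early Warning and Policy Simulation, University of Chinese Academy of Sciences, Beijing 100190

School of Data Science, The Chinese University of Hong Kong, Shenzhen, Shenzhen 518172

Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190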

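National Natural Science Foundation of China, General Program (72071194)
Ministry of Education Philosophy and Social Science Laboratory of Digital Economy Monitoring, Forecasting, Early Warning and Policy Simulation (Cultivation) Project, University of Chinese Academy of Sciences (E2810801)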

2024

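Journal of Systems Science and Mathematical Sciences (系统科学与数学)
Academy of Mathematics and Systems Science, Chinese Academy of Sciences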

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.425
ISSN: 1000-0577
Year, volume (issue): 2024, 44(1)
  • 33