Journal of the China Railway Society, 2024, Vol. 46, Issue 4: 108-118. DOI: 10.3969/j.issn.1001-8360.2024.04.012

Fault Classification and Diagnosis Method for CBTC On-board Signal Equipment Based on Doc2vec-LightGBM

Chai Linguo (1), Zhang Jinghui (2), Shangguan Wei (3), Cai Baigen (3), Li Xiaoyu (2)

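Author information

  • 1. School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044; Beijing Engineering Research Center of EMC and GNSS Technology for Rail Transportation, Beijing 100044
  • 2. School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044
  • 3. School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044; State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing 100044; Beijing Engineering Research Center of EMC and GNSS Technology for Rail Transportation, Beijing 100044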

Abstract

On-board signal equipment is an important part of the urban rail transit signal system, and its operation generates a large volume of discrete, fragmented log text data. At present, fault record texts for communication-based train control (CBTC) on-board equipment suffer from unclear semantics and redundant wording, which makes it difficult to trace the cause of a fault. To address this, the paper proposes an automatic fault classification and diagnosis method for CBTC on-board equipment based on Doc2vec-LightGBM. First, the fault text is segmented with Jieba, features are extracted from the segmented text according to TF-IDF, and Doc2vec is used to train vectors for the segmented text. Second, to address data imbalance, the Borderline-SMOTE algorithm is used to supplement and generalize the minority-class text vector data. Finally, a light gradient boosting machine (LightGBM) classifier is trained to classify the fault text automatically. Classification experiments were carried out on 1,133 fault text records from a signal equipment manufacturer, with a support vector machine (SVM) method as the comparison. The experimental results show that the proposed method achieves a classification precision of 98.2% and a recall of 97.5%, demonstrating the effectiveness and superiority of the automatic fault text classification method.
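The class-balancing step mentioned in the abstract can be illustrated with a minimal pure-Python sketch of the Borderline-SMOTE idea. It is not the authors' implementation: the 2-D toy points stand in for Doc2vec text vectors, the function names (`nearest`, `find_danger`, `borderline_smote`) and the parameters `m`, `k`, `n_new`, and `seed` are hypothetical, and drawing the synthetic samples at random is a simplification of the published algorithm. A minority sample counts as "borderline" when at least half, but not all, of its `m` nearest neighbours belong to other classes; new samples are interpolated between such a point and one of its `k` nearest minority neighbours.

```python
import math
import random


def nearest(point, candidates, k):
    """Indices of the k candidates closest to `point` (Euclidean distance)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def find_danger(X, y, minority, m=3):
    """Minority samples whose m nearest neighbours are mostly, but not all, majority."""
    danger = []
    for i, p in enumerate(X):
        if y[i] != minority:
            continue
        rest = [j for j in range(len(X)) if j != i]
        idx = nearest(p, [X[j] for j in rest], m)
        n_major = sum(1 for t in idx if y[rest[t]] != minority)
        if m / 2 <= n_major < m:  # n_major == m would be treated as noise
            danger.append(p)
    return danger


def borderline_smote(X, y, minority, n_new, m=3, k=2, seed=0):
    """Generate n_new synthetic minority samples near the class border."""
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    danger = find_danger(X, y, minority, m)
    if not danger or len(minority_pts) < 2:
        return []
    synthetic = []
    for _ in range(n_new):
        p = rng.choice(danger)
        mates = [q for q in minority_pts if q is not p]
        q = mates[rng.choice(nearest(p, mates, k))]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(p, q)])
    return synthetic


# Toy data (assumed for illustration): six majority points, five minority points.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [1.2, 1.2],
     [1.5, 1.5], [1.6, 1.4], [3, 3], [3.2, 3.1], [3.1, 2.9]]
y = ["maj"] * 6 + ["min"] * 5

new_points = borderline_smote(X, y, "min", n_new=4)
```

In this toy set only the two minority points next to the majority cluster are flagged as borderline, so the synthetic samples are concentrated where the classes meet rather than spread over the whole minority class.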

Key words

CBTC/on-board equipment/Doc2vec/LightGBM/fault classification diagnosis

Funding

Beijing Natural Science Foundation (L211022)

Publication year

2024
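
Journal of the China Railway Society
China Railway Society

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.9
ISSN: 1001-8360
References: 22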