Research on an Equipment Health State Prediction Method Based on Mixed Sampling
Liu Jiayi (刘家亦) 1, An Pingkai (安平凯) 2, Liu Zhiyong (刘志勇) 3
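Author information
- 1. School of English, University of International Business and Economics, Beijing 100029
- 2. School of Engineering, Shijiazhuang Vocational College of Industry and Commerce, Shijiazhuang, Hebei 050000
- 3. Hebei Vocational University of Industry and Technology, Shijiazhuang, Hebei 050091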
Abstract
Imbalanced sample data on equipment health states seriously degrades health state prediction. To address this, a mixed sampling approach is proposed to balance the data and improve prediction performance, and a sample data balancing process based on mixed sampling is designed. The Borderline-SMOTE algorithm supplements the minority class samples, and an improved K-means algorithm deletes majority class samples; once the redundant data are removed, a relatively balanced dataset is formed and passed to the classifier. Experimental data show that both undersampling and oversampling improve the evaluation metrics AUC and G-mean, and that balancing the data with mixed sampling improves them more markedly. The results indicate that the method can significantly improve the prediction of equipment health states and offers an important reference for equipment management departments seeking precise maintenance.
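The balancing process summarized in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names (`knn`, `borderline_smote`, `kmeans_undersample`, `g_mean`, `auc`) and all parameter values are hypothetical, the oversampler is a simplified Borderline-SMOTE that assumes at least two minority samples, and plain k-means stands in for the improved K-means, which the abstract does not specify. The two metric functions follow the usual definitions of G-mean (geometric mean of sensitivity and specificity) and AUC (probability that a positive sample is scored above a negative one).

```python
import math
import random


def knn(i, points, k):
    """Indices of the k points nearest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def borderline_smote(minority, majority, n_new, m=5, k=3, seed=0):
    """Create up to n_new synthetic minority points near the class border."""
    rng = random.Random(seed)
    everyone = minority + majority           # indices >= n_min are majority
    n_min = len(minority)
    danger = []
    for i in range(n_min):
        n_maj = sum(j >= n_min for j in knn(i, everyone, m))
        if m / 2 <= n_maj < m:               # mostly, but not only, majority neighbours
            danger.append(i)
    synthetic = []
    while danger and len(synthetic) < n_new:
        i = rng.choice(danger)
        j = rng.choice(knn(i, minority, k))  # a nearby minority sample
        gap = rng.random()                   # interpolate between the two samples
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(minority[i], minority[j])))
    return synthetic


def kmeans_undersample(majority, n_keep, iters=20, seed=0):
    """Cluster the majority class (plain k-means) and keep the real sample
    nearest to each cluster centre; assumes n_keep <= len(majority)."""
    rng = random.Random(seed)
    centers = rng.sample(majority, n_keep)
    for _ in range(iters):
        clusters = [[] for _ in range(n_keep)]
        for p in majority:
            nearest = min(range(n_keep), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute each centre as the cluster mean; keep the old centre if empty.
        centers = [tuple(sum(col) / len(pts) for col in zip(*pts)) if pts else centers[c]
                   for c, pts in enumerate(clusters)]
    return [min(majority, key=lambda p: math.dist(p, ctr)) for ctr in centers]


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (labels are 0 or 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    pos = sum(y_true)
    neg = len(y_true) - pos
    return math.sqrt((tp / pos) * (tn / neg))


def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Synthetic 2-D data, invented purely for illustration.
    rng = random.Random(42)
    majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
    minority = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(20)]
    balanced_minority = minority + borderline_smote(minority, majority, n_new=40)
    balanced_majority = kmeans_undersample(majority, n_keep=60)
    print(len(balanced_minority), len(balanced_majority))
```

In this sketch the oversampled minority set and the undersampled majority set would be concatenated and passed to whatever classifier is in use; `g_mean` and `auc` can then be computed on a held-out set to compare the unbalanced and balanced variants.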
Key words
imbalanced data/oversampling/undersampling/mixed sampling/health status prediction
Funding
Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2021312)
Publication year
2024