Wind Turbine Abnormal Data Cleaning Based on an Improved Isolation Forest Algorithm

Wind speed and power data are key parameters for assessing whether a wind turbine is operating normally, but they contain a large amount of abnormal data that must be cleaned. An improved isolation forest algorithm is proposed. First, the quartile method is used to determine the dividing line between the normal-data scores and the abnormal-data scores of the isolation forest. Second, the wind speed range is divided into intervals to change the abnormality of edge data. Finally, least-squares curve fitting is used to remove small-probability discrete and small-probability stacked abnormal data. Together, these steps clean the abnormal wind speed and power data. The results show that, compared with the traditional isolation forest algorithm, the improved algorithm correctly defines the dividing line between normal-data and abnormal-data scores, removes stacked abnormal data, and cleans discrete abnormal data at the edge of the main data band more effectively.
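The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no fence rule, interval width, or curve model, so the Tukey-style fence Q3 + 1.5 * IQR, the 1 m/s bin width, the cubic polynomial, and the function names `quartile_fence`, `flag_by_score`, and `flag_by_residual` are all assumptions made for illustration. Anomaly scores are taken as an input where higher means more abnormal; with scikit-learn they could be obtained as the negated output of `IsolationForest.score_samples`.

```python
import numpy as np


def quartile_fence(values, k=1.5):
    """Upper fence Q3 + k * IQR; k = 1.5 is an assumed, conventional choice."""
    q1, q3 = np.percentile(values, [25, 75])
    return q3 + k * (q3 - q1)


def flag_by_score(speed, scores, bin_width=1.0):
    """Flag points whose anomaly score exceeds the quartile fence
    of their own wind speed interval (higher score = more abnormal)."""
    speed = np.asarray(speed, dtype=float)
    scores = np.asarray(scores, dtype=float)
    bins = np.floor(speed / bin_width).astype(int)
    abnormal = np.zeros(speed.shape, dtype=bool)
    for b in np.unique(bins):
        in_bin = bins == b
        # The fence is computed per interval, so edge data are judged locally.
        abnormal[in_bin] = scores[in_bin] > quartile_fence(scores[in_bin])
    return abnormal


def flag_by_residual(speed, power, keep, degree=3):
    """Fit a polynomial power curve by least squares on the kept points,
    then flag every point whose absolute residual exceeds the quartile fence."""
    speed = np.asarray(speed, dtype=float)
    power = np.asarray(power, dtype=float)
    keep = np.asarray(keep, dtype=bool)
    coeffs = np.polyfit(speed[keep], power[keep], degree)
    residual = np.abs(power - np.polyval(coeffs, speed))
    return residual > quartile_fence(residual[keep])
```

A possible way to chain the steps is `abnormal = flag_by_score(speed, scores)` followed by `abnormal |= flag_by_residual(speed, power, ~abnormal)`, which keeps only the points that pass both the score fence and the curve-fit residual fence.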

Keywords: wind turbine; isolation forest; abnormal data; quartile method

Wei Tai, He Shaoxiong, Hu Ziwu, Cao Lixin (魏泰, 贺少雄, 胡子武, 曹立新)

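Gansu Province Special Equipment Inspection and Testing Institute, Lanzhou 730050

School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050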

Funding: Science and Technology Project of the State Administration for Market Regulation (2022MK125)

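Science Technology and Engineering (科学技术与工程)
Chinese Society of Technology Economics

Indexed in CSTPCD and the Peking University Core Journals list
Impact factor: 0.338
ISSN: 1671-1815
Year, volume (issue): 2024, 24(9)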