Improved K-means Photovoltaic Energy Data Cleaning Method Based on Autoencoder
The development of smart grids has produced massive amounts of energy data, and data quality is the foundation for downstream tasks such as data value mining. However, abnormal data inevitably arise during the collection and transmission of large-scale, multi-source photovoltaic energy data, so data cleaning is required. Traditional data cleaning models based on statistical machine learning have certain limitations. This paper proposes an improved K-means clustering model based on a Transformer autoencoder structure for cleaning energy big data. The model adaptively determines the number of clusters using the elbow method and uses autoencoder networks to compress and reconstruct the data within each cluster, thereby detecting and recovering abnormal data. In addition, it employs the multi-head attention mechanism of the Transformer to learn the relevant features among the data, which enhances its ability to screen abnormal data. Experimental results on a publicly available photovoltaic power generation dataset show that, compared with other methods, the proposed model detects abnormal data more effectively, with a screening accuracy of over 96%. It can also recover abnormal data to a certain extent, providing effective support for applications of energy big data.
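The cluster-then-reconstruct idea described above can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation, and it rests on several assumptions: the toy two-dimensional readings, the farthest-point initialisation, the second-difference rule for locating the elbow, and the factor of 10 in the threshold are all hypothetical. The Transformer autoencoder is replaced by a much cruder stand-in in which each cluster centroid serves as the "reconstruction" of its members, so the squared distance to the centroid plays the role of the reconstruction error.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the centre closest to point p."""
    return min(range(len(centers)), key=lambda j: dist2(p, centers[j]))


def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns (centers, inertia)."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old centre if a cluster happens to be empty.
            updated.append(tuple(sum(v) / len(members) for v in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    inertia = sum(dist2(p, centers[nearest(p, centers)]) for p in points)
    return centers, inertia


def elbow_k(inertias):
    """inertias[i] belongs to k = i + 1; return the k with the sharpest bend,
    measured here by the largest second difference of the inertia curve."""
    bends = {i + 1: inertias[i - 1] - 2 * inertias[i] + inertias[i + 1]
             for i in range(1, len(inertias) - 1)}
    return max(bends, key=bends.get)


# Hypothetical toy readings: three tight groups of five points each.
offsets = [(-0.5, 0.0), (0.5, 0.0), (0.0, -0.5), (0.0, 0.5), (0.0, 0.0)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
          for dx, dy in offsets]

# Elbow method: fit K-means for k = 1..6 and pick the bend in the inertia curve.
inertias = [kmeans(points, k)[1] for k in range(1, 7)]
best_k = elbow_k(inertias)

# Refit with the chosen k and derive a residual threshold (factor 10 is arbitrary).
centers, inertia = kmeans(points, best_k)
threshold = 10 * inertia / len(points)


def screen(reading):
    """Return (is_abnormal, cleaned_value) for a new reading.
    The centroid stands in for the autoencoder reconstruction."""
    reconstruction = centers[nearest(reading, centers)]
    abnormal = dist2(reading, reconstruction) > threshold
    return abnormal, (reconstruction if abnormal else reading)


print("chosen k:", best_k)
print(screen((10.0, 14.0)))   # far from every group
print(screen((10.2, 9.9)))    # close to one group
```

In the method the abstract describes, the per-cluster reconstruction comes from an autoencoder with multi-head attention rather than from a centroid, so the reconstruction can differ for each sample and abnormal values can be recovered more faithfully than this stand-in allows.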