Streaming Cleaning System for Periodic Industrial Time Series Data
To clean periodic industrial time series efficiently, a streaming data cleaning system was first designed from distributed components. The system uses Mosquitto for data collection, Flume as the connector, and Kafka as the buffer, which gives it high throughput and a large buffering capacity. The data cleaning component is the core of the system. A periodic time series cleaning algorithm based on a constraint model was then proposed. Taking into account the temporal ordering, periodicity, and physical meaning of the data, period detection and data slicing were added to the original speed constraint algorithm. This resolves the distortion problem of the original algorithm and improves its applicability to periodic data. Finally, the effectiveness of the system and of the improved algorithm was verified in a case study using a tunnel boring machine data set.
data cleaning; industrial big data; time series data; speed constraint; periodic
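The idea of combining period detection and data slicing with a speed constraint can be illustrated with a minimal sketch. It is not the algorithm of the paper: the function names (`detect_period`, `speed_clamp`, `clean_periodic`), the autocorrelation-based period estimate, and the greedy clamping repair are all assumptions chosen for illustration, and the first point of each slice is simply trusted.

```python
from typing import List, Sequence


def detect_period(values: Sequence[float], min_lag: int = 2) -> int:
    """Estimate the period as the lag in [min_lag, n // 2] with the highest
    autocorrelation (assumed estimator; ties resolve to the smallest lag)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, n // 2 + 1):
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n - lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def speed_clamp(times: Sequence[float], values: Sequence[float],
                s_min: float, s_max: float) -> List[float]:
    """Greedy speed-constraint repair: each point is clamped so that its rate
    of change from the previous repaired point stays within [s_min, s_max]."""
    if not values:
        return []
    repaired = [values[0]]  # assumption: the first point is trusted
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        low = repaired[-1] + s_min * dt
        high = repaired[-1] + s_max * dt
        repaired.append(min(max(values[i], low), high))
    return repaired


def clean_periodic(times: Sequence[float], values: Sequence[float], period: int,
                   s_min: float, s_max: float) -> List[float]:
    """Slice the series at period boundaries and repair each slice on its own,
    so legitimate jumps between cycles are not treated as violations."""
    cleaned: List[float] = []
    for start in range(0, len(values), period):
        stop = start + period
        cleaned.extend(speed_clamp(times[start:stop], values[start:stop], s_min, s_max))
    return cleaned


if __name__ == "__main__":
    clean = [0, 1, 2, 3] * 3            # sawtooth with period 4
    times = list(range(len(clean)))
    dirty = list(clean)
    dirty[5] = 9                        # injected outlier
    print(detect_period([0, 1, 2, 3] * 6))
    print(speed_clamp(times, clean, -1.5, 1.5))      # distorts the cycle reset
    print(clean_periodic(times, dirty, 4, -1.5, 1.5))
```

On the sawtooth example, applying the speed constraint to the whole series alters the valid reset at the start of each cycle, whereas the sliced version leaves the resets intact and only pulls the injected outlier back into the allowed range.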