Semi-supervised Online Classification Method for Multi-label Data Stream Based on Kernel Extreme Learning Machine
In practical applications, large amounts of streaming data emerge, characterized by high arrival speed, massive volume, and dynamic variation. Moreover, data streams often carry multiple labels, yet only a small fraction of the data is labeled, so multi-label data streams suffer from both concept drift and missing labels. To solve these problems, this paper proposes a semi-supervised online classification method for multi-label data streams based on the kernel extreme learning machine. First, to tackle the missing-label problem, the data stream is divided into k blocks using a sliding window. A feature similarity matrix and a label similarity matrix are constructed for each block of data and incorporated into the training of the kernel extreme learning machine model. An incremental update mechanism is then designed to build a semi-supervised online kernel extreme learning machine that adapts to the characteristics of streaming data. Second, to address concept drift, a timestamp mechanism is adopted for discard-based updating. A maximum data size is preset; when the stored data reaches this size, the oldest unlabeled data is discarded and new data is added for the update. Finally, experiments on 10 multi-label datasets demonstrate that the proposed method adapts well to missing labels and concept drift while maintaining good classification performance.
Data Stream Classification; Semi-supervised Classification; Multi-label Classification; Kernel Extreme Learning Machine; Concept Drift
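
The timestamp-based discard step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Sample` and `TimestampBuffer`, the use of an integer arrival counter as the timestamp, and the fallback of dropping the oldest labeled sample when no unlabeled sample remains are all assumptions made for illustration. The kernel extreme learning machine update itself is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(eq=False)  # compare by identity so removal never compares feature vectors
class Sample:
    timestamp: int                    # arrival order; smaller means older (assumed)
    features: Sequence[float]
    labels: Optional[Sequence[int]]   # None marks an unlabeled sample


class TimestampBuffer:
    """Hypothetical fixed-capacity store that evicts the oldest unlabeled sample."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity      # the preset data size
        self.samples: List[Sample] = []
        self._clock = 0               # monotonically increasing timestamp source

    def add(self, features, labels=None) -> Optional[Sample]:
        """Store a new sample and return the evicted sample, if any."""
        self.samples.append(Sample(self._clock, features, labels))
        self._clock += 1
        if len(self.samples) <= self.capacity:
            return None
        # Consider only previously stored samples, never the one just added.
        stored = self.samples[:-1]
        unlabeled = [s for s in stored if s.labels is None]
        # Assumption: if every stored sample is labeled, drop the oldest one.
        victim = min(unlabeled or stored, key=lambda s: s.timestamp)
        self.samples.remove(victim)
        return victim


buf = TimestampBuffer(capacity=3)
buf.add([0.1], [1, 0])            # timestamp 0, labeled
buf.add([0.2])                    # timestamp 1, unlabeled
buf.add([0.3])                    # timestamp 2, unlabeled
evicted = buf.add([0.4], [0, 1])  # buffer is full, so one stored sample is dropped
print(evicted.timestamp, [s.timestamp for s in buf.samples])
```

In this sketch labeled samples are retained as long as any unlabeled sample is available to discard, which matches the abstract's statement that the oldest unlabeled data is removed first. How the model is refreshed after each eviction is left open here.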