Concept Drift Adaptive Prediction Method Based on Dynamic Sample Selection

Concept drift is a major factor degrading the performance of stream data mining. It is currently handled mainly by incrementally updating or retraining models, which does not make full use of existing knowledge. Starting from the comprehensive use of all samples, this paper proposes a concept drift adaptive prediction method based on dynamic sample selection. When a new sample arrives, the method performs drift detection based on local consistency. When drift is detected, it removes the noisy samples in the affected region, and when a new concept is detected, it reuses historically similar concepts. Finally, the samples of each class in the region are summarized by multiple representative points, and the prediction model is updated synchronously. The denoising effect is verified on synthetic datasets containing different drift types, and prediction tasks are carried out on real datasets. The experimental results show that the method effectively removes the drift noise caused by concept drift, improves the performance of the prediction model, and achieves better overall prediction performance than popular concept drift adaptive models.
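
The abstract does not give the details of the local-consistency test, so the following is only a minimal sketch of one plausible reading: a new labelled sample is compared with its k nearest stored samples, and low label agreement is treated as a local drift whose conflicting neighbours are discarded as drift noise. The function names, the k-nearest-neighbour criterion, and the `threshold` parameter are all assumptions made for illustration and are not taken from the paper. Concept reuse and multi-representative summarization are not covered.

```python
import math

def local_consistency(window, x, y, k=3):
    """Return (agreement ratio, neighbours) for the k stored samples nearest to x.

    Hypothetical helper: the ratio is the share of neighbours labelled y.
    """
    neighbours = sorted(window, key=lambda s: math.dist(s[0], x))[:k]
    if not neighbours:
        return 1.0, []
    agree = sum(1 for _, label in neighbours if label == y)
    return agree / len(neighbours), neighbours

def update(window, x, y, k=3, threshold=0.5):
    """Add (x, y) to the window; return True if a local drift was flagged.

    Assumed rule: if fewer than `threshold` of the neighbours agree with y,
    the disagreeing neighbours are treated as drift noise and removed.
    """
    score, neighbours = local_consistency(window, x, y, k)
    drift = score < threshold
    if drift:
        stale = [s for s in neighbours if s[1] != y]
        window[:] = [s for s in window if s not in stale]
    window.append((x, y))
    return drift
```

A single mislabelled sample would also trigger removal under this simplified rule, so a practical version would need some way of telling ordinary noise from genuine drift before deleting stored samples.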

Keywords: concept drift; local drift detection; stream data; sample selection; sample denoising; adaptive prediction

Authors: DAI Jin (代劲), LI Hao (李昊), WANG Guoyin (王国胤)

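School of Software, Chongqing University of Posts and Telecommunications, Chongqing 400065

Chongqing Key Laboratory of Computational Intelligence, Chongqing 400065

School of Computer Science, Chongqing University of Posts and Telecommunications, Chongqing 400065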

Funding: National Natural Science Foundation of China (61936001, 62002037); Natural Science Foundation of Chongqing (cstc2021jcyjmsxmX0849, cstb2023nscq-LZX0006)

2024

Acta Electronica Sinica (电子学报)
Chinese Institute of Electronics (中国电子学会)

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.237
ISSN: 0372-2112
Year, volume (issue): 2024, 52(9)