Undersampling Boosting Framework for Unbalanced Problems Based on Entropy and Confidence

To address the boundary overfitting, poor generalization performance, and loss of important information found in traditional methods for unbalanced problems, an undersampling Boosting framework based on entropy and confidence is proposed. A dynamic resampling method is integrated with Boosting to counter boundary overfitting and improve generalization performance. In Ecuboost, confidence and entropy serve as the criteria for undersampling, so that the retained majority-class samples remain valid and preserve their structural distribution. The proposed confidence-based Boosting framework in turn further improves the generalization ability of the dynamic sampling method. Comparative experiments on two large data sets verify the effectiveness of the method.
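The abstract does not give the selection rule itself, so the following is only a minimal sketch of how confidence and entropy could jointly rank majority-class samples for undersampling in one Boosting round. The function names, the `alpha` weight, and the scoring formula are all assumptions made for illustration and are not taken from the paper.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def undersample_majority(majority_probs, n_keep, alpha=0.5):
    """Return the indices of the majority samples kept for the next round.

    majority_probs[i] is the current ensemble's estimated probability that
    majority sample i belongs to the majority class (its true class).
    The score is a hypothetical combination, not the paper's formula.
    """
    scored = []
    for i, p in enumerate(majority_probs):
        confidence = p                 # high = already classified well
        entropy = binary_entropy(p)    # high = close to the decision boundary
        score = alpha * (1.0 - confidence) + (1.0 - alpha) * entropy
        scored.append((score, i))
    # Highest score first; ties are broken by original index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return sorted(i for _, i in scored[:n_keep])


probs = [0.99, 0.55, 0.90, 0.30, 0.75]
kept = undersample_majority(probs, n_keep=2)
```

Under these assumptions the two retained samples are the misclassified one (`0.30`) and the one nearest the boundary (`0.55`), while confidently classified samples are dropped first. A fuller treatment would also need to preserve the structural distribution of the majority class, which this sketch does not attempt.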

Keywords: Entropy; Confidence; Undersampling; Unbalance

Feng Benyong (冯本勇), Xu Yongjun (徐勇军)

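Shijiazhuang Vocational College of Industry and Commerce (石家庄工商职业学院), Shijiazhuang, Hebei 050000

Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190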

Funding: National Natural Science Foundation of China (61702487)

2024

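Computer Applications and Software (计算机应用与软件)
Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology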

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.615
ISSN: 1000-386X
Year, volume (issue): 2024, 41(1)