
Random Forest Based Fast Integration of Multi-Source Small Sample Data

Because the attributes of multi-source small-sample data are complex, integrating such data tends to produce pronounced overfitting and underfitting. This article therefore proposes a fast integration method for multi-source small-sample data based on random forest. Taking the attribute characteristics of the data into account, the random forest model is built on granular vectors, which fit the characteristics of multi-source small-sample data well and serve as the model's basic structure. A granulation layer normalizes the data, and the granulation results it outputs become the nodes of the decision layer. In the integration stage, data are integrated according to the distance between each sample and the decision-layer nodes. In the tests, overfitting cases accounted for only 0.29% of the integration results and underfitting cases for only 0.27%, indicating good integration performance.
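The two steps the abstract names, normalization in a granulation layer followed by distance-based assignment to decision-layer nodes, can be illustrated with a small sketch. The abstract does not give the article's formulas, so everything concrete here is an assumption: min-max scaling stands in for the normalization, a per-label mean vector stands in for a granule node, Euclidean distance is the distance measure, and the function names `min_max_normalize`, `granule_nodes`, and `integrate` are hypothetical. The sketch does not build a random forest; it only shows how records from a second source could be attached to nodes derived from a first one.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def min_max_normalize(rows: Sequence[Sequence[float]]) -> List[Vector]:
    """Scale every column of one source to [0, 1] (assumed normalization)."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


def granule_nodes(rows: Sequence[Vector], labels: Sequence[str]) -> Dict[str, Vector]:
    """Build one node per label as the mean vector (assumed granule definition)."""
    groups: Dict[str, List[Vector]] = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(col) for col in zip(*members)]
        for label, members in groups.items()
    }


def integrate(rows: Sequence[Vector], nodes: Dict[str, Vector]) -> Dict[str, List[Vector]]:
    """Attach each record to its nearest node by Euclidean distance."""
    merged: Dict[str, List[Vector]] = {label: [] for label in nodes}
    for row in rows:
        nearest = min(nodes, key=lambda label: math.dist(row, nodes[label]))
        merged[nearest].append(row)
    return merged


# Two toy sources that record the same attributes on different scales.
source_a = [[1.0, 10.0], [2.0, 20.0], [9.0, 90.0], [10.0, 100.0]]
labels_a = ["low", "low", "high", "high"]
source_b = [[100.0, 5.0], [900.0, 45.0]]

nodes = granule_nodes(min_max_normalize(source_a), labels_a)
merged = integrate(min_max_normalize(source_b), nodes)
```

Normalizing each source separately is what lets records on different scales be compared against the same nodes; whether the article normalizes per source or across all sources is not stated in the abstract.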

Keywords: random forest; multi-source small sample data; fast integration; attribute characteristics; random forest model

He Yun (何昀), Zhang Chuan (张川), Zhang Jifu (张继夫), Chen Wei (陈伟)


Aviation University of Air Force, Changchun, Jilin 130021


2024

Information and Computer (信息与电脑)
Beijing Electronics Holding Co., Ltd.


Impact factor: 1.143
ISSN: 1003-9767
Year, volume (issue): 2024, 36(1)
  • 9