Random Forest Based Fast Integration of Multi-Source Small Sample Data
Because of the attribute complexity of multi-source small sample data, integration is prone to pronounced overfitting and underfitting. A fast integration method for multi-source small sample data based on random forest is therefore proposed. In the model construction stage, the method takes the properties of multi-source small sample data into account and uses the fit between the particle vector and the features of the small sample data as the basis of the random forest; a granulation layer normalizes the multi-source small sample data, and the output granulation results serve as decision nodes. In the integration stage, the data are integrated according to the distance between the multi-source small sample data and the nodes at the decision level. In the tests, the proportion of overfitting in data integration was only 0.29% and the proportion of underfitting was only 0.27%, indicating a good integration effect.
random forest; multi-source small sample data; fast integration; attribute characteristics; random forest model
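
The two stages described in the abstract (normalization in a granulation layer, then integration by distance to decision nodes) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the normalization scheme, the distance measure, or how the decision nodes are derived, so min-max scaling, Euclidean distance, and hand-picked node coordinates are assumptions made here, and the function names `min_max_normalize`, `nearest_node`, and `integrate` are hypothetical.

```python
import math


def min_max_normalize(rows):
    """Scale each attribute (column) to [0, 1]; constant columns map to 0.

    Assumption: min-max scaling stands in for the granulation-layer
    normalization, which the abstract does not define.
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest_node(sample, nodes):
    """Return the index of the decision node closest to the sample.

    Assumption: Euclidean distance; the abstract only says "distance".
    """
    return min(range(len(nodes)), key=lambda i: math.dist(sample, nodes[i]))


def integrate(sources, nodes):
    """Pool samples from all sources, normalize them, and group each
    sample under its nearest decision node."""
    pooled = [row for source in sources for row in source]
    normalized = min_max_normalize(pooled)
    groups = {i: [] for i in range(len(nodes))}
    for row in normalized:
        groups[nearest_node(row, nodes)].append(row)
    return groups


# Toy data: two sources with two attributes each (illustrative values only).
source_a = [[1.0, 10.0], [2.0, 20.0]]
source_b = [[9.0, 90.0], [10.0, 100.0]]

# Hypothetical decision nodes in normalized attribute space.
nodes = [[0.0, 0.0], [1.0, 1.0]]

groups = integrate([source_a, source_b], nodes)
for node_index, members in groups.items():
    print(node_index, members)
```

In this toy run the samples from the first source fall under the first node and those from the second source under the second. In the proposed method the nodes would come from the granulation results rather than being fixed by hand.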