RFECV-Based Feature Selection and the Application and Optimization of a Random Forest Prediction Model
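Sun Jing¹
Author information
- 1. School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, Shanxi 030000, China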
Abstract
Building on a random forest prediction model, this paper proposes a feature selection method based on RFECV (recursive feature elimination with cross-validation). The feature variables are first one-hot encoded; RFECV's built-in cross-validation then evaluates the performance of each feature subset to determine the optimal number of features, and low-importance features are recursively eliminated. Experiments show that, with this method, the random forest trains and predicts faster, attains a lower mean squared error, and achieves high accuracy in feature extraction.
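The pipeline described in the abstract (one-hot encoding, then RFECV with cross-validation, then a random forest trained on the retained features) can be sketched with scikit-learn as follows. This is a minimal illustration, not the author's implementation: the synthetic data, the column layout, the number of trees, the three-fold split, and the choice of negative mean squared error as the scoring metric are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold, train_test_split
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: four numeric features and one categorical feature.
rng = np.random.default_rng(0)
n = 300
numeric = rng.normal(size=(n, 4))
category = rng.choice(["a", "b", "c"], size=(n, 1))
y = (
    3.0 * numeric[:, 0]
    - 2.0 * numeric[:, 1]
    + 1.5 * (category[:, 0] == "a")
    + rng.normal(scale=0.1, size=n)
)

# Step 1: one-hot encode the categorical variable and join it to the numeric ones.
encoded = OneHotEncoder().fit_transform(category).toarray()
X = np.hstack([numeric, encoded])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 2: RFECV drops the least important feature at each step and uses
# cross-validation to pick the number of features with the best score.
selector = RFECV(
    estimator=RandomForestRegressor(n_estimators=30, random_state=0),
    step=1,
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
    scoring="neg_mean_squared_error",  # assumed metric for this sketch
)
selector.fit(X_train, y_train)

# Step 3: train the final random forest on the selected features only.
model = RandomForestRegressor(n_estimators=30, random_state=0)
model.fit(selector.transform(X_train), y_train)
mse = mean_squared_error(y_test, model.predict(selector.transform(X_test)))

print("selected feature count:", selector.n_features_)
print("selection mask:", selector.support_)
print("test MSE:", mse)
```

Fitting the selector on the training split only keeps the held-out test error free of selection leakage; whether the paper follows the same protocol is not stated in the abstract.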
Keywords
random forest prediction model / one-hot encoding / recursive feature elimination / cross-validation
Publication year
2024