Computer Engineering and Applications, 2015, Issue (5): 126-131. DOI: 10.3778/j.issn.1002-8331.1401-0264

Number of trees in random forest

Liu Min¹, Lang Rongling¹, Cao Yongbin¹

Author information

  • 1. School of Electronic and Information Engineering, Beihang University, Beijing 100191, China

Abstract

Random forest (RF) is an ensemble classifier. This paper analyzes the parameters that influence the performance of RF, and the results show that the number of trees has a critical effect on its performance. Methods for determining the number of trees and for evaluating the performance of RF are studied and summarized. Using classification accuracy as the evaluation metric, the relationship between the number of decision trees in a random forest and the data set is analyzed experimentally on UCI data sets. The experimental results show that for most data sets, 100 trees are enough for the classification accuracy to meet the requirement. RF is also compared, in terms of accuracy, with the support vector machine, a classifier known for strong classification performance, and the results show that the classification performance of random forest is comparable to that of the support vector machine.
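The plateau described in the abstract (accuracy stops improving much once the forest is large enough) can be illustrated with a simple majority-vote calculation. The sketch below is not the paper's experiment: it assumes a binary task in which every tree is correct independently with the same probability `p`, which is an idealization, since trees in a real random forest are correlated. The function name `majority_vote_accuracy` and the value `p = 0.6` are hypothetical choices made only for illustration.

```python
from math import comb


def majority_vote_accuracy(n_trees: int, p: float) -> float:
    """Probability that a strict majority of n_trees voters is correct.

    Assumption (idealized, not from the paper): a binary task where each
    tree votes correctly with probability p, independently of the others.
    Ties (possible only for even n_trees) are counted as errors.
    """
    needed = n_trees // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_trees, k) * p**k * (1 - p) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )


if __name__ == "__main__":
    # Odd forest sizes avoid ties; p = 0.6 is an arbitrary illustrative value.
    for n in (1, 11, 51, 101, 501):
        print(n, round(majority_vote_accuracy(n, 0.6), 4))
```

Under these assumptions the ensemble accuracy rises quickly for small forests and then flattens, so going from about 100 to several hundred trees adds little. How quickly a real forest saturates depends on the data set and on the correlation between trees, which is why the paper measures it experimentally.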

Key words

random forest/support vector machine/classification accuracy

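Funding

National Natural Science Foundation of China (61202078)

National High Technology Research and Development Program of China (863 Program) (2011AA110101)

National High Technology Research and Development Program of China (863 Program) (2012AA121801)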

Publication year

2015
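Computer Engineering and Applications
North China Institute of Computing Technology

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 0.683
ISSN: 1002-8331
Citations: 55
References: 24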