Evaluation and Analysis of a Research Project Funding Calculation Method Based on the Random Forest Algorithm
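黄琰 1, 李士杰 2, 谷裕 3
Author information
- 1. Energy Development Research Institute Co., Ltd., China Southern Power Grid, Guangzhou, Guangdong 510530
- 2. China Southern Power Grid Co., Ltd., Guangzhou, Guangdong 510663
- 3. EHV Power Transmission Company, China Southern Power Grid, Guangzhou, Guangdong 510663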
Abstract
Using more than one thousand funding records collected from power grid projects, this study extracts characteristic factors of research span and scientific R&D value and constructs a sample set of science and technology projects. Discrete variables are vectorized through entity embedding, and a random forest model is then used to determine, by iterative computation, the relationship between R&D value and project funding amount. The result is a method for calculating research project funding that reflects R&D value. The results indicate that random forest has significant advantages in predicting the relationship between R&D value and project funding amount. Building multiple decision trees reduces the overfitting risk of a single model and improves the accuracy and stability of the predictions. On this basis, assessing the importance of each feature variable effectively identifies the factors that most influence funding prediction, providing a scientific basis for allocating research funds.
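The pipeline outlined in the abstract (embed discrete variables, fit a random forest, rank feature importance) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the feature names (`category`, `duration`, `team_size`) are hypothetical, the embedding is learned by a small linear model in NumPy rather than the neural network typically used for entity embedding, and importance is scikit-learn's impurity-based measure, which the abstract does not specify.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical, not the paper's data set).
n = 600
category = rng.integers(0, 5, size=n)                   # discrete project-type code
duration = rng.uniform(1.0, 4.0, size=n)                # research span in years
team_size = rng.integers(3, 20, size=n).astype(float)
cat_effect = np.array([10.0, 25.0, 40.0, 15.0, 60.0])
funding = (cat_effect[category] + 8.0 * duration
           + 0.5 * team_size + rng.normal(0.0, 2.0, size=n))


def learn_embedding(codes, target, n_levels, dim=2, lr=0.1, epochs=300):
    """Learn one dense vector per category by fitting
    standardized target ~ E[code] @ w + b with gradient descent on MSE."""
    local = np.random.default_rng(1)
    E = local.normal(0.0, 0.1, size=(n_levels, dim))    # embedding table
    w = local.normal(0.0, 0.1, size=dim)                # linear head
    b = 0.0
    t = (target - target.mean()) / target.std()
    for _ in range(epochs):
        z = E[codes]                                    # (n, dim) lookup
        err = z @ w + b - t                             # residuals
        grad_w = z.T @ err / len(t)
        grad_b = err.mean()
        grad_E = np.zeros_like(E)
        # Accumulate each sample's gradient into its category's row.
        np.add.at(grad_E, codes, np.outer(err, w) / len(t))
        E -= lr * grad_E
        w -= lr * grad_w
        b -= lr * grad_b
    return E


# Split first so the embedding is learned from training rows only.
idx_train, idx_test = train_test_split(np.arange(n), test_size=0.25, random_state=0)
E = learn_embedding(category[idx_train], funding[idx_train], n_levels=5)


def build_features(idx):
    # Replace the discrete code with its embedding vector, keep numeric columns.
    return np.column_stack([E[category[idx]], duration[idx], team_size[idx]])


X_train, X_test = build_features(idx_train), build_features(idx_test)

# Many trees averaged together reduce the variance of any single tree.
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X_train, funding[idx_train])
r2 = forest.score(X_test, funding[idx_test])

feature_names = ["type_emb_0", "type_emb_1", "duration", "team_size"]
importances = dict(zip(feature_names, forest.feature_importances_))

print(f"held-out R^2: {r2:.3f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

Splitting before learning the embedding keeps held-out funding values from leaking into the features. Because the importance of an embedded variable is spread across its embedding dimensions, summing `type_emb_0` and `type_emb_1` gives a rough per-variable figure.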
Keywords
random forest/entity embedding/research project funding/prediction
Publication year
2024