Stock trend prediction based on RF-MIC-PCA
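Ma Meichen 1, Lin Tianhua 1, Zhao Xia 1
Author information
- 1. School of Information Technology, Hebei University of Economics and Business, Shijiazhuang, Hebei 050061, China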
Abstract
Stock factors are numerous and often highly similar to one another, which makes it difficult to obtain good results in trend prediction. To address this problem, a stock trend prediction algorithm based on RF-MIC-PCA is proposed. First, the Gini index (GI) of random forest (RF) is used to construct importance scoring rules between factors and categories, and low-scoring factors are eliminated. Then, the maximal information coefficient (MIC) is used to construct a correlation evaluation method between factors, which is combined with principal component analysis (PCA) to reduce factor redundancy. Finally, the classification accuracy of random forest prediction is taken as the evaluation measure, and the stock trend prediction algorithm based on RF-MIC-PCA is established. To verify the effectiveness of the algorithm, 10 representative stocks are selected from the CSI 300 for experiments. The results show that the RF-MIC-PCA algorithm reduces the dimension of the data set by 20.45% while effectively improving prediction performance. In addition, trend prediction on the CSI 300 and SSE 50 indices improves accuracy by 4.1% and 5.0%, respectively, which verifies the generality of the algorithm and indicates that it has practical value.
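The three stages described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not specify how MIC and PCA are combined, so the sketch assumes that factors whose pairwise dependence exceeds a threshold are grouped and each group is replaced by its first principal component. The function names (`rf_filter`, `group_redundant`, `fit_reducer`), the parameters `keep_ratio` and `threshold`, and the synthetic data are all hypothetical. To stay dependency-light, absolute Pearson correlation is used as a stand-in for MIC; an actual MIC estimator would be passed in through the `dependence` argument.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier


def abs_pearson(a, b):
    """Stand-in dependence score in [0, 1]; NOT the MIC used in the paper."""
    return abs(np.corrcoef(a, b)[0, 1])


def rf_filter(X, y, keep_ratio=0.8, seed=0):
    """Stage 1: keep the top share of factors ranked by RF Gini importance."""
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X, y)
    n_keep = max(1, int(round(keep_ratio * X.shape[1])))
    order = np.argsort(rf.feature_importances_)[::-1]
    return np.sort(order[:n_keep])


def group_redundant(X, dependence=abs_pearson, threshold=0.8):
    """Stage 2a: greedily group factors whose dependence exceeds threshold."""
    groups, assigned = [], set()
    for i in range(X.shape[1]):
        if i in assigned:
            continue
        group = [i]
        assigned.add(i)
        for j in range(i + 1, X.shape[1]):
            if j not in assigned and dependence(X[:, i], X[:, j]) > threshold:
                group.append(j)
                assigned.add(j)
        groups.append(group)
    return groups


def fit_reducer(X_train, y_train, keep_ratio=0.8, threshold=0.8):
    """Fit stages 1 and 2 on training data only; return a transform function."""
    kept = rf_filter(X_train, y_train, keep_ratio)
    Xk = X_train[:, kept]
    groups = group_redundant(Xk, threshold=threshold)
    # Stage 2b: one PCA per redundant group (assumes comparable factor scales).
    pcas = [PCA(n_components=1).fit(Xk[:, g]) if len(g) > 1 else None
            for g in groups]

    def transform(X):
        Xs = X[:, kept]
        cols = [Xs[:, g] if p is None else p.transform(Xs[:, g])
                for g, p in zip(groups, pcas)]
        return np.hstack(cols)

    return transform


# Synthetic demo data (hypothetical): 10 factors, two of them near-duplicates.
rng = np.random.default_rng(0)
n = 600
base = rng.normal(size=(n, 4))
dup = base[:, :2] + 0.05 * rng.normal(size=(n, 2))  # redundant copies
noise = rng.normal(size=(n, 4))                     # uninformative factors
X = np.hstack([base, dup, noise])
y = (base[:, 0] + base[:, 2] > 0).astype(int)       # up/down label

# Chronological split: earlier rows for training, later rows for testing.
X_train, X_test, y_train, y_test = X[:450], X[450:], y[:450], y[450:]

transform = fit_reducer(X_train, y_train)
Z_train, Z_test = transform(X_train), transform(X_test)

# Stage 3: random forest accuracy on the reduced factor set.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
accuracy = clf.fit(Z_train, y_train).score(Z_test, y_test)
print(f"factors: {X.shape[1]} -> {Z_train.shape[1]}, accuracy: {accuracy:.3f}")
```

The reducer is fitted on the training window only and then applied to the test window, so that no information from later observations leaks into factor selection. With real factor data the factors would normally be standardized before PCA.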
Key words
Stock trend prediction/Random forest/Maximal information coefficient/Principal component analysis
Funding
Natural Science Foundation of Hebei Province (F2021207005)
Publication year
2024