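Abstract (translated from Chinese)
With the arrival of the big data era, fast retrieval over massive data sets has become a difficult problem. This study combines data mining and machine learning algorithms to address that problem, improving retrieval efficiency and accuracy and providing strong support for big data processing and analysis. First, the study improves a decision tree algorithm that integrates data mining and machine learning. It then builds the optimized model and analyzes it in a practical application, and compares the performance of the improved decision tree algorithm against other algorithms. The results show that when the number of information retrieval conditions is 6, the accuracy and recall of the improved decision tree algorithm both reach their maximum of 93%, which is 3% higher than the ID3 comparison algorithm. Finally, an empirical analysis of the intelligent fast data retrieval model based on the improved decision tree algorithm shows that the proposed model has the shortest retrieval time when searching 0 to 300 files, at 10 ms. The fast data retrieval model based on the improved decision tree algorithm can therefore deliver high-performance data mining and retrieval, and provides strong support for applications in related fields.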
Abstract
With the advent of the big data era, retrieving massive data quickly has become a difficult problem. This study integrates data mining and machine learning algorithms to address the problem of fast retrieval over massive data, improve the efficiency and accuracy of retrieval, and provide strong support for big data processing and analysis. First, the study improves a decision tree algorithm that integrates data mining and machine learning. It then constructs the optimized model, analyzes it in a practical application, and compares the performance of the improved decision tree algorithm. The results show that when the number of information retrieval conditions is 6, the accuracy and recall of the improved decision tree algorithm both reach their maximum of 93%. Finally, an empirical analysis of the intelligent fast data retrieval model based on the improved decision tree algorithm shows that the proposed model has the shortest retrieval time when searching 0 to 300 files, at 10 ms. The improved decision tree algorithm can therefore deliver high-performance data mining and retrieval, and provides strong support for applications in related fields.
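Funding
Informatization Project of China Southern Power Grid Peak Shaving and Frequency Modulation Power Generation Co., Ltd. (022100HK42210009)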