Automation & Instrumentation, 2024, Issue (5): 59-63. DOI: 10.14016/j.cnki.1001-9227.2024.05.059

Intelligent data fast retrieval algorithm analysis based on data mining and machine learning

钟保强 谭毅恺 何倩 董天波 魏莱

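Author information

  • 1. China Southern Power Grid Peak Shaving and Frequency Regulation Power Generation Co., Ltd. (南方电网调峰调频发电有限公司), Guangzhou 510000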

Abstract

Fast retrieval over massive data sets is a hard problem in the big data era. This study combines data mining and machine learning algorithms to address it, aiming to improve retrieval efficiency and accuracy and to support big data processing and analysis. First, the decision tree algorithm, which integrates data mining and machine learning, is improved; an optimized model is then built and analyzed in a practical application. Next, the performance of the improved decision tree algorithm is compared against other algorithms. With six information retrieval conditions, its accuracy and recall both peak at 93%, which is 3% higher than the comparison algorithm ID3. Finally, an empirical analysis of the intelligent fast data retrieval model built on the improved decision tree algorithm shows that, when retrieving among 0 to 300 files, the proposed model has the shortest retrieval time, 10 ms. The improved decision tree retrieval model can therefore deliver high-performance data mining and retrieval, and can support applications in related fields.
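The abstract names ID3 as the comparison algorithm but does not describe how the improved decision tree differs from it. As background only, the sketch below shows the standard ID3 split criterion (information gain) applied to feature selection. It is not the authors' improved algorithm; the toy records, feature names, and labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one feature (ID3 criterion)."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels):
    """Pick the feature with the largest information gain."""
    return max(rows[0].keys(), key=lambda f: information_gain(rows, labels, f))


# Hypothetical toy data: is a file relevant to a query?
rows = [
    {"type": "report", "year": "2023"},
    {"type": "report", "year": "2024"},
    {"type": "log", "year": "2023"},
    {"type": "log", "year": "2024"},
]
labels = ["relevant", "relevant", "irrelevant", "irrelevant"]

print(best_feature(rows, labels))
```

In this toy data, "type" separates the two classes perfectly while "year" carries no information, so an ID3-style tree would split on "type" first.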

Key words

data mining/machine learning/fast data retrieval/feature selection

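Funding

Informatization Project of China Southern Power Grid Peak Shaving and Frequency Regulation Power Generation Co., Ltd. (022100HK42210009)

Publication year

2024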
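Automation & Instrumentation
Chongqing Industrial Automation Instrumentation Research Institute; Chongqing Society of Automation and Instrumentation

CSTPCD
Impact factor: 0.327
ISSN: 1001-9227
Number of references: 11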