Optimization Method for Unstructured Big Data Classification Based on Improved ID3 Algorithm

Unstructured big data contains a large amount of redundant data, and classification accuracy drops if that redundancy is not cleaned in time. To address this problem and improve classification effectiveness, an unstructured big data classification optimization method based on an improved ID3 (Iterative Dichotomiser 3) algorithm is proposed. To handle the heavy redundancy and complex dimensionality of unstructured big data sets, the method first cleans the data and then applies a supervised identification matrix to reduce its dimensionality. Based on the dimensionality reduction results, the improved ID3 algorithm is used to build a decision tree classification model, and this model classifies the unstructured big data to achieve accurate data classification. Experimental results show that the method classifies unstructured big data effectively and with high accuracy.
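The abstract does not say how the authors modify ID3, so the following is only a minimal sketch of the classic ID3 baseline that such a method builds on: at each node, pick the attribute with the highest information gain and split on it. The toy records, attribute names (`format`, `size`), and class labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the records on `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def build_id3(rows, labels, attrs):
    """Classic ID3. A leaf is a class label; an inner node is the tuple
    (attribute, {value: subtree}, majority_label)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    # Greedy step: split on the attribute with the highest information gain.
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = build_id3(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return (best, branches, majority)


def classify(tree, row):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, tuple):
        attr, branches, majority = tree
        if row.get(attr) not in branches:
            return majority
        tree = branches[row[attr]]
    return tree


# Hypothetical, already cleaned and dimension-reduced records.
rows = [
    {"format": "text", "size": "small"},
    {"format": "text", "size": "large"},
    {"format": "image", "size": "small"},
    {"format": "image", "size": "large"},
]
labels = ["doc", "doc", "media", "media"]

tree = build_id3(rows, labels, ["format", "size"])
print(classify(tree, {"format": "image", "size": "small"}))  # media
```

In this toy set, `format` separates the two classes perfectly (gain of 1 bit) while `size` carries no information (gain of 0), so the tree splits on `format` alone. The data cleaning and supervised dimensionality reduction steps described in the abstract would run before this stage and are not shown here.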

Keywords: improved Iterative Dichotomiser 3 (ID3) algorithm; data cleaning; data dimensionality reduction; unstructured big data; data classification methods

Tang Kailing (唐锴令), Zheng Hao (郑皓)

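Institute of Marine Mineral Resources Development and Utilization Technology, Changsha Research Institute of Mining and Metallurgy, Changsha 410012, China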

Funding: Natural Science Foundation of Hunan Province (Grant No. 2022JK60058)

2024

Journal of Jilin University (Information Science Edition)
Jilin University

CSTPCD
Impact factor: 0.607
ISSN: 1671-5896
Year, volume (issue): 2024, 42(5)