When unstructured big data is classified, the data contains a large amount of redundancy, and classification accuracy drops if this redundant data is not cleaned in time. To improve classification performance, an unstructured big data classification optimization method based on an improved ID3 (Iterative Dichotomiser 3) algorithm is proposed. The method targets two problems in unstructured big data sets: excessive redundant data and complex data dimensions. It first cleans the data and then combines supervised identification matrices to reduce the data dimensionality. Based on the dimensionality-reduced data, the improved ID3 algorithm is used to build a decision tree classification model, and this model is used to classify the unstructured big data. The experimental results show that the method classifies unstructured big data effectively and with high accuracy.
Key words
improved Iterative Dichotomiser 3 (ID3) algorithm / data cleaning / data dimensionality reduction / unstructured big data / data classification method
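
For orientation, the pipeline summarized in the abstract (clean redundant records, then grow an ID3 decision tree) can be illustrated with a minimal sketch. This shows only the classical ID3 splitting criterion (information gain) preceded by a simple duplicate-removal step. It is not the improved algorithm proposed here, whose modifications the abstract does not specify, and it omits the dimensionality-reduction step. The helper names (`deduplicate`, `information_gain`, `best_feature`) and the toy records are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Classical ID3 criterion: entropy reduction from splitting on `feature`."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def deduplicate(rows, labels):
    """Hypothetical cleaning step: drop exact duplicate (record, label) pairs."""
    seen = set()
    kept_rows, kept_labels = [], []
    for row, label in zip(rows, labels):
        key = (tuple(sorted(row.items())), label)
        if key not in seen:
            seen.add(key)
            kept_rows.append(row)
            kept_labels.append(label)
    return kept_rows, kept_labels


def best_feature(rows, labels, features):
    """Pick the feature with the highest information gain (the ID3 root split)."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Toy records (made up for illustration; the last one is a redundant duplicate).
rows = [
    {"source": "email", "length": "short"},
    {"source": "email", "length": "long"},
    {"source": "log", "length": "short"},
    {"source": "log", "length": "long"},
    {"source": "log", "length": "long"},
]
labels = ["text", "text", "machine", "machine", "machine"]

clean_rows, clean_labels = deduplicate(rows, labels)
print(best_feature(clean_rows, clean_labels, ["source", "length"]))
```

In this toy set, `source` separates the two classes perfectly while `length` carries no information, so ID3 would split on `source` first. A full classifier would apply the same selection recursively to each resulting subset.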