Study on Nondestructive Mining Algorithm for Hybrid Big Data Under Collaborative Filtering
Because massive data sets contain many highly similar records, the data mining process is vulnerable to interference from redundancy, which can lead to data loss and data damage. To address this, a lossless method for mining mixed big data based on the collaborative filtering algorithm is presented. First, the mixed big data are integrated and redundant data are removed. Next, identical data from different sources are merged without loss. A time decay function based on the collaborative filtering algorithm is then used to calculate the similarity between mining items. Under the constraint of the feature association degree of the mixed big data, lossless mining of the mixed big data is realized. Experimental results show that when the mixed big data reach 25000 MB, the time required for data mining is only about 45 ms and the mining accuracy exceeds 95%. The data mining results are consistent with the expected values.
Collaborative filtering algorithm; Mixed big data; Lossless mining; Data cleaning; Data integration
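
The abstract does not state the form of the time decay function or the similarity measure, so the following is only a minimal sketch of how a time-decayed item similarity might be computed. It assumes an exponential decay and a cosine similarity over decay-weighted ratings. The names `time_decay`, `decayed_item_similarity`, and the parameter `rate` are hypothetical and are not taken from the paper.

```python
import math


def time_decay(age, rate=0.1):
    """Weight in (0, 1] that shrinks as an interaction gets older.

    Assumption: exponential decay; `rate` is a hypothetical tuning parameter.
    """
    return math.exp(-rate * age)


def decayed_item_similarity(interactions, item_a, item_b, now, rate=0.1):
    """Cosine similarity between two items over decay-weighted ratings.

    interactions: iterable of (user, item, rating, timestamp) tuples.
    Each rating is multiplied by its time-decay weight before comparison,
    so recent interactions contribute more than old ones.
    """
    vec_a, vec_b = {}, {}
    for user, item, rating, ts in interactions:
        weighted = rating * time_decay(now - ts, rate)
        if item == item_a:
            vec_a[user] = weighted
        elif item == item_b:
            vec_b[user] = weighted

    # Only users who interacted with both items contribute to the dot product.
    shared_users = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[u] * vec_b[u] for u in shared_users)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With this sketch, two items rated identically by the same users at the same time get a similarity close to 1, while items with no users in common get 0. A larger `rate` makes older interactions fade faster.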