Research on unbalanced big data classification based on optimized fuzzy C-means algorithm
To address the classification of unbalanced big data, this paper proposes a classification algorithm based on an optimized fuzzy C-means algorithm. First, the fuzzy C-means crossover operator is calculated, the optimization function is defined, and the unbalanced gain of the big data is solved. The Spark classification platform is then used to determine the value range of the condensed fuzzy nearest-neighbor values of the big data samples, and the unbalanced threshold vector is defined by enlarging the nearest-neighbor values. Together, these steps refine the overall classification process and complete the design of the unbalanced big data classification method based on the optimized fuzzy C-means algorithm. The experimental results show that the proposed method completely separates the sampling length intervals of positive-example and negative-example information, effectively resolves the sample confusion caused by inaccurate classification of unbalanced big data, and meets practical application requirements.
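For orientation, the following is a minimal sketch of the standard (textbook) fuzzy C-means iteration that the proposed method builds on. It is not the optimized variant described in this paper: the crossover operator, the unbalanced gain, the Spark-based nearest-neighbor step, and the threshold vector are not defined in the abstract and are therefore not reproduced. The function name `fuzzy_c_means`, its parameters, and the toy data are assumptions made purely for illustration.

```python
import math
import random


def fuzzy_c_means(points, c=2, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means (illustrative only, not the paper's optimized version).

    points: list of equal-length numeric tuples
    c:      number of clusters
    m:      fuzzifier (> 1); larger values give softer memberships
    Returns (centers, memberships), where memberships[i][j] is the degree
    to which point i belongs to cluster j.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update: u_ij proportional to d_ij ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [math.dist(points[i], centers[j]) for j in range(c)]
            if min(dist) == 0.0:
                # The point coincides with a center: assign it there fully.
                hit = dist.index(0.0)
                new_row = [1.0 if j == hit else 0.0 for j in range(c)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                inv_sum = sum(inv)
                new_row = [v / inv_sum for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than the tolerance.
        if shift < tol:
            break

    return centers, u


# Toy unbalanced data (hypothetical): 8 majority points near 0, 2 minority points near 10.
majority = [(0.1 * k,) for k in range(8)]
minority = [(10.0,), (10.2,)]
centers, memberships = fuzzy_c_means(majority + minority, c=2)
labels = [row.index(max(row)) for row in memberships]
```

Taking the cluster with the largest membership for each point gives a hard label, and well-separated toy data like this splits cleanly. The harder cases that motivate the paper are those where the minority class lies close to or overlaps the majority class, which this plain version does not specifically address.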