A fast feature extraction method for high dimensional big data based on GA algorithm and feature self-representation method
To address the "curse of dimensionality" and "overfitting" problems that high-dimensional data bring to fields such as text classification and computer vision, this study proposes an unsupervised feature extraction method that combines feature self-representation with a greedy algorithm. The method represents each feature as a linear combination of the other features to form a feature self-representation model, and then optimizes this model with a greedy algorithm. In the experiments, the proposed method ran in only 0.13 seconds. In terms of accuracy, the average scores of the maximum variance method, principal component analysis, regularized representation, and the proposed unsupervised feature extraction method were 67.24%, 80.16%, 83.48%, and 83.58%, respectively. On average accuracy, the proposed method therefore performs best of the four, narrowly ahead of regularized representation and well ahead of the maximum variance method. The experimental results demonstrate that combining feature extraction with a greedy algorithm is effective in reducing time complexity and improving accuracy, and they offer a new perspective for future work on unsupervised feature extraction.
greedy algorithm; feature extraction; unsupervised learning; high-dimensional big data
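
The idea summarized in the abstract, representing every feature as a linear combination of other features and choosing those features greedily, can be illustrated with a minimal sketch. The abstract does not give the objective function, the regularizer, or the greedy selection rule, so everything below is an assumption made for illustration: the function name `greedy_self_representation`, the least-squares objective ||X - X_S W||_F, and the correlation-with-residual scoring rule (in the spirit of orthogonal matching pursuit) are hypothetical and not taken from the paper.

```python
import numpy as np

def greedy_self_representation(X, k):
    """Greedily pick k features (columns of X) whose linear combinations
    best reconstruct all features. Illustrative sketch only; the scoring
    rule and objective are assumptions, not the paper's algorithm."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                    # center each feature
    n_features = X.shape[1]
    if not 0 < k <= n_features:
        raise ValueError("k must be between 1 and the number of features")

    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0                   # avoid division by zero
    X_unit = X / norms                        # unit-norm candidate features

    chosen = np.zeros(n_features, dtype=bool)
    selected = []
    residual = X.copy()
    for _ in range(k):
        # Score each candidate by how strongly it correlates with the
        # part of the data that is not yet reconstructed.
        scores = np.linalg.norm(residual.T @ X_unit, axis=0)
        scores[chosen] = -np.inf              # never pick a feature twice
        j = int(np.argmax(scores))
        chosen[j] = True
        selected.append(j)
        # Refit the self-representation X ~ X[:, selected] @ W.
        W, *_ = np.linalg.lstsq(X[:, selected], X, rcond=None)
        residual = X - X[:, selected] @ W

    total = np.linalg.norm(X)
    rel_error = np.linalg.norm(residual) / total if total > 0 else 0.0
    return selected, W, rel_error
```

Calling `greedy_self_representation(X, k)` on a samples-by-features matrix returns the indices of the chosen features, the coefficient matrix `W`, and the relative reconstruction error. Each greedy step costs one matrix product and one least-squares solve, which is one plausible reason such a scheme can run quickly, though the 0.13-second figure reported above refers to the authors' own implementation and data.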