A fast feature extraction method for high-dimensional big data based on the GA algorithm and feature self-representation

To address the "curse of dimensionality" and "overfitting" problems caused by high-dimensional data in text classification and computer vision, this study proposes an unsupervised feature extraction method that combines feature self-representation with a greedy algorithm. The method expresses each feature as a linear combination of the other features, which yields a feature self-representation model, and optimizes that model with a greedy algorithm. In the experiments, the proposed method needed a runtime of only 0.13 s. In terms of accuracy, the maximum variance method, principal component analysis, regularized representation, and the proposed unsupervised feature extraction method achieved average scores of 67.24%, 80.16%, 83.48%, and 83.58%, respectively. With the exception of the maximum variance method, the proposed method gives the best performance in these comparisons. The results demonstrate that combining the feature extraction method with a greedy algorithm is effective in reducing time complexity and improving accuracy, and they offer a new perspective for future work on unsupervised feature extraction.
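The abstract does not give the objective function or the exact greedy update, so the following is only a minimal sketch of the general idea, under stated assumptions: features are columns of a data matrix X, a subset S of columns is scored by the least-squares error of reconstructing all columns of X from the columns in S, and the subset is grown one feature at a time. The function names `greedy_self_representation` and `reconstruction_error`, the centering step, and the absence of a regularization term are illustrative choices, not details taken from the paper.

```python
import numpy as np


def reconstruction_error(X, selected):
    """Squared Frobenius error of reconstructing every column of X
    from the columns listed in `selected` (least-squares fit)."""
    S = X[:, selected]
    # W minimizes ||S @ W - X||_F; lstsq also handles rank-deficient S.
    W, *_ = np.linalg.lstsq(S, X, rcond=None)
    return float(np.linalg.norm(X - S @ W) ** 2)


def greedy_self_representation(X, k):
    """Greedily pick k feature (column) indices of X.

    Assumption: the quality of a subset is measured only by how well its
    columns linearly reconstruct all columns of the centered X.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)  # center each feature (an illustrative choice)
    n_features = X.shape[1]
    selected = []
    for _ in range(k):
        best_j, best_err = None, np.inf
        for j in range(n_features):
            if j in selected:
                continue
            err = reconstruction_error(X, selected + [j])
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(50, 2))
    # Five features that all lie in the span of two underlying directions.
    X = np.column_stack([
        base[:, 0],
        base[:, 1],
        2.0 * base[:, 0],
        base[:, 0] + base[:, 1],
        base[:, 0] - 3.0 * base[:, 1],
    ])
    chosen = greedy_self_representation(X, k=2)
    print("selected feature indices:", chosen)
```

This naive version refits a least-squares problem for every candidate at every step, so it is meant to show the selection logic rather than to reproduce the runtime reported in the abstract.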

greedy algorithm; feature extraction; unsupervised; high-dimensional big data

Zhang Defa (张德发)

Sichuan Technology and Business University, Longchang, Sichuan 617400

2024

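Automation & Instrumentation (自动化与仪器仪表)
Chongqing Industrial Automation Instrumentation Research Institute; Chongqing Society of Automation and Instrumentation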

CSTPCD
Impact factor: 0.327
ISSN:1001-9227
Year, volume (issue): 2024, (1)