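Research on a Fast Feature Extraction Method for High-Dimensional Big Data Combining the GA Algorithm and Feature Self-Representation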

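Abstract: To address the "curse of dimensionality" and overfitting caused by high-dimensional data in text classification and computer vision, this paper proposes an unsupervised feature extraction method that combines feature self-representation with a greedy algorithm. The method expresses each feature as a linear combination of the other features, which yields a feature self-representation model, and optimizes that model with a greedy algorithm. In the experiments, the proposed method ran in only 0.13 s. In terms of accuracy, the average scores of the maximum variance method, principal component analysis, regularized representation, and the proposed unsupervised feature extraction method were 67.24%, 80.16%, 83.48%, and 83.58%, respectively. Except with regard to the maximum variance method, the proposed unsupervised feature extraction method performed best throughout. The results demonstrate that combining the feature extraction method with a greedy algorithm is effective in reducing time complexity and improving accuracy, and they offer a new perspective for future work on unsupervised feature extraction.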

English title: A fast feature extraction method for high-dimensional big data based on GA algorithm and feature self-representation method

English abstract: In the face of the "curse of dimensionality" and "overfitting" problems brought by high-dimensional data in the fields of text classification and computer vision, this study proposes an unsupervised feature extraction method that combines feature self-representation and a greedy algorithm. The method linearly represents each feature in terms of the other features to form a feature self-representation model, and optimizes the model using a greedy algorithm. The experimental data show that the proposed method takes only 0.13 seconds to run. In terms of accuracy, the average scores of the maximum variance method, principal component analysis, regularized representation, and the proposed unsupervised feature extraction method were 67.24%, 80.16%, 83.48%, and 83.58%, respectively. Except with regard to the maximum variance method, the proposed unsupervised feature extraction method performs best throughout. The experimental results demonstrate the effectiveness of combining the feature extraction method with a greedy algorithm in reducing time complexity and improving accuracy, providing a new perspective for future unsupervised feature extraction.
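
The abstract does not state the objective function or the greedy selection rule, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes the self-representation model is a least-squares reconstruction of all features from a chosen subset, and that the greedy step adds, one at a time, the feature that most reduces the reconstruction error. The names `reconstruction_error` and `greedy_self_representation` are hypothetical.

```python
import numpy as np


def reconstruction_error(X, selected):
    """Squared Frobenius error when all (centred) features of X are
    rebuilt as linear combinations of the selected feature columns."""
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)                  # centre each feature
    cols = Xc[:, list(selected)]
    # Least-squares coefficients W such that cols @ W approximates Xc.
    W, *_ = np.linalg.lstsq(cols, Xc, rcond=None)
    return float(np.linalg.norm(Xc - cols @ W) ** 2)


def greedy_self_representation(X, k):
    """Forward greedy selection of k feature indices.

    Assumed objective (not taken from the paper):
    minimise ||X - X[:, S] @ W||_F^2 over index sets S of size k.
    """
    n_features = np.asarray(X).shape[1]
    selected = []
    for _ in range(k):
        best_j, best_err = None, np.inf
        for j in range(n_features):
            if j in selected:
                continue
            err = reconstruction_error(X, selected + [j])
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
    return selected


if __name__ == "__main__":
    # Toy data: 4 features, but only 2 independent directions.
    rng = np.random.default_rng(0)
    base = rng.normal(size=(50, 2))
    X = np.column_stack([
        base[:, 0],
        base[:, 1],
        base[:, 0] + base[:, 1],
        2 * base[:, 0] - base[:, 1],
    ])
    chosen = greedy_self_representation(X, 2)
    print("selected feature indices:", chosen)
    print("reconstruction error:", reconstruction_error(X, chosen))
```

In the toy data every feature lies in a two-dimensional subspace, so any two selected features reconstruct the rest almost exactly. This sketch refits the least-squares problem for every candidate, which is simple but slow; it makes no attempt to reproduce the 0.13 s runtime reported above.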

English keywords:

greedy algorithm; feature extraction; unsupervised; high-dimensional big data

Author:

Zhang Defa (张德发)

Affiliation:

Sichuan Technology and Business University (四川工商学院), Longchang, Sichuan 617400

Keywords:

greedy algorithm (贪婪算法); feature extraction (特征提取); unsupervised (无监督); high-dimensional big data (高维大数据)

Publication year:

2024

DOI:

10.14016/j.cnki.1001-9227.2024.01.026

Automation & Instrumentation (自动化与仪器仪表)

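Chongqing Industrial Automation Instrumentation Research Institute (重庆工业自动化仪表研究所); Chongqing Society of Automation and Instrumentation (重庆市自动化与仪器仪表学会)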

CSTPCD

Impact factor: 0.327

ISSN: 1001-9227

Year, volume (issue): 2024, (1)

Number of references: 11