Dimensionality Reduction Application of Big Data Based on PCA

With the rapid development of the Internet and information technology, the breadth and complexity of data sources pose a major challenge to extracting accurate information. A suitable dimensionality reduction algorithm can reduce massive high-dimensional data to a manageable number of dimensions without losing the meaning of the original data, while greatly reducing the amount of computation and making the data easier to interpret. Principal component analysis (PCA) is one of the most important dimensionality reduction algorithms. It uses an orthogonal transformation to convert a set of correlated variables into a set of linearly uncorrelated variables, a transformation that usually reduces the number of variables. The contribution of each component to representing the data is computed, and the few top-ranked components with the highest contributions are selected to represent the entire data set. Experiments show that reducing the data from multiple dimensions to two principal components is sufficient to represent the entire data set, greatly reducing the amount of computation while keeping the loss of accuracy within a controllable range.
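The procedure the abstract describes (an orthogonal transformation, a contribution score per component, and selection of the leading components) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, uses an eigendecomposition of the covariance matrix, and runs on synthetic data invented for this example. The function name `pca` and its parameters are hypothetical.

```python
import numpy as np


def pca(X, n_components=2):
    """Project X (n_samples x n_features) onto its leading principal components.

    Hypothetical helper for illustration; returns the projected data and the
    contribution (explained-variance ratio) of every component.
    """
    # Center each feature so the covariance is taken about the mean.
    X_centered = X - X.mean(axis=0)
    # Feature covariance matrix (rowvar=False: columns are variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the component with the largest variance comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Contribution of each component to the total variance.
    contribution = eigvals / eigvals.sum()
    # Orthogonal projection onto the selected components.
    scores = X_centered @ eigvecs[:, :n_components]
    return scores, contribution


# Synthetic data (an assumption for this sketch): five correlated features
# driven by two latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

scores, contribution = pca(X, n_components=2)
print("contribution per component:", np.round(contribution, 3))
print("cumulative contribution of the first two:", float(contribution[:2].sum()))
```

On data like this, where a few latent factors drive many correlated features, the cumulative contribution of the first two components is close to 1, which is the situation in which a two-dimensional projection is a reasonable summary. How many components real data needs depends on how quickly the contributions fall off.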

Keywords: Big data dimensionality reduction; Dimensionality reduction visualization; Artificial intelligence; Intelligent recommendation; Artificial intelligence and intelligent manufacturing

Guo Shangzhi (郭尚志), Liao Xiaofeng (廖晓峰), Li Gang (李刚), Tang Yuling (唐玉玲)

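College of Computer Science, Chongqing University, Chongqing 400030, China

Hunan Kechuang Information Technology Co., Ltd. (湖南科创信息技术股份有限公司), Changsha, Hunan 410205, China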

2024

Computer Simulation (计算机仿真)
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Year, volume (issue): 2024, 41(5)