Distributed Principal Component Analysis for Massive Data and Its Application in Measurement of Common Prosperity
In the two-round distributed PCA algorithm (TR-DPCA), each local machine first computes the sum vector of its local sample and transmits it to the central machine, which computes the mean vector of the full sample and returns it to each local machine. Each local machine then computes its scatter matrix and transmits it to the central machine, which computes the covariance matrix of the full sample. Finally, the eigenvectors are obtained by eigendecomposition of the covariance matrix. Numerical simulation shows that the performance of the TR-DPCA algorithm is consistent with that of full-sample PCA and better than that of the distributed PCA algorithm based on the one-round method. In addition, applying the TR-DPCA algorithm to the measurement of China's common prosperity shows that the level of China's common prosperity is rising and that the individual gap is narrowing.
principal component analysis; massive data; distributed; two-round method; common prosperity
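The two communication rounds described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`local_sum`, `local_scatter`, `tr_dpca`) are hypothetical, the local machines are simulated as a list of array blocks in one process, the scatter matrices are assumed to be centered at the full-sample mean, and the divisor n - 1 for the covariance matrix is an assumption the abstract does not specify.

```python
import numpy as np

def local_sum(X):
    """Round 1, local machine: sum vector and sample count of the local block."""
    return X.sum(axis=0), X.shape[0]

def local_scatter(X, mean):
    """Round 2, local machine: scatter matrix about the full-sample mean (assumed)."""
    D = X - mean
    return D.T @ D

def tr_dpca(blocks, k):
    """Hypothetical two-round distributed PCA over a list of local data blocks."""
    # Round 1: the central machine pools the local sums into the full-sample mean.
    sums, counts = zip(*(local_sum(X) for X in blocks))
    n = sum(counts)
    mean = np.sum(sums, axis=0) / n

    # Round 2: the central machine pools the local scatter matrices.
    scatter = sum(local_scatter(X, mean) for X in blocks)
    cov = scatter / (n - 1)  # assumption: unbiased divisor n - 1

    # Eigendecomposition of the symmetric covariance matrix, largest eigenvalues first.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]
    return mean, eigvals[order], eigvecs[:, order]

# Simulated data split across four "local machines".
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5)) @ rng.normal(size=(5, 5))
blocks = np.array_split(X, 4)
mean, eigvals, eigvecs = tr_dpca(blocks, k=2)
```

Because the local scatter matrices are centered at the same full-sample mean, their sum equals the full-sample scatter matrix, so under these assumptions the pooled covariance matches the one computed on the undivided data up to floating-point error. Each machine sends only a vector of length p and a p x p matrix, never its raw observations.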