The Big Data Special Competition of the 2023 Guangdong University Student Computer Design Competition provided nine datasets collected from the National Bureau of Statistics. Based on the 20 indicators contained in these nine datasets, this paper analyzes urban and rural construction and development across China's provinces and cities. It applies K-means and Gaussian mixture model (GMM) clustering, implemented in Python, and combines them with three approaches to feature dimensionality reduction: principal component analysis (PCA), an autoencoder, and feature engineering. The analysis covers the 31 provinces, municipalities, and autonomous regions of mainland China, which are clustered into three categories (developed, moderate, and average) according to their degree of urban and rural construction development. The clustering results are highly consistent with the actual state of urban and rural development in China.
Big Data; urban and rural development; clustering; K-means; GMM
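The PCA variant of the pipeline summarized in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn as the Python library, which the abstract does not name, and it uses randomly generated placeholder data with the stated shape (31 regions by 20 indicators) rather than the competition datasets. The function name `cluster_regions`, the choice of two principal components, and the random seed are hypothetical. The autoencoder and feature engineering variants are not shown.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def cluster_regions(X, n_clusters=3, n_components=2, seed=0):
    """Standardize indicators, reduce them with PCA, then cluster with K-means and GMM.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Put all indicators on a comparable scale before PCA.
    X_scaled = StandardScaler().fit_transform(X)
    # Project the indicators onto a few principal components (assumed: 2).
    X_reduced = PCA(n_components=n_components, random_state=seed).fit_transform(X_scaled)
    # Hard assignment of each region to one of the clusters.
    kmeans_labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(X_reduced)
    # Assignment to the most probable Gaussian mixture component.
    gmm_labels = GaussianMixture(
        n_components=n_clusters, random_state=seed
    ).fit_predict(X_reduced)
    return X_reduced, kmeans_labels, gmm_labels


# Placeholder data only: 31 regions x 20 indicators drawn at random.
rng = np.random.default_rng(0)
X = rng.normal(size=(31, 20))
X_reduced, kmeans_labels, gmm_labels = cluster_regions(X)
```

The cluster indices returned by both models are arbitrary integers. Mapping them to the developed, moderate, and average categories would require inspecting the indicator values of each cluster, a step this sketch leaves out.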