To classify resident groups from social media big data, labeled latent Dirichlet allocation (Labeled LDA) is introduced from the field of natural language processing (NLP). Using Twitter check-in data from Chicago in 2014, exploratory LDA analysis is first applied to extract prior information. A Labeled LDA model is then constructed, and urban residents are classified into five groups: office workers, college students and faculty, primary and secondary school students and faculty, municipal staff, and others. The experimental results show that Labeled LDA achieves a recognition precision of 0.92, higher than that of a support vector machine (SVM) (0.87). The algorithm classifies resident groups effectively, which can support the development of targeted services.
Key words
Labeled latent Dirichlet allocation (Labeled LDA) / Twitter check-in data / resident group classification / NLP algorithm
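
The abstract does not give implementation details, so the following is only a minimal sketch of how a Labeled LDA classifier might be trained with collapsed Gibbs sampling. The function names (`train_labeled_lda`, `predict`), the hyperparameters `alpha` and `beta`, the toy tokens, and the two group labels are all illustrative assumptions rather than the authors' setup. The prediction step uses a simplified likelihood score over the learned label-word distributions instead of full posterior inference.

```python
import math
import random
from collections import defaultdict


def train_labeled_lda(docs, labels, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for Labeled LDA (toy sketch).

    docs:   list of token lists (e.g. words from one user's check-ins)
    labels: list of label lists; a token may only be assigned to one of
            its own document's labels (the Labeled LDA constraint)
    Returns phi[label][word], the smoothed label-word distributions.
    """
    rng = random.Random(seed)
    topics = sorted({l for ls in labels for l in ls})
    vocab = sorted({w for doc in docs for w in doc})
    V = len(vocab)
    n_kw = {k: defaultdict(int) for k in topics}  # label-word counts
    n_k = {k: 0 for k in topics}                  # tokens per label
    n_dk = [defaultdict(int) for _ in docs]       # document-label counts
    z = []                                        # label of every token

    def update(d, k, w, delta):
        n_kw[k][w] += delta
        n_k[k] += delta
        n_dk[d][k] += delta

    # Random initialisation, restricted to each document's own labels.
    for d, (doc, ls) in enumerate(zip(docs, labels)):
        z.append([rng.choice(ls) for _ in doc])
        for w, k in zip(doc, z[d]):
            update(d, k, w, +1)

    for _ in range(n_iter):
        for d, (doc, ls) in enumerate(zip(docs, labels)):
            for i, w in enumerate(doc):
                update(d, z[d][i], w, -1)  # remove the current assignment
                weights = [
                    (n_dk[d][k] + alpha) * (n_kw[k][w] + beta) / (n_k[k] + V * beta)
                    for k in ls
                ]
                z[d][i] = rng.choices(ls, weights=weights)[0]
                update(d, z[d][i], w, +1)

    return {
        k: {w: (n_kw[k][w] + beta) / (n_k[k] + V * beta) for w in vocab}
        for k in topics
    }


def predict(phi, doc):
    """Simplified scoring (an assumption, not full inference): pick the
    label whose word distribution gives the known tokens the highest
    log-likelihood; tokens outside the training vocabulary are skipped."""
    def score(k):
        return sum(math.log(phi[k][w]) for w in doc if w in phi[k])
    return max(sorted(phi), key=score)


# Hypothetical toy corpus: tokens stand in for check-in text.
docs = [
    ["office", "meeting", "coffee", "office"],
    ["campus", "lecture", "library", "campus"],
    ["office", "meeting", "campus", "lecture"],
]
labels = [["office_worker"], ["college"], ["office_worker", "college"]]
phi = train_labeled_lda(docs, labels)
```

Calling `predict(phi, ["office", "meeting", "coffee"])` then returns the label whose learned word distribution best explains those tokens. Restricting each token's candidate labels to the document's own label set is what distinguishes Labeled LDA from unsupervised LDA, where every topic is available to every document.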