
Research on urban resident groups classification algorithm based on Twitter check-in data

To classify resident groups from social media big data, the labeled latent Dirichlet allocation (Labeled LDA) model is introduced from the field of natural language processing (NLP). Using Twitter check-in data from Chicago in 2014, exploratory LDA analysis is first applied to extract prior information. A Labeled LDA model is then constructed, and urban residents are classified into five groups: office workers; college students and faculty; primary and secondary school students and staff; municipal staff; and others. Experimental results show that Labeled LDA achieves a classification precision of 0.92, exceeding the 0.87 achieved by a support vector machine (SVM). The algorithm classifies resident groups effectively, which supports the design of targeted services.
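
A minimal sketch of the Labeled LDA idea, assuming each resident is treated as a "document" whose "words" are the venue categories of that resident's check-ins. It uses a generic collapsed Gibbs sampler in which a training document may only draw topics from its own label set. The venue tokens, hyperparameters, and function names (`train`, `predict`) are illustrative assumptions; the abstract does not describe the authors' implementation or features.

```python
import random
from collections import Counter

# The five groups come from the abstract; everything else here is invented.
GROUPS = ["office workers", "college students and faculty",
          "primary/secondary school students and staff",
          "municipal staff", "others"]


def _resample(doc, zs, allowed, n_dk, n_kw, n_k, alpha, beta, v_beta, rng, frozen):
    """One Gibbs sweep over a document; `frozen` keeps topic-word counts fixed."""
    for i, w in enumerate(doc):
        old = zs[i]
        n_dk[old] -= 1
        if not frozen:
            n_kw[old][w] -= 1
            n_k[old] -= 1
        # Only topics in `allowed` are candidates: the Labeled LDA constraint.
        weights = [(n_dk[t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + v_beta)
                   for t in allowed]
        new = rng.choices(allowed, weights=weights)[0]
        zs[i] = new
        n_dk[new] += 1
        if not frozen:
            n_kw[new][w] += 1
            n_k[new] += 1


def train(docs, doc_labels, n_topics, alpha=0.1, beta=0.01, sweeps=50, seed=0):
    rng = random.Random(seed)
    v_beta = len({w for doc in docs for w in doc}) * beta
    n_kw = [Counter() for _ in range(n_topics)]  # topic-word counts
    n_k = [0] * n_topics                         # tokens per topic
    state = []
    for doc, allowed in zip(docs, doc_labels):
        zs = [rng.choice(allowed) for _ in doc]  # init inside the label set
        n_dk = [0] * n_topics
        for w, k in zip(doc, zs):
            n_dk[k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        state.append((doc, zs, allowed, n_dk))
    for _ in range(sweeps):
        for doc, zs, allowed, n_dk in state:
            _resample(doc, zs, allowed, n_dk, n_kw, n_k,
                      alpha, beta, v_beta, rng, frozen=False)
    return {"n_kw": n_kw, "n_k": n_k, "alpha": alpha, "beta": beta,
            "v_beta": v_beta, "n_topics": n_topics}


def predict(model, doc, sweeps=50, seed=1):
    """Infer topics for an unlabeled document; return the dominant topic index."""
    rng = random.Random(seed)
    allowed = list(range(model["n_topics"]))     # no label restriction at test time
    zs = [rng.choice(allowed) for _ in doc]
    n_dk = [0] * model["n_topics"]
    for k in zs:
        n_dk[k] += 1
    for _ in range(sweeps):
        _resample(doc, zs, allowed, n_dk, model["n_kw"], model["n_k"],
                  model["alpha"], model["beta"], model["v_beta"], rng, frozen=True)
    return max(allowed, key=lambda k: n_dk[k])


# Hypothetical toy data: one labeled resident per group.
docs = [["office", "coffee_shop", "office", "train_station"],
        ["campus", "library", "campus", "dorm"],
        ["school", "playground", "school"],
        ["city_hall", "courthouse", "city_hall"],
        ["park", "mall", "cinema"]]
doc_labels = [[0], [1], [2], [3], [4]]

model = train(docs, doc_labels, n_topics=len(GROUPS))
predicted = predict(model, ["office", "office", "coffee_shop", "office"])
print(GROUPS[predicted])
```

In this sketch the label restriction applies only during training; an unlabeled resident is scored against all five topics and assigned the one that dominates their check-ins. A real pipeline would also need the prior information that the abstract says was extracted with exploratory LDA, which is not modeled here.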

Keywords: labeled latent Dirichlet allocation (Labeled LDA); Twitter check-in data; resident groups classification; NLP algorithm

Guan Qianjiao (管千娇), Wang Changshuo (王长硕)


School of Geography and Ocean Science, Nanjing University, Nanjing 210023


2024

Modern Computer (现代计算机)
Zhongda Holdings (中大控股)


Impact factor: 0.292
ISSN: 1007-1423
Year, volume (issue): 2024, 30(16)