现代计算机 (Modern Computer), 2024, Vol. 30, Issue 16: 18-24, 29. DOI: 10.3969/j.issn.1007-1423.2024.16.003

Research on an urban resident group classification algorithm based on Twitter check-in data

管千娇¹ 王长硕¹

Author information

  • 1. School of Geography and Ocean Science, Nanjing University, Nanjing 210023, China

Abstract

To classify resident groups using social media big data, this study introduces the labeled latent Dirichlet allocation (Labeled LDA) model from the field of natural language processing (NLP). Using Twitter check-in data from Chicago in 2014, an exploratory LDA analysis is first used to extract prior information. A Labeled LDA model is then constructed that classifies urban residents into five groups: office workers, college students and university faculty and staff, primary and secondary school students and staff, municipal staff, and others. Experimental results show that Labeled LDA achieves a classification precision of 0.92, exceeding the 0.87 achieved by a support vector machine (SVM). The algorithm effectively classifies resident groups, which can support the development of targeted services.
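As a rough illustration of the idea summarized in the abstract, the following is a minimal sketch of Labeled LDA trained by collapsed Gibbs sampling. It is not the authors' implementation. It assumes that each user is treated as a "document" whose "words" are check-in venue categories, which the abstract does not confirm. The venue tokens, the group label strings, the hyperparameters `alpha` and `beta`, and the function names `train_labeled_lda` and `predict_group` are all hypothetical. The defining feature of Labeled LDA is that every token's topic is restricted to the label set of its document, so each topic corresponds to exactly one group.

```python
import random
from collections import Counter


def train_labeled_lda(docs, doc_labels, alpha=0.1, beta=0.01, sweeps=50, seed=0):
    """Collapsed Gibbs sampling in which a token may only be assigned
    to one of the labels attached to its own document."""
    rng = random.Random(seed)
    labels = sorted({k for ls in doc_labels for k in ls})
    vocab_size = len({w for words in docs for w in words})
    word_counts = {k: Counter() for k in labels}  # tokens of each word per label
    totals = Counter()                            # total tokens per label
    doc_counts = [Counter() for _ in docs]        # tokens per label in each doc
    assign = []
    # Random initialization, restricted to each document's own labels.
    for d, words in enumerate(docs):
        zs = [rng.choice(doc_labels[d]) for _ in words]
        for w, k in zip(words, zs):
            word_counts[k][w] += 1
            totals[k] += 1
            doc_counts[d][k] += 1
        assign.append(zs)
    for _ in range(sweeps):
        for d, words in enumerate(docs):
            allowed = doc_labels[d]
            for i, w in enumerate(words):
                k = assign[d][i]
                # Remove the token, then resample its label from the allowed set.
                word_counts[k][w] -= 1
                totals[k] -= 1
                doc_counts[d][k] -= 1
                weights = [
                    (doc_counts[d][t] + alpha)
                    * (word_counts[t][w] + beta) / (totals[t] + vocab_size * beta)
                    for t in allowed
                ]
                k = rng.choices(allowed, weights=weights)[0]
                assign[d][i] = k
                word_counts[k][w] += 1
                totals[k] += 1
                doc_counts[d][k] += 1
    return {"labels": labels, "word_counts": word_counts, "totals": totals,
            "vocab_size": vocab_size, "alpha": alpha, "beta": beta}


def predict_group(model, words, sweeps=30, seed=0):
    """Classify an unlabeled user: sample over all labels while holding the
    learned label-word counts fixed (an approximation), then return the
    label that holds the most tokens."""
    rng = random.Random(seed)
    labels = model["labels"]
    wc, tot = model["word_counts"], model["totals"]
    a, b, v = model["alpha"], model["beta"], model["vocab_size"]
    zs = [rng.choice(labels) for _ in words]
    counts = Counter(zs)
    for _ in range(sweeps):
        for i, w in enumerate(words):
            counts[zs[i]] -= 1
            weights = [(counts[t] + a) * (wc[t][w] + b) / (tot[t] + v * b)
                       for t in labels]
            zs[i] = rng.choices(labels, weights=weights)[0]
            counts[zs[i]] += 1
    return max(labels, key=lambda t: counts[t])


# Hypothetical toy data: one list of venue-category tokens per user.
docs = [
    ["office_tower", "coffee_shop", "train_station", "office_tower"],
    ["university", "library", "coffee_shop", "university"],
    ["university", "dormitory", "library"],
    ["city_hall", "courthouse", "city_hall"],
    ["office_tower", "university", "train_station"],
]
doc_labels = [
    ["office_worker"],
    ["college"],
    ["college"],
    ["municipal"],
    ["office_worker", "college"],  # a user carrying two candidate labels
]

model = train_labeled_lda(docs, doc_labels)
print(predict_group(model, ["university", "library", "library", "dormitory"]))
```

Because the last toy user carries two labels, the sampler has to decide which of that user's check-ins belong to which group. Users with a single label contribute all of their tokens to that group directly.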

Key words

Labeled latent Dirichlet allocation (Labeled LDA) / Twitter check-in data / resident group classification / NLP algorithm

Publication year

2024
现代计算机 (Modern Computer)
中大控股

Impact factor: 0.292
ISSN: 1007-1423