User Interest Recognition Method Incorporating Category Labels and Topic Information

Discovering the interests of social network users is important for alleviating information overload, for personalized recommendation, and for positively guiding information dissemination. Existing interest recognition research has not simultaneously considered how a text's topic information and its corresponding category label information can help a model learn text features. This paper therefore proposes a user interest recognition method that incorporates category labels and topic information. First, the BERT pre-trained model, a BiLSTM model, and a multi-head self-attention mechanism are used to extract semantic features of the text and of the label sequence separately. Second, a label attention mechanism is introduced so that the model pays more attention to the words in the text that are most relevant to its category label. Third, text topic features are obtained with the LDA topic model and the Word2Vec model. Next, a gating mechanism is designed for feature fusion so that the model can adaptively fuse the multiple features and classify each Weibo text into an interest category. Finally, the texts published by a user are counted per interest category, and the category with the most texts is taken as the user's interest recognition result. To verify the effectiveness of the proposed method, a Weibo interest recognition dataset is constructed. Experimental results show that the model achieves the best performance on both Weibo text interest classification and user interest recognition.
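The last two steps of the abstract (gated feature fusion, then per-user counting) can be illustrated with a minimal sketch. The abstract does not give the gate's formula, so the element-wise sigmoid gate below is an assumed, commonly used form rather than the authors' design. The names `gated_fusion`, `recognize_user_interest`, `gate_weights`, and `gate_bias` are hypothetical, and the tie-breaking rule (first-seen category wins) is also an assumption.

```python
import math
from collections import Counter


def gated_fusion(semantic, topic, gate_weights, gate_bias):
    """Fuse two equal-length feature vectors with an element-wise sigmoid gate.

    Assumed form (not taken from the paper):
        g = sigmoid(W [semantic; topic] + b)
        fused = g * semantic + (1 - g) * topic
    gate_weights has one row per output dimension, each of length 2 * dim.
    """
    if len(semantic) != len(topic):
        raise ValueError("semantic and topic features must have the same length")
    concat = list(semantic) + list(topic)
    fused = []
    for i, (s, t) in enumerate(zip(semantic, topic)):
        # Pre-activation of the gate for dimension i.
        z = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = 1.0 / (1.0 + math.exp(-z))
        # The gate decides how much of each feature source to keep.
        fused.append(g * s + (1.0 - g) * t)
    return fused


def recognize_user_interest(predicted_categories):
    """Return the interest category assigned to the most texts of one user."""
    if not predicted_categories:
        raise ValueError("user has no classified texts")
    counts = Counter(predicted_categories)
    # most_common(1) yields [(category, count)]; on ties the category
    # encountered first is returned (an assumed tie-breaking rule).
    return counts.most_common(1)[0][0]
```

For example, `recognize_user_interest(["sports", "tech", "sports"])` returns `"sports"`, because two of the three classified texts fall in that category.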

Keywords: Social network; Interest recognition; Topic model; Label attention mechanism; Feature fusion

Kang Zhiyong (康智勇), Li Bicheng (李弼程), Lin Huang (林煌)

College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China

Funding: Joint Fund of the Ministry of Education for Equipment Pre-research (Grant No. 8091B022150)

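2024

Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)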

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(z1)