Journal of Shandong University (Engineering Science), 2024, Vol. 54, Issue (1): 45-51, 62. DOI: 10.6040/j.issn.1672-3961.0.2023.168

Adaptive label information learning for intention detection

马坤 刘筱云 李乐平 纪科 陈贞翔 杨波

Author information

  • 1. School of Information Science and Engineering, University of Jinan, Jinan 250022, Shandong, China

Abstract

To address the problem that multi-label text classification ignores label co-occurrence when capturing label relationships, an adaptive label feature learning (ALFL) method based on statistical features is proposed for detecting content marketing articles. First, a labeled latent Dirichlet allocation model with adaptive topic priors (LDATP) is constructed: for each text, all marketing topics corresponding to its label set constrain the model as it generates the topic-word probability distribution. Second, a label information integration network (LIIN) is constructed, which uses the topic-word probability distribution and the graph structure of the labels to learn label-related information and obtain label embeddings. Finally, information is exchanged between the text space and the label space to capture semantic features for identifying marketing articles. Experimental results show that ALFL achieves a recall of 80.92% and an accuracy of 88.14%, outperforming the other baseline models.
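The abstract does not say how the label graph is built from co-occurrence statistics. As a minimal sketch only, assuming each edge is weighted by the conditional co-occurrence frequency P(other label | label), which is a common choice and not necessarily the one ALFL uses, the graph could be estimated as follows. The function name `cooccurrence_graph` and the toy label sets are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(label_sets):
    """Return {label: {other: P(other | label)}} from label co-occurrence counts.

    Assumption: edge weights are conditional frequencies; the paper's
    actual graph construction is not given in the abstract.
    """
    label_counts = Counter()
    pair_counts = Counter()
    for labels in label_sets:
        unique = sorted(set(labels))
        label_counts.update(unique)
        # Count each unordered pair in both directions.
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1

    graph = {label: {} for label in label_counts}
    for (a, b), n in pair_counts.items():
        graph[a][b] = n / label_counts[a]
    return graph


# Hypothetical toy label sets, one per document (not data from the paper).
docs = [
    {"finance", "insurance"},
    {"finance", "insurance", "health"},
    {"health"},
    {"finance"},
]
graph = cooccurrence_graph(docs)
```

The resulting weights are asymmetric: in the toy data every "insurance" document is also labeled "finance", but not the reverse, which is the kind of label dependency a co-occurrence graph is meant to expose.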

Key words

multi-label text classification/label co-occurrence/topic model/graph structure/label embedding

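Funding

National Natural Science Foundation of China (61772231)

Shandong Provincial Natural Science Foundation (ZR2022LZH016)

Shandong Provincial Key Research and Development Program (Major Innovation Project) (2021CXGC010103)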

Publication year

2024
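Journal of Shandong University (Engineering Science)
Shandong University

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.634
ISSN: 1672-3961
Number of references: 19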