
A Short Text Classification Method Based on a Supervised Biterm Topic Model

To address the semantic sparsity and semantic ambiguity of short texts, this paper proposes a supervised biterm topic model (Su-BTM) and applies it to short text classification. Building on the BTM topic model, the method introduces a topic-category distribution parameter to capture topic-category semantic information and to establish an accurate mapping between topics and categories. It also proposes a Su-BTM-Gibbs topic sampling method, which samples the implied (latent) topic of each word. Comparative experiments on two Chinese and English short-text datasets show that the method achieves better classification performance than classical models.
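The abstract does not give the Su-BTM-Gibbs update itself, so the following is only a minimal sketch of the unsupervised BTM baseline that Su-BTM builds on: biterm extraction followed by collapsed Gibbs sampling over biterm topic assignments. It deliberately omits the topic-category distribution parameter, and it samples one topic per biterm rather than per word. The function names (`extract_biterms`, `btm_gibbs`) and the hyperparameter defaults are illustrative assumptions, not taken from the paper.

```python
import random
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def btm_gibbs(docs, num_topics, alpha=1.0, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for plain BTM (assumed baseline, no category term)."""
    rng = random.Random(seed)
    biterms = [b for doc in docs for b in extract_biterms(doc)]
    vocab_size = len({w for doc in docs for w in doc})
    topic_count = [0] * num_topics                    # biterms assigned to each topic
    word_count = [dict() for _ in range(num_topics)]  # word occurrences per topic
    assign = []

    def update(k, biterm, delta):
        # Add or remove one biterm from topic k's counts.
        topic_count[k] += delta
        for w in biterm:
            word_count[k][w] = word_count[k].get(w, 0) + delta

    for b in biterms:  # random initial assignment
        k = rng.randrange(num_topics)
        assign.append(k)
        update(k, b, 1)

    for _ in range(iters):
        for i, (w1, w2) in enumerate(biterms):
            update(assign[i], (w1, w2), -1)  # exclude the current biterm
            weights = []
            for k in range(num_topics):
                # Each biterm contributes two words to its topic.
                total = 2 * topic_count[k] + vocab_size * beta
                weights.append(
                    (topic_count[k] + alpha)
                    * (word_count[k].get(w1, 0) + beta)
                    * (word_count[k].get(w2, 0) + beta)
                    / (total * (total + 1))
                )
            assign[i] = rng.choices(range(num_topics), weights=weights)[0]
            update(assign[i], (w1, w2), 1)
    return biterms, assign, topic_count
```

A supervised variant along the lines the abstract describes would presumably add a category-dependent factor to `weights`; its exact form is not recoverable from this page.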

semantic sparsity; BTM topic model; implied topic; short text classification

Wei Hongmin (卫红敏)


Shandong Huayu University of Technology, Dezhou, Shandong 253034, China


2024

Modern Information Technology (现代信息科技)
Guangdong Electronics Society (广东省电子学会)


ISSN:2096-4706
Year, volume (issue): 2024, 8(10)