Journal of Shanghai University of Electric Power, 2024, Vol. 40, Issue 1: 80-86. DOI: 10.3969/j.issn.2096-8299.2024.01.012

An Improved Emotion Classification Algorithm Based on Improved TF-IDF and Dictionary Model

WANG Kangjing (王康静) 1, QIAN Jianghai (钱江海) 2

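Author information

  • 1. College of Mathematics and Physics, Shanghai University of Electric Power, Shanghai 200090
  • 2. College of Mathematics and Physics, Shanghai University of Electric Power, Shanghai 200090; Engineering Research Center of Software/Hardware Co-design Technology and Application, Ministry of Education, East China Normal University, Shanghai 200062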

Abstract

To address the low discriminability and poor stability of the polarity preference of emotion feature words in traditional emotional text classification algorithms, an emotional text classification algorithm is proposed that fuses an improved term frequency-inverse document frequency (TF-IDF) model with a dictionary model. First, the frequency distribution and dispersion coefficient of each emotion feature word across corpora of different emotion types are used to measure the discriminability and stability of its polarity preference, yielding a polarity index for the word. Second, this index is used to improve the weights of emotion feature words in the TF-IDF model. Finally, based on the improved TF-IDF model, a supervised classification algorithm with a decision function computes the polarity score of an emotional text, and the harmonic mean of this score and the polarity score from the dictionary model gives the comprehensive polarity score of the text.
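The final fusion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how the two polarity scores are scaled or thresholded, so the sketch assumes both are already mapped to (0, 1] with larger values meaning "more positive". The function names `harmonic_mean` and `fuse_polarity` and the 0.5 threshold are hypothetical.

```python
def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two strictly positive numbers: 2ab / (a + b)."""
    if a <= 0 or b <= 0:
        raise ValueError("harmonic mean needs strictly positive inputs")
    return 2 * a * b / (a + b)


def fuse_polarity(tfidf_score: float, dict_score: float,
                  threshold: float = 0.5) -> tuple[float, str]:
    """Combine two polarity scores and label the text.

    Assumptions (not taken from the paper): both scores lie in (0, 1],
    larger means more positive, and 0.5 is an illustrative cut-off.
    """
    combined = harmonic_mean(tfidf_score, dict_score)
    label = "positive" if combined >= threshold else "negative"
    return combined, label


if __name__ == "__main__":
    # Hypothetical scores from the classifier and from the dictionary model.
    combined, label = fuse_polarity(0.9, 0.6)
    print(round(combined, 3), label)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a text receives a high combined score under this sketch only when both models agree that it is positive.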

Key words

term frequency-inverse document frequency/affective polarity/dispersion coefficient/dictionary model

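Funding

Open Research Fund of the Engineering Research Center of Software/Hardware Co-design Technology and Application, Ministry of Education, East China Normal University (OP202102)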

Publication year

2024
Journal of Shanghai University of Electric Power
Shanghai University of Electric Power

Impact factor: 0.401
ISSN: 2096-8299
References: 15