Research on Multi-classifier Sentiment Classification of Car Reviews Based on the ER Rule

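Abstract (translated from Chinese): This paper addresses binary sentiment classification of car review corpora and proposes a sentiment classification method that fuses multiple classifiers using the evidential reasoning (ER) rule. For sentiment feature construction, experiments compare the effect of different feature models on the classification results, and the traditional TFIDF weight calculation is improved. On this basis, the ER rule is used to fuse different classifiers for text sentiment polarity analysis, taking into account the weight and reliability of each classifier. Finally, the method is tested on review data crawled from car websites and validated on a public Chinese hotel review corpus. The results show that the method effectively integrates the strengths of different classifiers: compared with traditional machine learning classification algorithms, it improves Recall, F1 value and Accuracy, and compared with currently popular deep learning and ensemble learning algorithms, its results are better overall.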
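
The improved TFIDF weighting is only summarized in this abstract; the English abstract adds that it incorporates CHI Square values so that terms distributed unevenly across classes carry more weight. The following Python sketch shows one way that idea might look. The exact formula is not given in the abstract, so multiplying TF-IDF by the chi-square statistic is an assumption, and the names `unigram_bigram`, `chi_square` and `chi_weighted_tfidf` are hypothetical.

```python
import math
from collections import Counter


def unigram_bigram(tokens):
    """Return unigram+bigram features for one tokenized review."""
    bigrams = [a + "_" + b for a, b in zip(tokens, tokens[1:])]
    return list(tokens) + bigrams


def chi_square(term, docs, labels):
    """Chi-square statistic of the 2x2 table: term presence vs. binary label."""
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        present = term in doc
        if present and label == 1:
            a += 1  # positive reviews containing the term
        elif present:
            b += 1  # negative reviews containing the term
        elif label == 1:
            c += 1  # positive reviews without the term
        else:
            d += 1  # negative reviews without the term
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def chi_weighted_tfidf(docs, labels):
    """Assumed weighting: tf * idf * chi-square (not the paper's exact formula)."""
    n_docs = len(docs)
    vocab = {t for doc in docs for t in doc}
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    chi = {t: chi_square(t, docs, labels) for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            t: (k / len(doc)) * math.log(1 + n_docs / df[t]) * chi[t]
            for t, k in counts.items()
        })
    return vectors, chi


# Toy corpus: label 1 = positive review, 0 = negative review.
docs = [["good", "car"], ["bad", "car"], ["good", "engine"], ["bad", "noise"]]
labels = [1, 0, 1, 0]
vectors, chi = chi_weighted_tfidf(docs, labels)
```

In this toy corpus, `car` appears equally often in both classes, so its chi-square value and its weight are zero, while `good` appears only in positive reviews and keeps a positive weight.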

English title: Multi-classifier for Car Review Sentiment Classification Based on ER Rule

English abstract: With the rapid development of next-generation information technology, more and more users are accustomed to sharing personal experiences and opinions on the Internet, for example in online reviews of books, movies and products, which typically express positive or negative sentiment. Text sentiment analysis uses computer technology to detect and extract sentiments, attitudes, opinions and other subjective information from text documents, converting qualitative user expressions into quantifiable data that can serve decision-making and strategic planning. For users, product reviews provide information that helps them make informed purchasing decisions and minimize regret after consumption. For manufacturers, reviews reveal consumers' needs in a timely manner, so that marketing strategies can be adjusted in a targeted way and the design and quality of products improved. Because the number of review texts on the Internet is growing exponentially, traditional manual analysis can hardly keep up with rapidly changing market demand, while deep learning-based methods may suffer from weak interpretability. How to obtain users' sentiment information automatically from large volumes of comments in a rational and intelligent way is therefore a challenging issue. For the problem of binary sentiment classification on a car review corpus, this paper proposes a text sentiment classification method based on multi-classifier fusion with the ER rule. Firstly, the research explores sentiment feature construction by examining the classification effects of various feature models, including unigram, bigram and unigram+bigram. The CHI Square test is adopted for text feature extraction. This method is particularly effective in managing high-dimensional feature spaces and supports more accurate sentiment classification by highlighting the features most relevant to the analysis. Secondly, an improved TF-IDF method is proposed to enhance the discrimination of terms relevant to sentiment analysis. It incorporates the CHI Square values to assess the distinctiveness of terms across document classes and refines the traditional TF-IDF calculation. This adjustment accounts for the distribution of terms within categories, making sentiment-related terms more influential in classification. Thirdly, the ER rule is introduced to fuse multiple classifiers for text sentiment polarity analysis, fully considering the weights and reliabilities of the classifiers in order to integrate their advantages. Specifically, each classifier is regarded as a piece of evidence, and its weight is formed dynamically from the Euclidean distance between pieces of evidence and the difference between the judgments for different categories within the evidence. The weight of a classifier is negatively related to the difference between its results and those of all other classifiers, and positively related to the discrepancy among its own judgments for different categories. Meanwhile, the accuracy of a classifier is taken as its reliability in order to produce better classification results. To verify the effectiveness and rationality of the proposed method, an automobile review data set crawled from the web is used. The results show that the multi-classifier fusion method based on the ER rule achieves better text sentiment classification results than single classification algorithms, ensemble algorithms and deep learning algorithms. In addition, to reduce the influence of chance and of relying on a single data set, the results are verified under the same experimental conditions on a public data set of hotel reviews from another domain. The experimental comparison shows that the fusion method based on the ER rule achieves the best F1 value and Accuracy and also performs well in Precision and Recall, so the method generalizes well to text sentiment classification tasks in different domains. Ablation experiments are also conducted on the proposed improvements in feature model selection and feature weight calculation, and the results confirm that the improvements benefit text sentiment classification performance. In summary, the ER rule considers both the weight and the reliability of each classifier when fusing multiple classifiers and integrates the advantages of different classifiers. The method can effectively reduce the classification limitations caused by different types and topics of text. The final sentiment classification results are stable and balanced, which gives the method wider applicability in the practice of sentiment classification.
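
The fusion step can be illustrated with a small Python sketch of the ER rule for classifiers that output probabilities over single classes (here positive and negative). It follows the general form of the ER rule, in which each classifier's output is discounted by its weight and reliability before combination; it is not the authors' implementation. The abstract does not give the formula for the dynamic weights, so `weights` is simply an input here, and `reliabilities` stands in for each classifier's accuracy as the abstract describes. The function name `er_fuse` and all numbers are illustrative assumptions.

```python
def er_fuse(beliefs, weights, reliabilities):
    """Fuse classifier outputs with the ER rule (singleton classes only).

    beliefs:       one probability vector per classifier, each summing to 1
    weights:       one weight per classifier, assumed to lie in (0, 1]
    reliabilities: one reliability per classifier, assumed to lie in [0, 1]
    """
    masses = None    # combined masses on the individual classes
    residual = None  # combined residual mass left by limited reliability
    for p, w, r in zip(beliefs, weights, reliabilities):
        c = 1.0 / (1.0 + w - r)              # normalization coefficient
        m_i = [c * w * pk for pk in p]       # weighted mass per class
        res_i = c * (1.0 - r)                # residual mass of this classifier
        if masses is None:
            masses, residual = m_i, res_i
            continue
        # Each class collects three terms: the running mass meeting the new
        # residual, the new mass meeting the running residual, and the
        # product of both masses on the same class.
        combined = [a * res_i + residual * b + a * b
                    for a, b in zip(masses, m_i)]
        new_residual = residual * res_i
        total = sum(combined) + new_residual
        masses = [x / total for x in combined]
        residual = new_residual / total
    # Final beliefs: drop the residual mass and renormalize over the classes.
    s = sum(masses)
    return [x / s for x in masses]


# Hypothetical outputs of two classifiers as [P(positive), P(negative)].
fused = er_fuse([[0.8, 0.2], [0.6, 0.4]],
                weights=[0.7, 0.5],
                reliabilities=[0.9, 0.8])
label = "positive" if fused[0] >= fused[1] else "negative"
```

When every weight and reliability equals 1, the residual mass vanishes and the sketch reduces to multiplying the class probabilities and renormalizing; lower values let a classifier influence the result less.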

English keywords:

ER rule; multi-classifier fusion; TFIDF weight; deep learning algorithm; ensemble learning algorithm

Authors:

周谧、周雅婧、贺洋、方必和

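Affiliations:

School of Management, Hefei University of Technology, Hefei, Anhui 230009

Ministry of Education Engineering Research Center for Intelligent Decision-Making and Information System Technologies, Hefei, Anhui 230009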

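Keywords (translated from Chinese):

evidential reasoning rule; multi-classifier fusion; TFIDF weight; deep learning algorithm; ensemble learning algorithm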

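Funding:

National Natural Science Foundation of China; NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization

Grant numbers:

71521001; U1709215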

Publication year:

2024

DOI:

10.12005/orms.2024.0162

Operations Research and Management Science (运筹与管理)

Operations Research Society of China (中国运筹学会)

Indexed in: CSTPCD; CHSSCD; Peking University Core Journals (北大核心)

Impact factor: 0.688

ISSN: 1007-3221

Year, volume (issue): 2024, 33(5)