Research on sentiment analysis of Weibo comments based on crawlers and SVM

Weibo is an important platform for spreading trending news, and the comments under each post or video are a focus of users' attention. Collecting these comments by manually scrolling down and copy-pasting is common practice, but it slows down sentiment analysis. To address this, the article uses Selenium to simulate human login and verification-code entry, then uses the Requests library to parse the page source code and save the Weibo comments. The ChnSentiCorp sentiment analysis corpus is used to train a support vector machine (SVM) classification model; the crawled Weibo comments are preprocessed and then classified by sentiment with the trained SVM model. The experimental results show that the SVM classification accuracy is low, mainly because the sentiment analysis corpus is not broadly representative. Using the crawler to build a dedicated Weibo comment corpus and training the classification model on it would yield higher sentiment classification accuracy.
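The preprocess-train-classify pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the abstract does not specify the feature extraction or the SVM library used, so the sketch substitutes character unigram/bigram counts for word segmentation and a pure-Python, Pegasos-style linear SVM for a library classifier. The six training comments are invented toy data (not drawn from ChnSentiCorp or from Weibo), and the names `preprocess`, `features`, `train_linear_svm`, and `predict` are hypothetical.

```python
import random
import re
from collections import Counter

# Toy labeled comments invented for illustration (1 = positive, -1 = negative).
TRAIN_COMMENTS = ["很喜欢，推荐", "质量不错，满意", "服务好，开心",
                  "太差劲，失望", "糟糕，后悔买", "垃圾，退货"]
TRAIN_LABELS = [1, 1, 1, -1, -1, -1]


def preprocess(comment):
    """Remove URLs, @mentions, punctuation, and whitespace (illustrative rules)."""
    comment = re.sub(r"https?://\S+", "", comment)
    comment = re.sub(r"@[\w\-]+", "", comment)
    return re.sub(r"[^\w]", "", comment)


def features(comment):
    """Count character unigrams and bigrams of the cleaned comment."""
    text = preprocess(comment)
    grams = list(text) + [text[i:i + 2] for i in range(len(text) - 1)]
    return Counter(grams)


def score(weights, x):
    """Dot product between the sparse weight vector and a feature Counter."""
    return sum(weights.get(k, 0.0) * v for k, v in x.items())


def train_linear_svm(comments, labels, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    samples = [features(c) for c in comments]
    weights = {}
    order = list(range(len(samples)))
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)  # decaying learning rate
            x, y = samples[i], labels[i]
            violated = y * score(weights, x) < 1.0
            # L2 regularization: shrink every weight by (1 - eta * lam).
            for k in weights:
                weights[k] *= 1.0 - eta * lam
            # Hinge-loss subgradient: move toward samples inside the margin.
            if violated:
                for k, v in x.items():
                    weights[k] = weights.get(k, 0.0) + eta * y * v
    return weights


def predict(weights, comment):
    """Return 1 (positive) or -1 (negative) for a raw comment."""
    return 1 if score(weights, features(comment)) >= 0 else -1


model = train_linear_svm(TRAIN_COMMENTS, TRAIN_LABELS)
for text in ["很满意", "太失望"]:
    print(text, predict(model, text))
```

A classifier trained on so small a vocabulary only recognizes n-grams it has already seen, which mirrors the abstract's point in miniature: a model trained on a corpus that does not resemble Weibo comments has little to go on, whereas a corpus built from crawled comments would share their vocabulary.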

Keywords: Weibo comments; Selenium technology; ChnSentiCorp sentiment analysis corpus; SVM; self-built Weibo comment corpus

Wang Lanlan (汪兰兰)

Wuhan University of Engineering Science (武汉工程科技学院), Wuhan, Hubei 430200

2024

Wireless Internet Technology (无线互联科技)
Jiangsu Institute of Scientific and Technical Information (江苏省科学技术情报研究所)

Impact factor: 0.263
ISSN: 1672-6944
Year, volume (issue): 2024, 21(9)