Research on Opinion Sentence Identification for Chinese Microblogs

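Chinese microblogs carry users' opinions on trending topics, and mining those opinions can support applications such as early warning of emergencies and public opinion monitoring. Most microblog research to date is based on English corpora, and opinion sentence mining for Chinese microblogs is mostly mentioned only in passing within sentiment mining work. Because of the distinctive stylistic features of Chinese microblogs, traditional opinion mining models for Chinese text do not achieve satisfactory results. Unlike existing sentiment mining work, this paper selects features based on an analysis of the stylistic characteristics of Chinese microblogs: in addition to sentiment features, it adds opinion sentence features such as assertive verbs, modal particles, degree adverbs, and fixed part-of-speech structures, and uses a CRFs model to identify opinion sentences. Experiments show that using sentiment features alone gives relatively high precision but a recall of only 32.1%, whereas adding the other opinion sentence features raises recall markedly to 61.8%. The method was applied to the "opinion sentence identification" evaluation task organized by the China Computer Federation (CCF) in 2012 and achieved good results.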
Study of Subjective Sentence Identification Oriented to Chinese Microblog
Chinese microblogs contain many opinions about hot topics. Mining these opinions can enable early warning of emergencies and public sentiment monitoring. Most existing microblog research is based on English corpora, and existing studies generally conflate opinion mining with sentiment mining. Because of the specific stylistic features of Chinese microblogs, traditional opinion mining models for Chinese text cannot achieve ideal results. In this paper, features are selected according to an analysis of these stylistic features. In addition to sentiment features, assertive verbs, modal particles, degree adverbs, and fixed part-of-speech structures are used as features, which distinguishes this work from sentiment mining. CRFs (Conditional Random Fields) are used as the classification model. The results show that recall is only 32.1% when only sentiment features are used, and rises to 61.8% once the other features are added. The method achieved good results in the Chinese microblog opinion mining task held by the China Computer Federation Technical Committee on China Information Technology.
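The feature groups named in the abstract can be pictured with a small sketch. The abstract does not say how the features were encoded or which CRF toolkit was used, so everything below is an assumption for illustration: the tiny lexicons, the POS tags, the POS patterns, and the function names `token_features` and `sentence_features` are hypothetical, and the input is assumed to be already segmented and POS-tagged. The output is a list of per-token feature dictionaries of the kind a CRF toolkit could consume.

```python
# Illustrative only: these small lexicons are NOT the paper's resources.
SENTIMENT_WORDS = {"喜欢", "讨厌", "好", "差"}      # sentiment features
ASSERTIVE_VERBS = {"认为", "觉得", "希望", "建议"}  # assertive verbs
MODAL_PARTICLES = {"吧", "啊", "呢", "吗"}          # modal particles
DEGREE_ADVERBS = {"很", "非常", "太", "特别"}       # degree adverbs
# Hypothetical "fixed part-of-speech structures": adverb+adjective, adverb+verb.
POS_PATTERNS = {("d", "a"), ("d", "v")}


def token_features(tokens, i):
    """Build a feature dict for the i-th (word, pos) pair of a sentence."""
    word, pos = tokens[i]
    # The POS pattern is checked against this token and the one after it.
    if i + 1 < len(tokens):
        starts_pattern = (pos, tokens[i + 1][1]) in POS_PATTERNS
    else:
        starts_pattern = False
    return {
        "word": word,
        "pos": pos,
        "is_sentiment": word in SENTIMENT_WORDS,
        "is_assertive_verb": word in ASSERTIVE_VERBS,
        "is_modal_particle": word in MODAL_PARTICLES,
        "is_degree_adverb": word in DEGREE_ADVERBS,
        "starts_pos_pattern": starts_pattern,
    }


def sentence_features(tokens):
    """Turn a segmented, POS-tagged sentence into a list of feature dicts."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Assumed pre-segmented input: "我 觉得 这个 手机 非常 好 啊"
example = [("我", "r"), ("觉得", "v"), ("这个", "r"), ("手机", "n"),
           ("非常", "d"), ("好", "a"), ("啊", "y")]
features = sentence_features(example)
```

Using only `is_sentiment` would correspond to the sentiment-only setting described in the abstract; the remaining flags stand in for the additional opinion sentence features that the authors report as raising recall.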

Chinese microblog; opinion mining; CRFs model; opinion recognition; stylistic features

Ding Shengchun (丁晟春), Meng Meiren (孟美任), Li Xiao (李霄)

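Department of Information Management, School of Economics and Management, Nanjing University of Science and Technology, Nanjing 210094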

Keywords: Chinese microblog; opinion mining; CRFs model; opinion sentence identification; stylistic features

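Funding: National Natural Science Foundation of China; "Crowd Simulation in the Evolution of Online Public Opinion on Emergencies" (突发事件网络舆情演变过程中的人群仿真研究); Key Project of Philosophy and Social Sciences in Jiangsu Universities

Grant numbers: 71103085; 71273132; 2011ZDIXM028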

2014

Journal of the China Society for Scientific and Technical Information (情报学报)
China Society for Scientific and Technical Information; Institute of Scientific and Technical Information of China

Indexed in: CSTPCD, CSSCI, CHSSCD, PKU Core (北大核心)
Impact factor: 1.296
ISSN: 1000-0135
Year, volume (issue): 2014, 33(2)