A Topic Detection and Tracking Method Combining Periodic Classification and Single-Pass Clustering

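Incremental clustering suffers from topic models that are insufficient and inaccurate at the outset, and as more stories are processed the cumulative effect of false alarms and misses is magnified. To address this, a topic detection and tracking method that combines periodic classification with Single-Pass clustering is proposed. An incremental clustering algorithm first performs topic detection and tracking; then, each time a certain number of news texts has accumulated, the stories already clustered are periodically classified, which raises the precision of the topic clusters and thereby the precision of subsequent topic detection and tracking. Experiments show that the method is effective: it lowers the miss rate and the false alarm rate and reduces the normalized detection cost.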
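The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bag-of-words representation, the cosine similarity, the similarity-weighted k-nearest-neighbor vote, and the names and defaults `threshold`, `period`, and `k` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Sum of member vectors; cosine ignores scale, so no division is needed."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def knn_reclassify(vectors, labels, k):
    """Relabel every story by a similarity-weighted vote of its k nearest
    neighbors (leave-one-out). The voting rule is an assumption."""
    new_labels = []
    for i, v in enumerate(vectors):
        nearest = sorted(
            ((cosine(v, u), labels[j]) for j, u in enumerate(vectors) if j != i),
            reverse=True,
        )[:k]
        votes = defaultdict(float)
        for sim, lab in nearest:
            votes[lab] += sim
        if votes and max(votes.values()) > 0:
            new_labels.append(max(votes, key=votes.get))
        else:
            new_labels.append(labels[i])  # no similar neighbor: keep old label
    return new_labels


def detect_and_track(docs, threshold=0.3, period=50, k=3):
    """Single-pass clustering with a periodic kNN reclassification step.
    All parameter values here are hypothetical."""
    vectors, labels = [], []
    next_topic = 0
    for n, text in enumerate(docs, start=1):
        v = Counter(text.lower().split())
        # Rebuild topic centroids from current labels (simple, not efficient).
        members = defaultdict(list)
        for u, lab in zip(vectors, labels):
            members[lab].append(u)
        best_label, best_sim = None, 0.0
        for lab, vs in members.items():
            sim = cosine(v, centroid(vs))
            if sim > best_sim:
                best_label, best_sim = lab, sim
        if best_label is not None and best_sim >= threshold:
            labels.append(best_label)   # tracking: story joins an existing topic
        else:
            labels.append(next_topic)   # detection: story starts a new topic
            next_topic += 1
        vectors.append(v)
        # Periodic classification of everything clustered so far.
        if n % period == 0 and len(vectors) > 1:
            labels = knn_reclassify(vectors, labels, k)
    return labels
```

Calling `detect_and_track(stories, period=4, k=1)` on a list of story strings returns one topic label per story; the periodic step can move a story that was placed in the wrong cluster early on, when the topic models were still sparse.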
A New Topic Detection and Tracking Approach Combining Periodic Classification and Single-Pass Clustering
Incremental clustering starts from topic models that are neither sufficient nor accurate, and as the number of processed stories grows, the accumulated misses and false alarms are amplified. To address this problem, this paper proposes a topic detection and tracking method that combines periodic classification with single-pass clustering. The main idea is to use an incremental clustering algorithm to detect and track topics; each time a certain amount of news text has accumulated, the already-clustered stories are periodically reclassified to improve the accuracy of the topic clusters, which in turn improves the accuracy of subsequent topic detection and tracking. Experimental results show that the method is effective: it decreases the miss and false alarm probabilities and reduces the normalized detection cost.
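For readers unfamiliar with the evaluation measure, the normalized detection cost conventionally used in TDT evaluations can be computed as in the sketch below. The default values `c_miss=1.0`, `c_fa=0.1`, and `p_target=0.02` are the customary TDT settings and are an assumption here; the abstract does not state which values the authors used.

```python
def normalized_detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Normalized TDT detection cost from miss and false alarm probabilities.

    The cost weights and the target prior are assumed defaults, not values
    taken from the paper.
    """
    # Expected cost of the system's errors.
    c_det = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Cost of the better trivial system (always "no" or always "yes").
    baseline = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return c_det / baseline
```

A value of 0 corresponds to a perfect system, and a value of 1 means the system does no better than the cheaper of the two trivial systems that answer the same way for every story.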

topic detection and tracking; incremental clustering; text categorization; k-nearest neighbor classifier

税仪冬、瞿有利、黄厚宽


School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China

Topic detection and tracking; incremental clustering; text classification; k-nearest neighbor classification

Key Project of Science and Technology Research, Ministry of Education

108126

2009

Journal of Beijing Jiaotong University
Beijing Jiaotong University


CSTPCD; CSCD; Peking University Core Journals
Impact factor: 0.525
ISSN:1673-0291
Year, volume (issue): 2009, 33(5)