
A Text Classification Method Integrating Ontology and SVM

Existing SVM-based classification methods do not handle concept semantics. To address this shortcoming, a text classification method integrating ontology and SVM is proposed. Based on a domain ontology, the method maps word features to concept features, then feeds the concept features and their weights into an SVM for training and classification. Integrating the ontology with SVM reduces the dimensionality of the classification space, which saves classifier training time as well as the time spent on similarity comparison during classification. Expansion along hypernym-hyponym (parent-child) concept relations addresses the defect that parent and child concepts, although closely related in practice, are treated as entirely different feature words during classification. Text classification experiments in the bamboo and rattan domain show that the method is considerably more accurate than the traditional SVM-based classification method.
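The word-to-concept mapping and parent-child expansion described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the paper: the toy ontology (`TERM_TO_CONCEPT`, `PARENT`), the decay factor `PARENT_DECAY`, and the count-based weighting. The paper's actual weighting scheme is not given in the abstract.

```python
from collections import Counter

# Hypothetical toy ontology (illustrative only, not from the paper).
# Several surface words map to one concept, which shrinks the feature space.
TERM_TO_CONCEPT = {
    "moso": "moso_bamboo",
    "phyllostachys": "moso_bamboo",
    "bamboo": "bamboo",
    "rattan": "rattan",
    "cane": "rattan",
}
# Child concept -> parent concept (hypernym) links.
PARENT = {
    "moso_bamboo": "bamboo",
    "bamboo": "plant",
    "rattan": "plant",
}
PARENT_DECAY = 0.5  # assumed: share of weight passed up at each ancestor level


def concept_features(tokens):
    """Map word counts to concept weights and propagate decayed weight to ancestors."""
    weights = Counter()
    for token, count in Counter(tokens).items():
        concept = TERM_TO_CONCEPT.get(token)
        if concept is None:
            continue  # words outside the ontology are dropped in this sketch
        weight = float(count)
        while concept is not None:
            weights[concept] += weight
            concept = PARENT.get(concept)
            weight *= PARENT_DECAY
    return dict(weights)


def to_vector(features, vocabulary):
    """Turn a concept-weight mapping into a dense vector over a fixed concept order."""
    return [features.get(concept, 0.0) for concept in vocabulary]


doc_a = ["moso", "bamboo", "growth"]
doc_b = ["phyllostachys", "cane"]
feats_a = concept_features(doc_a)
feats_b = concept_features(doc_b)
vocabulary = sorted(set(feats_a) | set(feats_b))
vec_a = to_vector(feats_a, vocabulary)
vec_b = to_vector(feats_b, vocabulary)
```

In this sketch the two documents share no words, yet both receive weight on `moso_bamboo`, `bamboo`, and `plant`, so related parent and child concepts are no longer treated as unrelated features. The resulting vectors could then be passed to any SVM implementation (for example scikit-learn's `sklearn.svm.SVC`) for training and classification.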

Keywords: text classification; ontology; SVM (support vector machine)

Authors: Zhu Ping (朱平), Fan Shaohui (范少辉), Yue Yongde (岳永德)


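Affiliation: International Bamboo and Rattan Network Center, State Forestry Administration (国家林业局国际竹藤网络中心), Beijing 100102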


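Funding: Special Research Fund project of the International Bamboo and Rattan Network Center, State Forestry Administration (No. 1632009006)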

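Journal of Jiangxi University of Science and Technology (江西理工大学学报)
Jiangxi University of Science and Technology (江西理工大学)

Impact factor: 0.655
ISSN: 2095-3046
Year, volume (issue): 2012, 33(1)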