
Deep Document Clustering Model Based on Generalization Graph Convolutional Neural Network

Text classification is an important task in natural language processing, and graph neural network based methods have become a mainstream approach because they can model multiple kinds of interactions among texts. However, most existing graph-based methods rely on ground-truth labels, which are difficult to obtain. This paper proposes a deep document clustering model based on a generalization graph convolutional neural network (generalization graph convolutional neural network-deep document clustering, GGCN-DDC), which learns text representations and performs unsupervised document classification at the same time. The model first represents each document as a text graph; it then uses generalized convolution layers to learn more discriminative word feature representations and document representations; finally, parameter learning is constrained by a document clustering loss and a document graph reconstruction loss. Experiments on three benchmark datasets show that GGCN-DDC outperforms the other baseline algorithms on multiple metrics.
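The abstract names a document clustering loss and a document graph reconstruction loss as the two training constraints, but it does not give their form. The sketch below is one plausible reading and not the authors' implementation. It assumes a DEC-style clustering loss (Student-t soft assignment with a sharpened target distribution) and an inner-product decoder with binary cross-entropy for reconstruction. All function names, the weight `gamma`, and the array shapes are hypothetical.

```python
import numpy as np

def soft_assign(doc_emb, centers):
    # Assumption: Student-t kernel between document embeddings and cluster centers.
    d2 = ((doc_emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    # Sharpen q so that confident assignments receive more weight.
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def clustering_loss(q, p, eps=1e-12):
    # KL(P || Q), averaged over documents.
    return float((p * (np.log(p + eps) - np.log(q + eps))).sum(axis=1).mean())

def reconstruction_loss(node_emb, adj, eps=1e-12):
    # Assumption: inner-product decoder sigmoid(Z Z^T) scored with binary
    # cross-entropy against the adjacency matrix of one document's text graph.
    a_hat = 1.0 / (1.0 + np.exp(-(node_emb @ node_emb.T)))
    bce = -(adj * np.log(a_hat + eps) + (1.0 - adj) * np.log(1.0 - a_hat + eps))
    return float(bce.mean())

def joint_loss(doc_emb, centers, node_emb, adj, gamma=1.0):
    # Hypothetical weighting: clustering term plus gamma times reconstruction term.
    q = soft_assign(doc_emb, centers)
    p = target_distribution(q)
    return clustering_loss(q, p) + gamma * reconstruction_loss(node_emb, adj)
```

In this reading, `doc_emb` holds one embedding per document and `node_emb` holds the word-node embeddings of a single document graph, so the clustering term acts across documents while the reconstruction term acts within each text graph.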

graph neural network; deep graph clustering; text classification; text representation

Chai Bianfang, Li Zheng, Zhao Xiaopeng, Wang Rongjuan


School of Information Engineering, Hebei GEO University, Shijiazhuang, Hebei 050031

Integrated System Operation and Maintenance Center, Hebei Provincial Department of Finance, Shijiazhuang, Hebei 050091

Hebei Geological Workers' University, Shijiazhuang, Hebei 050086


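Science and Technology Research Project of Higher Education Institutions of Hebei Province; Hebei GEO University 2023 National Pre-research Project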

ZD2020175; KY202310

2024

Journal of Nanjing Normal University (Natural Science Edition)
Nanjing Normal University


CSTPCD; Peking University Core Journal
Impact factor: 0.427
ISSN:1001-4616
Year, volume (issue): 2024, 47(1)