Cross-Border Ethnic Text Clustering Method Based on Domain Knowledge Graph
Cross-border ethnic text clustering aims to establish correlations among texts of cross-border ethnic groups, a task made difficult by substantial differences in how these groups express cultural content in text. This paper proposes a cross-border ethnic text clustering method based on a domain knowledge graph. For local semantic information, the method uses the cross-border ethnic domain knowledge graph to provide cultural background knowledge and to identify associations among entities in the texts. For global semantic information, the method applies a heterogeneous graph attention network to extract text features, topics, and domain keywords. Finally, a variational autoencoder network fuses the local and global information, and the learned feature representation is used for clustering. Experiments show that the proposed method improves Acc by 11.4%, NMI by 1%, and ARI by 9.4% compared with the baseline method.
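
For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how they are commonly computed, assuming that Acc, NMI, and ARI denote clustering accuracy under the best one-to-one cluster-to-label mapping, normalized mutual information, and the adjusted Rand index. The abstract does not define them, so the function names, the brute-force label matching, and the arithmetic-mean normalization of NMI are illustrative assumptions rather than the authors' evaluation code.

```python
from collections import Counter
from itertools import permutations
from math import comb, log


def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of cluster ids to labels.

    Brute force over mappings, so only practical for a handful of clusters
    (an assumption of this sketch; larger problems use Hungarian matching).
    """
    labels = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    # Pad with None so every cluster can be assigned (possibly to "no label").
    padded = labels + [None] * max(0, len(clusters) - len(labels))
    pair_counts = Counter(zip(y_pred, y_true))
    best = 0
    for perm in permutations(padded, len(clusters)):
        hits = sum(pair_counts[(c, l)] for c, l in zip(clusters, perm))
        best = max(best, hits)
    return best / len(y_true)


def normalized_mutual_info(y_true, y_pred):
    """NMI normalized by the arithmetic mean of the two entropies."""
    n = len(y_true)
    ct, cp = Counter(y_true), Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    h_true = -sum(c / n * log(c / n) for c in ct.values())
    h_pred = -sum(c / n * log(c / n) for c in cp.values())
    denom = (h_true + h_pred) / 2
    # Both partitions trivial (single cluster each): treat as perfect agreement.
    return mi / denom if denom > 0 else 1.0


def adjusted_rand_index(y_true, y_pred):
    """ARI from pair counts of the contingency table."""
    n = len(y_true)
    if n < 2:
        return 1.0
    sum_joint = sum(comb(c, 2) for c in Counter(zip(y_true, y_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(y_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(y_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0
    return (sum_joint - expected) / (max_index - expected)


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    gold = [0, 0, 0, 1, 1, 1]
    pred = [1, 1, 0, 0, 0, 0]
    print(clustering_accuracy(gold, pred))
    print(normalized_mutual_info(gold, pred))
    print(adjusted_rand_index(gold, pred))
```

All three scores are invariant to how cluster ids are named, which is why accuracy needs the explicit matching step before predictions can be compared with gold labels.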