CMVC+: A Multi-View Clustering Framework for Open Knowledge Base Canonicalization Via Contrastive Learning

Open information extraction (OIE) methods extract plenty of OIE triples $\langle$noun phrase, relation phrase, noun phrase$\rangle$ from unstructured text, which compose large open knowledge bases (OKBs). Noun phrases and relation phrases in such OKBs are not canonicalized, which leads to scattered and redundant facts. It is found that two views of knowledge (i.e., a fact view based on the fact triple and a context view based on the fact triple's source context) provide complementary information that is vital to the task of OKB canonicalization, which clusters synonymous noun phrases and relation phrases into the same group and assigns them unique identifiers. In order to leverage these two views of knowledge jointly, we propose CMVC+, a novel unsupervised framework for canonicalizing OKBs without the need for manually annotated labels. Specifically, we propose a multi-view CHF K-Means clustering algorithm to mutually reinforce the clustering of view-specific embeddings learned from each view by considering the clustering quality in a fine-grained manner. Furthermore, we propose a novel contrastive learning module to refine the learned view-specific embeddings and further enhance the canonicalization performance. We demonstrate the superiority of our framework through extensive experiments on multiple real-world OKB data sets against state-of-the-art methods.
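
To make the canonicalization task concrete, the following is a minimal sketch of what "cluster synonymous phrases and assign them unique identifiers" can look like when two views are combined. It is not the CMVC+ method: the multi-view CHF K-Means algorithm and the contrastive learning module are not reproduced here. Instead, a simple threshold-based grouping with union-find is swapped in, and the function name `canonicalize`, the `threshold` parameter, and the toy embedding values are all hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def canonicalize(phrases, fact_view, context_view, threshold=0.8):
    """Group phrases whose averaged two-view similarity reaches `threshold`.

    Illustrative stand-in only: a plain average of the two views and a
    fixed threshold, not the clustering procedure proposed in the paper.
    """
    parent = list(range(len(phrases)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(phrases)):
        for j in range(i + 1, len(phrases)):
            sim = 0.5 * (cosine(fact_view[i], fact_view[j])
                         + cosine(context_view[i], context_view[j]))
            if sim >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Merge the two groups, keeping the smaller index as root.
                    parent[max(ri, rj)] = min(ri, rj)

    # Assign compact identifiers in order of first appearance.
    ids = {}
    return {p: ids.setdefault(find(i), len(ids)) for i, p in enumerate(phrases)}


# Toy data: the embedding values below are made up for the example.
phrases = ["NYC", "New York City", "Paris"]
fact_view = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
context_view = [[1.0, 0.1], [1.0, 0.0], [0.1, 1.0]]

clusters = canonicalize(phrases, fact_view, context_view)
print(clusters)
```

In this toy input, "NYC" and "New York City" are close in both views and end up sharing one identifier, while "Paris" receives its own.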

Contrastive learning; Clustering algorithms; Knowledge based systems; Organizations; Electronic mail; Data mining; Ontologies; Information retrieval; Indexes; Training

Yang Yang, Wei Shen, Junfeng Shu, Yinan Liu, Edward Curry, Guoliang Li

Insight SFI Research Center for Data Analytics, University of Galway, Galway, Ireland

DISSec, the College of Computer Science, Nankai University, Tianjin, China

School of Computer Science and Engineering, Northeastern University, Shenyang, China

Department of Computer Science, Tsinghua University, Beijing, China

2025

IEEE Transactions on Knowledge and Data Engineering

Year, volume (issue): 2025, 37(5)