Research on multimodal link prediction method based on Vision Transformer
To address problems such as the insufficient feature representation of existing link prediction methods, a multimodal link prediction method based on the Vision Transformer is proposed. First, pHash is employed at the filter gate to filter out irrelevant images. Second, image features are extracted with the Vision Transformer model and processed by MRP in the forget gate. Finally, the image features are fused with entity and relation features in the fusion gate. Experimental results show that the proposed model achieves good multimodal link prediction performance on the WN18-IMG and OpenBG-IMG datasets.
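To make the gate pipeline concrete, the following Python sketch illustrates one plausible reading of the three gates using PyTorch, torchvision, and the imagehash library: a filter gate that discards images whose pHash is far from a reference image, a ViT-B/16 encoder for image features, and a sigmoid-gated fusion of image features with entity/relation embeddings. The class names, pHash threshold, and embedding dimensions are illustrative assumptions rather than the paper's exact implementation, and the MRP computation in the forget gate is omitted because its definition is not given in the abstract.

import torch
import torch.nn as nn
from PIL import Image
import imagehash
from torchvision import transforms
from torchvision.models import vit_b_16


def filter_gate(candidate_paths, reference_path, max_hamming=16):
    """Filter gate: drop images whose pHash is too far from a reference image."""
    ref_hash = imagehash.phash(Image.open(reference_path))
    kept = []
    for path in candidate_paths:
        if imagehash.phash(Image.open(path)) - ref_hash <= max_hamming:
            kept.append(path)
    return kept


class VitImageEncoder(nn.Module):
    """Extract a 768-d image feature with a ViT-B/16 backbone (untrained here)."""
    def __init__(self):
        super().__init__()
        self.vit = vit_b_16(weights=None)      # load pretrained weights in practice
        self.vit.heads = nn.Identity()         # keep the [CLS] feature, drop the classifier
        self.preprocess = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
        ])

    def forward(self, pil_images):
        batch = torch.stack([self.preprocess(img.convert("RGB")) for img in pil_images])
        return self.vit(batch)                 # shape (N, 768)


class FusionGate(nn.Module):
    """Fusion gate: sigmoid-gated mixing of image features with entity/relation features."""
    def __init__(self, img_dim=768, ent_dim=200):
        super().__init__()
        self.proj = nn.Linear(img_dim, ent_dim)     # project image feature to entity space
        self.gate = nn.Linear(2 * ent_dim, ent_dim)

    def forward(self, img_feat, ent_feat):
        img_feat = self.proj(img_feat)
        g = torch.sigmoid(self.gate(torch.cat([img_feat, ent_feat], dim=-1)))
        return g * img_feat + (1 - g) * ent_feat    # fused entity representation


if __name__ == "__main__":
    encoder = VitImageEncoder()
    fusion = FusionGate()
    imgs = [Image.new("RGB", (224, 224))]           # placeholder image
    img_feat = encoder(imgs)
    ent_feat = torch.randn(1, 200)                  # placeholder entity embedding
    print(fusion(img_feat, ent_feat).shape)         # torch.Size([1, 200])

In an end-to-end model, the fused representation would feed a standard link prediction scorer (for example a translational or bilinear scoring function) and be trained jointly with the entity and relation embeddings.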