Semi-supervised Image-Text Relation Extraction Method Based on Co-training
王亚萍 1, 王智强 2, 王元龙 2, 梁吉业 2
Author affiliations
- 1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China
- 2. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, Shanxi, China
Abstract
To overcome the expensive cost of obtaining large numbers of relation-labeled samples, a semi-supervised image-text relation extraction model based on co-training is proposed, which uses a large amount of unlabeled data to improve the accuracy of image-text relation extraction. First, an image view and a text semantic view are constructed from the two modalities (image and text), and a classifier for each view is trained on the labeled dataset. Then, the data under each view are cross-fed into the classifier of the other view, fully exploiting the information in both the labeled and the unlabeled data to output more accurate classification results. Finally, the classifiers of the two views predict the unlabeled data and are required to output consistent results. Experimental results on the public datasets VRD and VG show that, compared with six recent relationship detection methods, the proposed method improves accuracy by 2.24% (image view) and 1.41% (text semantic view) on the VRD dataset, and by 3.59% on the VG dataset.
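The co-training loop sketched in the abstract (train one classifier per view on labeled data, then let each view pseudo-label its most confident unlabeled examples for the other) can be illustrated in generic form. This is a minimal sketch of classic Blum-Mitchell-style co-training, not the paper's actual image/text models: the nearest-centroid classifier, the margin-based confidence score, and all names here are hypothetical stand-ins.

```python
# Hedged sketch: generic two-view co-training with a toy nearest-centroid
# classifier per view. Not the paper's implementation.

def nearest_centroid_fit(X, y):
    """Toy single-view classifier: one centroid per class."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(col) for col in zip(*pts)]
            for c, pts in groups.items()}

def nearest_centroid_predict(model, x):
    """Return (label, confidence); confidence is a distance margin."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(x, m)), c)
                   for c, m in model.items())
    label = dists[0][1]
    conf = dists[-1][0] - dists[0][0]  # larger margin => more confident
    return label, conf

def co_train(view1, view2, labels, unlabeled_idx, rounds=5, per_round=2):
    """view1/view2: per-example feature lists; labels: {index: class}."""
    labeled = dict(labels)
    pool = set(unlabeled_idx)
    for _ in range(rounds):
        if not pool:
            break
        # Retrain one classifier per view on the current labeled set.
        m1 = nearest_centroid_fit([view1[i] for i in labeled], list(labeled.values()))
        m2 = nearest_centroid_fit([view2[i] for i in labeled], list(labeled.values()))
        # Each view pseudo-labels its most confident unlabeled examples;
        # the shared labeled set is how each classifier teaches the other
        # (the "cross input" step described in the abstract).
        for model, view in ((m1, view1), (m2, view2)):
            scored = sorted(pool, key=lambda i: -nearest_centroid_predict(model, view[i])[1])
            for i in scored[:per_round]:
                labeled[i] = nearest_centroid_predict(model, view[i])[0]
                pool.discard(i)
    return labeled
```

Usage on toy data with two well-separated relation classes (feature values invented for illustration):

```python
view1 = [[0.0, 0.0], [10.0, 10.0], [0.5, 0.5], [9.5, 9.5]]   # "image" features
view2 = [[0.0, 1.0], [9.0, 10.0], [1.0, 0.0], [10.0, 9.0]]   # "text" features
result = co_train(view1, view2, {0: "on", 1: "near"}, [2, 3])
```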
Keywords
co-training / semi-supervised / multimodal / relation extraction / visual relationship detection
Funding
National Natural Science Foundation of China (61876103)
National Natural Science Foundation of China (61906111)
Publication year
2024