Semi-supervised image-text relation extraction method based on co-training
To avoid the high cost of obtaining large numbers of relation-labeled samples, a semi-supervised image-text relation extraction model based on co-training is proposed, which improves extraction accuracy by exploiting a large amount of unlabeled data. First, an image view and a text-semantic view are constructed from the image and text modalities, and a classifier is trained for each view on the labeled dataset. Then, the data from each view are cross-fed into the classifier of the other view, fully mining the information in both the labeled and unlabeled data to produce more accurate classification results. Finally, the classifiers of the two views jointly predict the unlabeled data and output consistent results. Experimental results on the public VRD and VG datasets show that, compared with six state-of-the-art relation detection methods, the proposed method improves accuracy by 2.24% for the image view and 1.41% for the text-semantic view on VRD, and by 3.59% on VG.
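For readers unfamiliar with co-training, the sketch below illustrates the general loop the abstract describes: a classifier is trained per view on the labeled set, each view pseudo-labels the unlabeled samples it is confident about so those labels are crossed into the other view's training set, and the final prediction combines both views for a consistent output. The synthetic features, LogisticRegression classifiers, 0.95 confidence threshold, round count, and probability-averaging rule are all illustrative assumptions, not the paper's actual architecture.

```python
# Minimal co-training sketch (illustrative; the paper's views are image
# features and text-semantic features, and its classifiers, threshold,
# and combination rule are assumptions here).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for the two views of each sample:
# view1 ~ "image view" features, view2 ~ "text-semantic view" features.
n_labeled, n_unlabeled, d = 100, 400, 20
y_true = rng.integers(0, 2, n_labeled + n_unlabeled)
view1 = rng.normal(size=(n_labeled + n_unlabeled, d)) + y_true[:, None]
view2 = rng.normal(size=(n_labeled + n_unlabeled, d)) + y_true[:, None]

X1_l, X2_l, y_l = view1[:n_labeled], view2[:n_labeled], y_true[:n_labeled]
X1_u, X2_u = view1[n_labeled:], view2[n_labeled:]
X1_test, X2_test = X1_u.copy(), X2_u.copy()  # kept for the final joint prediction

clf1 = LogisticRegression(max_iter=1000)  # image-view classifier
clf2 = LogisticRegression(max_iter=1000)  # text-semantic-view classifier
threshold = 0.95                          # pseudo-label confidence cutoff (assumed)

for _ in range(5):  # number of co-training rounds is an assumption
    clf1.fit(X1_l, y_l)
    clf2.fit(X2_l, y_l)
    if len(X1_u) == 0:
        break
    p1 = clf1.predict_proba(X1_u)
    p2 = clf2.predict_proba(X2_u)
    # Each view pseudo-labels the samples it is confident about; those
    # samples are added, with their pseudo-labels, to both training sets,
    # so each view's knowledge is "crossed" into the other view.
    conf1 = p1.max(axis=1) >= threshold
    conf2 = p2.max(axis=1) >= threshold
    pick = conf1 | conf2
    if not pick.any():
        break
    # When both views are confident, view1's label is taken for simplicity.
    pseudo = np.where(conf1[pick], p1[pick].argmax(axis=1), p2[pick].argmax(axis=1))
    X1_l = np.vstack([X1_l, X1_u[pick]])
    X2_l = np.vstack([X2_l, X2_u[pick]])
    y_l = np.concatenate([y_l, pseudo])
    X1_u, X2_u = X1_u[~pick], X2_u[~pick]

# Final step: both views predict the unlabeled data and a consistent label
# is output (here by averaging probabilities; the paper's rule may differ).
avg = (clf1.predict_proba(X1_test) + clf2.predict_proba(X2_test)) / 2
consistent_labels = avg.argmax(axis=1)
print(consistent_labels[:10])
```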