A military image set captioning method based on image and text relevance and context guidance
Traditional image captioning methods cannot generate explanatory descriptions because they lack prior knowledge of the real world, and the descriptions they generate are often inaccurate in specialized fields. To address these problems, the military news image set captioning task is proposed, and a military news image set dataset is constructed. The task poses two key challenges: the description information must be derived from the whole image set and the corresponding news articles, and the semantics learned by the model are insufficient. A military news image set captioning method based on image and text relevance and context guidance (ITRCG) is therefore proposed. ITRCG realizes cross-modal information interaction, guides the model to learn more complete semantics, and uses label cleaning to assist named entity generation. Experiments on the constructed military news image set dataset show that ITRCG effectively improves the quality of the description text and achieves improvements on all evaluation metrics.
image captioning; image and text relevance attention; context guidance attention; image set; news text