
Comprehensive Relation Modelling for Image Paragraph Generation

Image paragraph generation aims to generate a long description composed of multiple sentences, which differs from traditional image captioning, where the description contains only one sentence. Most previous methods are dedicated to extracting rich features from image regions and ignore modelling the visual relationships. In this paper, we propose a novel method to generate a paragraph by modelling visual relationships comprehensively. First, we parse an image into a scene graph, where each node represents a specific object and each edge denotes the relationship between two objects. Second, we enrich the object features by implicitly encoding visual relationships through a graph convolutional network (GCN). We further explore high-order relations between different relation features using another graph convolutional network. In addition, we obtain the linguistic features by projecting the predicted object labels and their relationships into a semantic embedding space. With these features, we present an attention-based topic generation network to select relevant features and produce a set of topic vectors, which are then utilized to generate multiple sentences. We evaluate the proposed method on the Stanford image-paragraph dataset, which is currently the only available dataset for image paragraph generation, and our method achieves competitive performance in comparison with other state-of-the-art (SOTA) methods.
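To make the idea of enriching object features over a scene graph concrete, here is a minimal sketch of a single graph-convolution step. It is only an illustration under stated assumptions: the toy graph, the feature values, the identity weight matrix, the row normalization with self-loops, and the function names (`normalized_adjacency`, `gcn_layer`) are all hypothetical, and edges are treated as undirected and unlabeled. The abstract does not specify the layer the authors actually use.

```python
# Illustrative sketch only: one graph-convolution step on a toy scene graph.
# All names, features, and weights are made up for this example; this is
# not the architecture described in the paper.

def normalized_adjacency(num_nodes, edges):
    """Build a row-normalized adjacency matrix with self-loops."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop keeps each node's own feature
    for src, dst in edges:
        adj[src][dst] = 1.0
        adj[dst][src] = 1.0  # assumption: relationships treated as undirected
    return [[value / sum(row) for value in row] for row in adj]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(A_norm @ X @ W)."""
    aggregated = matmul(adj_norm, features)  # average over neighbours and self
    projected = matmul(aggregated, weight)   # linear projection
    return [[max(0.0, value) for value in row] for row in projected]


# Toy scene graph: man -(riding)- horse -(on)- field
objects = ["man", "horse", "field"]
edges = [(0, 1), (1, 2)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one 2-d feature per object
weight = [[1.0, 0.0], [0.0, 1.0]]                # identity, for readability

adj_norm = normalized_adjacency(len(objects), edges)
enriched = gcn_layer(adj_norm, features, weight)

for name, vector in zip(objects, enriched):
    print(name, vector)
```

After this step, each object's feature is a mixture of its own feature and those of the objects it is related to, which is the sense in which relationships are encoded implicitly.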

Image paragraph generation; visual relationship; scene graph; graph convolutional network (GCN); long short-term memory

Xianglu Zhu, Zhang Zhang, Wei Wang, Zilei Wang


Automation Department, University of Science and Technology of China, Hefei 230027, China

Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

University of Chinese Academy of Sciences, Beijing 100864, China

National Natural Science Foundation of China (61721004, 61976214, 62076078, 62176246)

2024

Machine Intelligence Research
Institute of Automation, Chinese Academy of Sciences

CSTPCD, EI
Impact factor: 0.49
ISSN: 2731-538X
Year, volume (issue): 2024, 21(2)