Research on Image Scene Graph Generation Combined with Object Attribute Recognition

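Chinese abstract: Scene graph generation plays an important role in deep visual scene understanding. Existing scene graph generation methods focus mainly on the locations and categories of objects and on the relationships between them, while overlooking the rich scene semantics carried by object attributes. To incorporate attribute semantics into the scene graph, this paper proposes an image scene graph generation method combined with object attribute recognition. First, to handle the multi-label classification problem in attribute recognition, an attribute classification loss based on a hybrid classifier is proposed: a binary classifier trained with binary cross-entropy is combined with a multi-class classifier trained with an improved group cross-entropy, improving both the precision of single-attribute classification and the recall of multi-attribute prediction. Second, the attribute recognition branch is integrated into the existing scene graph framework, and the extracted attribute information is fused with object features as additional contextual semantics to assist the recognition of relationships between objects. Finally, the model is compared with several baseline models on the VG150 dataset, and the results show that it achieves better results in both object attribute prediction and relationship recognition.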

English title: Scene Graph Generation Combined with Object Attribute Recognition

English abstract: Scene graph generation (SGG) plays an important role in deep visual understanding tasks. Existing SGG methods mainly focus on the locations and categories of objects, as well as the relationships between objects, while ignoring that object attributes also contain rich semantic information. This paper proposes an SGG model that integrates object attributes. First, to achieve multi-label object attribute recognition, we propose composite classifiers that combine multi-class classification trained with an improved group cross-entropy loss and binary classification trained with a binary cross-entropy loss, which improves the accuracy and recall of multiple attribute predictions. Then, the attribute recognition branch is fused into the SGG framework. As a kind of context information, the attribute features are fed into the relationship branch for better relationship classification. Finally, compared with several baseline models, our method achieves better performance in both object attribute prediction and relationship recognition on the VG150 dataset.
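The abstract does not give the form of the improved group cross-entropy loss, so the following is only a rough sketch of the general idea of pairing a binary (sigmoid) classifier with a multi-class (softmax) classifier on one multi-label attribute target. The softmax term here is a plain cross-entropy against the multi-hot target normalised to a distribution, used as a stand-in and not as the paper's loss. The function names, the `weight` parameter, and the use of two separate logit vectors are all assumptions made for illustration.

```python
import math


def bce_loss(logits, targets):
    """Mean binary cross-entropy over independent per-attribute sigmoid outputs."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form of -[y*log(s) + (1-y)*log(1-s)], s = sigmoid(z).
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def soft_target_ce_loss(logits, targets):
    """Softmax cross-entropy against the multi-hot target normalised to sum to 1.

    Stand-in for a group-style cross-entropy; NOT the paper's improved loss.
    """
    n_pos = sum(targets)
    if n_pos == 0:
        return 0.0  # no positive attributes: this term contributes nothing
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))  # log-sum-exp
    return -sum(y / n_pos * (z - log_z) for z, y in zip(logits, targets))


def hybrid_attribute_loss(bin_logits, multi_logits, targets, weight=0.5):
    """Weighted sum of the two terms; `weight` is a hypothetical mixing factor."""
    return (weight * bce_loss(bin_logits, targets)
            + (1.0 - weight) * soft_target_ce_loss(multi_logits, targets))


if __name__ == "__main__":
    # Three attributes, two of them present on the object.
    targets = [1, 0, 1]
    print(hybrid_attribute_loss([2.0, -1.0, 0.5], [1.5, -0.5, 1.0], targets))
```

In a sketch like this, the sigmoid term scores each attribute independently, while the softmax term makes the positive attributes compete with the negatives as a group. Whether this matches the balance between per-attribute precision and multi-attribute recall described in the abstract depends on details only the full paper provides.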

English keywords:

Scene graph generation; Object attribute recognition; Attribute fusion; Relationship classifications; Multi-label learning; Group cross entropy function

Authors:

ZHOU Hao (周浩), LUO Tingjin (罗廷金), CUI Guoheng (崔国恒)

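Affiliations:

Department of Operational Research and Planning, Naval University of Engineering, Wuhan 430033

College of Science, National University of Defense Technology, Changsha 410073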

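Keywords:

Scene graph generation; object attribute recognition; attribute fusion; relationship prediction; multi-label classification; group cross-entropy function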

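Funding:

Young Scientists Fund of the National Natural Science Foundation of China; National Natural Science Foundation of China; Natural Science Foundation of Hubei Province; Hunan Province Huxiang Young Talents Program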

Project numbers:

62302516; 62376281; 2022CFC049; 2021RC3070

Publication year:

2024

DOI:

10.11896/jsjkx.230900013

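Computer Science (计算机科学)

Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)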

CSTPCD; Peking University Core Journals

Impact factor: 0.944

ISSN: 1002-137X

Year, volume (issue): 2024, 51(11)