Discriminative unimodal feature selection and fusion for RGB-D salient object detection
Original text links
NSTL
Elsevier
Most existing RGB-D salient object detectors exploit the complementary information of RGB-D images to handle challenging scenarios, e.g., low contrast and cluttered backgrounds. However, these models generally neglect the fact that one of the input images may be of poor quality, which adversely affects the discriminative ability of cross-modal features when the two channels are fused directly. To address this issue, a novel end-to-end RGB-D salient object detection model is proposed in this paper. At the core of our model is a Semantic-Guided Modality-Weight Map Generation (SG-MWMG) sub-network, which produces modality-weight maps indicating which regions of each modality are high-quality, given the input RGB-D images and the guidance of their semantic information. Building on these maps, a Bi-directional Multi-scale Cross-modal Feature Fusion (Bi-MCFF) module is presented, in which the interactions of features across different modalities and scales are exploited through a novel bi-directional structure to better capture cross-scale and cross-modal complementary information. Experimental results on several benchmark datasets verify the effectiveness and superiority of the proposed method over state-of-the-art methods. (c) 2021 Published by Elsevier Ltd.
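As a rough illustration of the modality-weighting idea described in the abstract, the PyTorch sketch below gates each modality's features by a learned per-pixel weight map before fusing them, so that low-quality regions in either modality contribute less. Everything here is an assumption for illustration: the module name, shapes, and the plain convolutional weight head are hypothetical stand-ins, the paper's SG-MWMG sub-network additionally uses high-level semantic guidance, and its Bi-MCFF module fuses features bi-directionally across multiple scales.

```python
# Minimal sketch of modality-weight-gated RGB-D feature fusion (assumed
# PyTorch setting). Not the authors' implementation: a plain conv head
# stands in for the semantic-guided weight-map generation of SG-MWMG,
# and only a single scale is shown.
import torch
import torch.nn as nn


class ModalityWeightFusion(nn.Module):
    """Weights each modality's features by a learned per-pixel quality map
    before fusion, suppressing unreliable regions in either modality."""

    def __init__(self, channels: int):
        super().__init__()
        # Hypothetical weight-map head: predicts two competing per-pixel
        # weights (one per modality) from the concatenated features.
        self.weight_head = nn.Sequential(
            nn.Conv2d(2 * channels, 2, kernel_size=3, padding=1),
            nn.Softmax(dim=1),  # weights for RGB and depth sum to 1 per pixel
        )
        self.fuse = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, f_rgb: torch.Tensor, f_depth: torch.Tensor) -> torch.Tensor:
        # Predict per-pixel modality weights from both feature maps.
        w = self.weight_head(torch.cat([f_rgb, f_depth], dim=1))
        w_rgb, w_depth = w[:, 0:1], w[:, 1:2]
        # Each modality contributes in proportion to its estimated quality.
        return self.fuse(w_rgb * f_rgb + w_depth * f_depth)


# Usage on dummy single-scale features of shape (B, C, H, W):
if __name__ == "__main__":
    fusion = ModalityWeightFusion(channels=64)
    f_rgb = torch.randn(2, 64, 32, 32)
    f_depth = torch.randn(2, 64, 32, 32)
    out = fusion(f_rgb, f_depth)
    print(out.shape)  # torch.Size([2, 64, 32, 32])
```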