
CANet: Co-attention network for RGB-D semantic segmentation

Incorporating depth (D) information into RGB images has proven effective and robust for semantic segmentation. However, fusing the two modalities is non-trivial due to their inherent discrepancy in physical meaning: RGB encodes color and appearance, whereas D encodes depth (geometric) information. In this paper, we propose a co-attention network (CANet) to build sound interaction between RGB and depth features. The key part of CANet is the co-attention fusion part, which includes three modules. Specifically, the position and channel co-attention fusion modules adaptively fuse RGB and depth features in the spatial and channel dimensions. An additional fusion co-attention module further integrates the outputs of the position and channel co-attention fusion modules to obtain a more representative feature, which is used for the final semantic segmentation. Extensive experiments demonstrate the effectiveness of CANet in fusing RGB and depth features, achieving state-of-the-art performance on two challenging RGB-D semantic segmentation datasets, i.e., NYUDv2 and SUN-RGBD. (c) 2021 Elsevier Ltd. All rights reserved.
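The co-attention fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' released implementation; the module names, the channel-reduction factor, the residual-style fusion, and the simple summation standing in for the fusion co-attention module are all assumptions made here for illustration of spatial (position) and channel co-attention between RGB and depth features.

```python
# Illustrative sketch only: assumed module names and design details, not CANet's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionCoAttentionFusion(nn.Module):
    """Fuse RGB and depth features along the spatial dimension.

    Queries come from the RGB feature and keys/values from the depth feature,
    so each RGB spatial location attends over all depth locations.
    """

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.query_conv = nn.Conv2d(channels, channels // reduction, 1)
        self.key_conv = nn.Conv2d(channels, channels // reduction, 1)
        self.value_conv = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable fusion weight

    def forward(self, rgb, depth):
        b, c, h, w = rgb.shape
        q = self.query_conv(rgb).flatten(2).transpose(1, 2)   # (B, HW, C/r)
        k = self.key_conv(depth).flatten(2)                   # (B, C/r, HW)
        v = self.value_conv(depth).flatten(2)                 # (B, C, HW)
        attn = F.softmax(torch.bmm(q, k), dim=-1)             # (B, HW, HW) spatial co-attention
        out = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return rgb + self.gamma * out                          # residual fusion into the RGB stream


class ChannelCoAttentionFusion(nn.Module):
    """Fuse RGB and depth features along the channel dimension."""

    def __init__(self, channels):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, rgb, depth):
        b, c, h, w = rgb.shape
        q = rgb.flatten(2)                                     # (B, C, HW)
        k = depth.flatten(2).transpose(1, 2)                   # (B, HW, C)
        attn = F.softmax(torch.bmm(q, k), dim=-1)              # (B, C, C) channel co-attention
        out = torch.bmm(attn, depth.flatten(2)).view(b, c, h, w)
        return rgb + self.gamma * out


# Usage: fuse two 64-channel feature maps, then merge the two branch outputs.
if __name__ == "__main__":
    rgb_feat = torch.randn(2, 64, 30, 40)
    depth_feat = torch.randn(2, 64, 30, 40)
    pos_out = PositionCoAttentionFusion(64)(rgb_feat, depth_feat)
    chn_out = ChannelCoAttentionFusion(64)(rgb_feat, depth_feat)
    fused = pos_out + chn_out  # simple stand-in for the paper's fusion co-attention module
    print(fused.shape)         # torch.Size([2, 64, 30, 40])
```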

RGB-D; Multi-modal fusion; Co-attention; Semantic segmentation; Features

Zhou, Hao; Qi, Lu; Huang, Hai; Yang, Xu; Wan, Zhaoliang; Wen, Xianglong


Harbin Engn Univ

Chinese Univ Hong Kong

Chinese Acad Sci

Jihua Lab


2022

Pattern Recognition

EI, SCI
ISSN:0031-3203
Year, Volume: 2022, Vol. 124