RGB-D Pepper Image Saliency Detection Based on Cross-modal Feature Fusion
To address the inability of existing saliency detection models to make effective use of the features in color and depth images of pepper branches, an attention-based RGB-D saliency detection model for pepper branches is proposed. Color and depth image features are extracted separately by two single-stream convolutional networks. A cross-modal fusion module based on spatial and channel attention mechanisms is designed to fuse the multi-scale features of the color stream and the depth stream. A multi-scale supervision mechanism is developed to alleviate the inaccurate edge prediction caused by nearest-neighbor upsampling in the decoder. Experimental results show that the proposed method outperforms the compared salient object detection methods in average accuracy, average recall rate, comprehensive evaluation index, and average absolute error.
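The four evaluation measures named above can be illustrated with a minimal sketch. It assumes they correspond to the precision, recall, F-measure, and mean absolute error commonly used in salient object detection; the abstract does not give their definitions. The function names, the fixed binarization threshold of 0.5, and the weighting beta_sq = 0.3 are assumptions made for illustration only, not values taken from the paper.

```python
from typing import Sequence, Tuple

# A saliency map is represented here as rows of values in [0, 1].
SaliencyMap = Sequence[Sequence[float]]


def mean_absolute_error(pred: SaliencyMap, gt: SaliencyMap) -> float:
    """Average per-pixel absolute difference between prediction and ground truth."""
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def precision_recall_f(
    pred: SaliencyMap,
    gt: SaliencyMap,
    threshold: float = 0.5,  # assumed binarization threshold
    beta_sq: float = 0.3,    # assumed weighting, a common convention
) -> Tuple[float, float, float]:
    """Binarize the prediction, then compute precision, recall, and F-measure."""
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            predicted = p >= threshold
            actual = g >= 0.5
            if predicted and actual:
                tp += 1
            elif predicted and not actual:
                fp += 1
            elif actual and not predicted:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta_sq * precision + recall
    f_measure = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Toy 2x2 example: one salient pixel in the ground truth.
pred_map = [[0.9, 0.2], [0.6, 0.1]]
gt_map = [[1.0, 0.0], [0.0, 0.0]]
mae = mean_absolute_error(pred_map, gt_map)
precision, recall, f_measure = precision_recall_f(pred_map, gt_map)
```

In this toy case the prediction marks two pixels as salient, of which one is correct, so precision is 0.5 and recall is 1.0. For the first three measures higher is better, while for the error measure lower is better.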