Video and Image Salient Object Detection Based on Multi-task Learning
Salient object detection (SOD) quickly identifies high-value salient objects in complex scenes, simulating human attention and laying the foundation for further visual understanding tasks. Currently, mainstream image-based salient object detection methods are usually trained on the DUTS-TR dataset, while video-based salient object detection (VSOD) methods are trained on the DAVIS, DAVSOD, and DUTS-TR datasets. Because the image and video salient object detection tasks have both shared and task-specific characteristics, independent models must be trained and deployed separately, which greatly increases computational cost and training time. Current research typically focuses on independent solutions for a single task, and a unified method for both image and video salient object detection remains under-explored. To address these issues, this paper proposes a multi-task learning-based method for image and video salient object detection. It aims to build a universal framework that adapts to both tasks simultaneously with a single training process and further bridges the performance gap between image and video salient object detection models. Qualitative and quantitative experimental results on 12 datasets show that the proposed method not only adapts to both tasks but also achieves better detection results than single-task models.