

Video and Image Salient Object Detection Based on Multi-task Learning
Salient object detection (SOD) simulates the human attention mechanism and quickly identifies high-value salient objects in complex scenes, laying the foundation for further visual understanding tasks. Currently, mainstream image salient object detection methods are usually trained on the DUTS-TR dataset, while video salient object detection (VSOD) methods are trained on the DAVIS, DAVSOD, and DUTS-TR datasets. Because the image and video tasks have both shared and task-specific characteristics, independent models must be deployed and trained separately, which greatly increases the cost in computational resources and training time. Current research mostly proposes independent solutions for a single task, and a unified method for both image and video salient object detection is lacking. To address these issues, this paper proposes a multi-task learning-based method for image and video salient object detection, aiming to build a universal framework that adapts to both tasks with a single training process and further bridges the performance gap between image and video salient object detection methods. Qualitative and quantitative experimental results on 12 datasets show that the proposed method not only adapts to both tasks simultaneously, but also achieves better detection results than single-task models.

Keywords: video-based salient object detection; image-based salient object detection; multi-task learning; performance gaps

Liu Zeyu (刘泽宇), Liu Jianwei (刘建伟)


College of Information Science and Engineering, China University of Petroleum (Beijing), Beijing 102249


2024

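Computer Science (计算机科学)
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)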


Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Year, volume (issue): 2024, 51(4)