Occluded Video Instance Segmentation Method Based on Feature Fusion of Tracking and Detection in Time Sequence
Video instance segmentation is a visual task that has emerged in recent years; it extends image instance segmentation with temporal characteristics, aiming to segment objects in each frame while simultaneously tracking them across frames. With the rapid development of the mobile Internet and artificial intelligence, large amounts of video data are being generated. However, due to shooting angles, rapid motion, and partial occlusion, objects in videos often appear fragmented or blurred, which poses significant challenges for accurately segmenting targets from video data and for subsequent processing and analysis. A survey of the literature and practical experiments show that existing video instance segmentation methods perform poorly under occlusion. To address this issue, this paper proposes an improved occluded video instance segmentation algorithm that improves segmentation performance by fusing Transformer temporal features with tracking and detection. To enhance the network's ability to learn spatial position information, the algorithm introduces the time dimension into the Transformer network and exploits the interdependence and mutual reinforcement among object detection, tracking, and segmentation in videos. A fusion tracking module and a detection temporal feature module are proposed that effectively aggregate object tracking offsets across frames, improving segmentation performance in occluded environments. The effectiveness of the proposed method is verified through experiments on the OVIS and YouTube-VIS datasets. Compared with current benchmark methods, the proposed method achieves better segmentation accuracy, further demonstrating its superiority.
Video instance segmentation; Object detection; Object tracking; Temporal features; Occluded instance