Unification Algorithm for Object Tracking and Segmentation Based on Transformer

Lin Chang¹, Guo Wei¹, Ren Zhecong¹, Jin Haibo¹

Author information

1. School of Software, Liaoning Technical University, Huludao 125105, Liaoning, China

Abstract

Discriminative object tracking algorithms based on correlation filtering have received widespread attention because of their good tracking performance. However, the bounding box estimation used by such methods typically yields only an axis-aligned box, which makes it difficult to obtain a finer object representation such as a rotated bounding box, an object contour, or a segmentation mask. To address this problem, a Transformer-based unified Single Object Tracking (SOT) and segmentation algorithm called T-TS is proposed. First, the attention mechanism of the Transformer is used to locate the object precisely. Second, the resulting location encoding guides the object segmentation network to separate the object from the background at the pixel level, yielding a fine object mask. Morphological processing is then applied to the mask to obtain the best-fitting rotated bounding box and the object contour. Experiments were conducted on the VOT2018 tracking dataset and the DAVIS segmentation dataset. The results show that T-TS is more robust than Siamese-network-based trackers and more accurate than correlation-filter-based trackers, reaching an Expected Average Overlap (EAO) of 0.463 on VOT2018. It also performs well on video segmentation, with Jaccard indices of 77.3 on DAVIS2016 and 65.3 on DAVIS2017, and runs at 34 frame/s. The experimental results demonstrate that the algorithm accurately obtains rotated bounding boxes, predicts the object state precisely, and effectively handles object rotation and deformation.
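The abstract does not state which morphological operations T-TS applies or how the rotated box is fitted, so the following is only a minimal sketch of the mask-to-box step under stated assumptions: a 3x3 binary opening to remove isolated noise pixels, followed by a minimum-area rectangle search over the directions of the convex hull edges. All function names (`opening`, `convex_hull`, `min_area_rect`) are hypothetical and are not taken from the paper.

```python
import math

def _morph(mask, combine):
    """Apply a 3x3 binary filter; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    return [[int(combine(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]

def opening(mask):
    """Assumed clean-up step: erosion (all neighbours set) then dilation (any neighbour set)."""
    return _morph(_morph(mask, all), any)

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]
    return half(pts) + half(reversed(pts))

def min_area_rect(points):
    """Return (area, angle_rad, width, height) of the smallest enclosing rectangle.

    The optimal rectangle has one side collinear with a hull edge,
    so it is enough to test each edge direction.
    """
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        us = [x * c + y * s for x, y in hull]    # coordinates along the edge
        vs = [-x * s + y * c for x, y in hull]   # coordinates across the edge
        w, h = max(us) - min(us), max(vs) - min(vs)
        if best is None or w * h < best[0]:
            best = (w * h, theta, w, h)
    return best

def mask_to_rotated_box(mask):
    """Clean the mask, then fit a rotated box to the remaining foreground pixels."""
    cleaned = opening(mask)
    pixels = [(x, y) for y, row in enumerate(cleaned) for x, v in enumerate(row) if v]
    return min_area_rect(pixels)
```

For a square rotated by 45 degrees, the axis-aligned box has twice the area of the rotated one, which illustrates why a rotated box describes such an object more tightly. In practice a library routine such as OpenCV's `cv2.minAreaRect` would normally replace the hand-written search.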

Key words

Single Object Tracking (SOT) / Transformer attention mechanism / object segmentation / morphological method / correlation filtering

Funding

National Natural Science Foundation of China (62173171)

Publication year

2024

Computer Engineering

East China Institute of Computing Technology; Shanghai Computer Society

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.581

ISSN: 1000-3428
