Unification Algorithm for Object Tracking and Segmentation Based on Transformer
Discriminative target tracking algorithms based on correlation filtering have received widespread attention because of their strong tracking performance. However, bounding box estimation in this type of method typically yields only an axis-aligned box, and it is difficult to obtain a more detailed object representation, such as a rotated bounding box, an object contour, or a segmentation mask. Therefore, a Transformer-based unified Single Object Tracking (SOT) and segmentation algorithm called T-TS is proposed. First, the attention mechanism of the Transformer is used to locate the object precisely. Second, the encoded location is used to guide the target segmentation network, which separates the target from the background at the pixel level to obtain a fine object mask. Morphological methods are then applied to the mask to obtain the best-fitting rotated bounding box and the object contour. Extensive experiments were conducted on the VOT2018 tracking dataset and the DAVIS segmentation datasets. The proposed T-TS algorithm is more robust than Siamese-based trackers and more accurate than filter-based trackers. It achieves an Expected Average Overlap (EAO) of 0.463 and a high Jaccard index on the segmentation task (77.3 on DAVIS2016 and 65.3 on DAVIS2017), while running at 34 frame/s. The experimental results demonstrate that the proposed method accurately obtains the rotated bounding box of the target and effectively handles target rotation and deformation.
Single Object Tracking (SOT); Transformer attention mechanism; object segmentation; morphological method; correlation filtering
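
The segmentation results above are reported as Jaccard indices, that is, the intersection over union between a predicted mask and a ground-truth mask. The following is a minimal sketch of that metric in plain Python, not the evaluation code used for T-TS. The function name `jaccard_index`, the nested-list mask representation, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    same_rows = len(pred) == len(truth)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1         # pixel is foreground in at least one mask

    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if union == 0 else intersection / union


# Hypothetical 3x4 masks: the prediction misses one column of the target.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
truth = [[0, 1, 1, 1],
         [0, 1, 1, 1],
         [0, 0, 0, 0]]

score = jaccard_index(pred, truth)  # 4 shared pixels over 6 in the union
print(f"Jaccard index: {score:.3f}")
```

Scores such as 77.3 and 65.3 correspond to this ratio expressed as a percentage and averaged over a benchmark; the sketch covers only a single pair of masks.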