Detection using mask adaptive transformers in unmanned aerial vehicle imagery

Drone photography is an essential building block of intelligent transportation, enabling wide-ranging monitoring, precise positioning, and rapid transmission. However, the high computational cost of transformer-based object detection methods hinders real-time transmission of results in drone target detection applications. We therefore propose the mask adaptive transformer (MAT), tailored to such scenarios. Specifically, we introduce a structure that supports collaborative token sparsification in support windows, enhancing fault tolerance and reducing computational overhead. This structure comprises two modules: a binary mask strategy and adaptive window self-attention (A-WSA). The binary mask strategy focuses on significant objects in various complex scenes. The A-WSA mechanism applies self-attention to the selected objects, balancing performance against computational cost and isolating all contextual leakage. Extensive experiments on the challenging CarPK and VisDrone datasets demonstrate the effectiveness and superiority of the proposed method. Specifically, it achieves a mean average precision (mAP@0.5) improvement of 1.25% over the car detector based on you only look once version 5 (CD-YOLOv5) on the CarPK dataset and a 3.75% average precision (AP@0.5) improvement over the cascaded zoom-in detector (CZ Det) on the VisDrone dataset.
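
As an illustration only, the general idea of combining a binary token mask with window-restricted self-attention can be sketched as follows. This is not the authors' MAT or A-WSA implementation: the function name `masked_window_attention`, the fixed one-dimensional windows, and the identity query/key/value projections are all assumptions made for brevity, and the abstract does not state how the mask is predicted or how the windows adapt.

```python
import math

def masked_window_attention(tokens, keep, window):
    """Hypothetical sketch: self-attention inside fixed windows, kept tokens only.

    tokens: list of feature vectors (lists of floats), one per token
    keep:   list of booleans, True = token survives the binary mask
    window: number of consecutive tokens per window
    Masked tokens neither attend nor are attended to; their output stays zero.
    """
    n = len(tokens)
    dim = len(tokens[0])
    out = [[0.0] * dim for _ in range(n)]
    for start in range(0, n, window):
        # Indices of kept tokens in this window; attention never crosses windows.
        idx = [i for i in range(start, min(start + window, n)) if keep[i]]
        for i in idx:
            # Scaled dot-product scores against kept tokens of the same window
            # (identity projections are an assumption to keep the sketch short).
            scores = [
                sum(a * b for a, b in zip(tokens[i], tokens[j])) / math.sqrt(dim)
                for j in idx
            ]
            top = max(scores)  # subtract the max for a numerically stable softmax
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            for j, w in zip(idx, weights):
                for c in range(dim):
                    out[i][c] += (w / total) * tokens[j][c]
    return out
```

In this sketch the cost per window grows with the square of the number of kept tokens rather than the full window size, which is the sense in which sparsifying tokens reduces computation; a window whose tokens are all masked is skipped entirely.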

YE Huibiao, FAN Weiming, GUO Yuping, WANG Xuna, ZHOU Dalin

China Telecommunication Corporation Zhejiang Branch, Hangzhou 310020, China

Innovation Center for Smart Medical Technologies & Devices, Binjiang Institute of Zhejiang University, Hangzhou 100059, China

School of Computing, University of Portsmouth, Portsmouth, PO1 3HE, UK

2025

Optoelectronics Letters
Tianjin University of Technology

Impact factor: 0.641
ISSN: 1673-1905
Year, volume (issue): 2025, 21(2)