Optoelectronics Letters, 2025, Vol. 21, Issue 2: 113-120. DOI: 10.1007/s11801-025-4185-7

Detection using mask adaptive transformers in unmanned aerial vehicle imagery

YE Huibiao 1, FAN Weiming 2, GUO Yuping 2, WANG Xuna 2, ZHOU Dalin 3

Author information

  • 1. China Telecommunication Corporation Zhejiang Branch, Hangzhou 310020, China
  • 2. Innovation Center for Smart Medical Technologies & Devices, Binjiang Institute of Zhejiang University, Hangzhou 100059, China
  • 3. School of Computing, University of Portsmouth, Portsmouth, PO1 3HE, UK

Abstract

Drone photography is an essential building block of intelligent transportation, enabling wide-ranging monitoring, precise positioning, and rapid transmission. However, the high computational cost of transformer-based methods in object detection tasks hinders real-time result transmission in drone target detection applications. Therefore, we propose a mask adaptive transformer (MAT) tailored for such scenarios. Specifically, we introduce a structure that supports collaborative token sparsification in support windows, enhancing fault tolerance and reducing computational overhead. This structure comprises two modules: a binary mask strategy and adaptive window self-attention (A-WSA). The binary mask strategy focuses on significant objects in various complex scenes. The A-WSA mechanism applies self-attention to balance performance against computational cost, selecting objects and isolating all contextual leakage. Extensive experiments on the challenging CarPK and VisDrone datasets demonstrate the effectiveness and superiority of the proposed method. Specifically, it achieves a mean average precision (mAP@0.5) improvement of 1.25% over the car detector based on you only look once version 5 (CD-YOLOv5) on the CarPK dataset and a 3.75% average precision (AP@0.5) improvement over the cascaded zoom-in detector (CZ Det) on the VisDrone dataset.
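
The general idea of token sparsification inside attention windows can be illustrated with a minimal sketch. This is not the authors' MAT implementation: the abstract does not give the architecture, so everything below is an assumption made for illustration. Tokens form a 1-D sequence, importance scores are supplied by the caller (a real model would presumably learn them), windows are fixed-size rather than adaptive as in A-WSA, and attention uses the raw tokens as queries, keys, and values with no learned projections. The function names `binary_mask` and `masked_window_attention` and the parameter `keep_ratio` are hypothetical.

```python
import math


def binary_mask(scores, keep_ratio):
    """Return a 0/1 mask that keeps the highest-scoring tokens of one window."""
    k = max(1, math.ceil(keep_ratio * len(scores)))
    # Indices of the k largest scores; ties are broken by position.
    top = sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]
    keep = set(top)
    return [1 if i in keep else 0 for i in range(len(scores))]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def masked_window_attention(tokens, scores, window_size, keep_ratio):
    """Self-attention restricted to the kept tokens of each fixed window.

    Illustrative assumptions: fixed windows, caller-supplied scores,
    and no learned query/key/value projections.
    """
    dim = len(tokens[0])
    out = [list(t) for t in tokens]  # masked-out tokens pass through unchanged
    mask = []
    for start in range(0, len(tokens), window_size):
        end = min(start + window_size, len(tokens))
        win_mask = binary_mask(scores[start:end], keep_ratio)
        mask.extend(win_mask)
        kept = [start + i for i, m in enumerate(win_mask) if m]
        for q in kept:
            # Scaled dot-product logits against kept tokens of the same window only.
            logits = [
                sum(a * b for a, b in zip(tokens[q], tokens[k])) / math.sqrt(dim)
                for k in kept
            ]
            weights = softmax(logits)
            out[q] = [
                sum(w * tokens[k][d] for w, k in zip(weights, kept))
                for d in range(dim)
            ]
    return out, mask
```

With n tokens per window and a keep ratio r, each window computes roughly (rn)^2 attention scores instead of n^2, which is where the saving in this sketch comes from; tokens in different windows never attend to each other.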

Publication year: 2025
Optoelectronics Letters
Tianjin University of Technology

Impact factor: 0.641
ISSN: 1673-1905