Position-sensitive Transformer model for object detection in aerial images
To address the challenge of detecting numerous small objects in UAV-captured aerial images, this paper introduces the Position-Sensitive Transformer Object Detection (PS-TOD) model. First, it presents a multi-scale feature fusion (MSFF) module incorporating a Positional Channel Embedded 3D Attention (PCE3DA) mechanism. PCE3DA exploits the interplay between spatial and channel information to generate 3D attention, strengthening feature representation in regions of interest; on this basis, a bottom-up, cross-layer MSFF strategy enriches the semantics of the fused features. Second, it proposes a novel Position-Sensitive Self-Attention (PSSA) mechanism, from which a position-sensitive Transformer encoder-decoder is constructed. This design heightens the model's sensitivity to object positions while capturing long-range dependencies in the image's global context. Comparative experiments on the VisDrone dataset show that PS-TOD attains an average precision (AP) of 28.8%, a 4.1-percentage-point improvement over the baseline model (DETR). The model also detects objects accurately in UAV aerial imagery with complex backgrounds, markedly improving detection accuracy for small targets.
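The abstract does not specify PCE3DA's internals, so the following is only an illustrative NumPy sketch of the general idea it names: deriving a channel descriptor and a spatial descriptor from a feature map and combining them, via broadcasting, into a single 3D (channel-and-spatial) attention map. All function and variable names here are hypothetical, not from the paper.

```python
import numpy as np

def attention_3d_sketch(x: np.ndarray) -> np.ndarray:
    """Illustrative 3D attention over a feature map x of shape (C, H, W).

    Hypothetical sketch only: the actual PCE3DA module uses learned
    positional channel embeddings, which are omitted here.
    """
    # Channel descriptor: global average pool over spatial dims -> (C, 1, 1)
    channel_desc = x.mean(axis=(1, 2), keepdims=True)
    # Spatial descriptor: average over channels -> (1, H, W)
    spatial_desc = x.mean(axis=0, keepdims=True)
    # Interplay: broadcast-sum yields full (C, H, W) attention logits,
    # so every position gets a weight that depends on both its channel
    # and its spatial location.
    logits = channel_desc + spatial_desc
    attn = 1.0 / (1.0 + np.exp(-logits))  # sigmoid gate in (0, 1)
    # Re-weight the input features element-wise
    return x * attn
```

Because the gate lies in (0, 1), the output is a soft re-weighting of the input rather than a hard mask, which is the usual design choice for attention modules inserted into feature-fusion paths.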