Remote sensing image detection of YOLOv5 with local self-attention

Sun Guangling¹, Zhu Yumin², Li Yanqiu²

Author information

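1. School of Electronic and Information Engineering, Anhui Jianzhu University, Hefei 230601, Anhui; Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei University of Technology, Hefei 230009, Anhui
2. School of Electronic and Information Engineering, Anhui Jianzhu University, Hefei 230601, Anhui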

Abstract

To address the complex backgrounds, large numbers of small objects, and difficult feature extraction that are common in remote sensing images, this paper proposes an improved Transformer-style module, the Contextual and Simple Squeeze-and-excitation Transformer (CoST). The design exploits the semantic information among adjacent keys to guide the dynamic learning of the attention matrix. It also introduces a simplified channel attention, Simple Squeeze-and-excitation (SSE), which strengthens the feature-map channels relevant to the current task and suppresses those of little relevance, thereby enhancing visual representation. Experiments were carried out on the YOLOv5s object detection framework and evaluated on the PASCAL VOC and DIOR datasets. The improved YOLOv5s raises average precision by 2.4% on VOC and by 1.4% on the remote sensing dataset DIOR, which verifies the effectiveness of the CoST module.
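The abstract describes SSE only at a high level: it boosts task-relevant channels and suppresses the rest. As a rough illustration, the sketch below assumes the commonly used simplified form of channel attention: global average pooling per channel, a single linear layer, and a sigmoid gate that rescales each channel. This form is an assumption, not the authors' confirmed design, and the names `simple_squeeze_excitation`, `weight`, and `bias` are hypothetical. It is written in plain Python on nested lists so that it is self-contained; it does not model the contextual key attention part of CoST.

```python
import math


def simple_squeeze_excitation(feature_map, weight, bias):
    """Channel attention sketch (assumed form, not the paper's exact module).

    feature_map: list of C channels, each an H x W nested list of floats.
    weight:      C x C nested list for the single linear layer (hypothetical).
    bias:        list of C floats (hypothetical).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]

    # Excitation: one linear layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-(sum(w * p for w, p in zip(row, pooled)) + b)))
        for row, b in zip(weight, bias)
    ]

    # Rescale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: two 2x2 channels with means 2.5 and -1.0.
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, chosen for illustration
bias = [0.0, 0.0]

scaled, gates = simple_squeeze_excitation(fmap, weight, bias)
print(gates)
```

With identity weights, the channel with the larger mean activation receives the larger gate, so it is attenuated less than the other channel. In a trained network the weights would be learned, and the module would normally be written as a layer in the detector's own framework (PyTorch for YOLOv5s) rather than on Python lists.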

Keywords

deep learning/object detection/YOLOv5s/self-attention mechanism

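Funding

National Natural Science Foundation of China (62001004)

Key Natural Science Research Project of Anhui Higher Education Institutions (2023AH050164)

Anhui University Collaborative Innovation Project (GXXT-2021-024)

Anhui Housing and Urban-Rural Development Science and Technology Plan Project (2023-YF058)

Anhui Housing and Urban-Rural Development Science and Technology Plan Project (2023-YF113)

Open Fund of Hefei University of Technology (PA2021AKSK0107)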

Publication year

2024

Journal of Fuyang Normal University (Natural Science)

Fuyang Normal College

Impact factor: 0.263

ISSN: 1004-4329
