To address the difficulty of precisely localizing targets in optical remote sensing images and the conflict between classification and localization features in the detection head, a remote sensing image target detection method based on Deformable Transformer and an adaptive detection head is proposed. First, we design a feature extraction network based on feature fusion and Deformable Transformer. The feature fusion module enriches the semantic information of shallow convolutional neural network features, and the Deformable Transformer establishes dependencies between distant features. Together, these capture global semantic information and improve feature representation capability. Second, an adaptive detection head based on a task learning module is constructed to enhance task awareness within the detection head. It automatically learns and adjusts the feature representations for the classification and localization tasks, thereby mitigating feature conflicts. Finally, the L1-IoU loss is proposed as the localization loss function to assess the localization error between candidate boxes and ground truth boxes more accurately during training, thereby improving the accuracy of object localization. The proposed method is evaluated on two high-resolution remote sensing datasets, NWPU VHR-10 and RSOD. The results show significant improvements over the compared methods.
remote sensing image; target detection; Deformable Transformer; task learning module; adaptive detection head; L1-IoU loss
Peng Haokang, Ge Yun, Yang Xiaoyu, Hu Changquan
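School of Software, Nanchang Hangkong University, Nanchang 330063, Jiangxi, China
Jiangxi Huihang Engineering Consulting Co., Ltd., Nanchang 330038, Jiangxi, China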
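The abstract names the L1-IoU loss but does not give its formula. The sketch below is therefore only an illustration of the general idea, under the assumption that such a loss adds an L1 term on box coordinates to an IoU term of the form 1 - IoU. The function names `box_iou` and `l1_iou_loss_sketch`, the `(x1, y1, x2, y2)` box format, and the `l1_weight` parameter are all hypothetical and are not taken from the paper.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def l1_iou_loss_sketch(pred, target, l1_weight=1.0):
    """ASSUMED form: (1 - IoU) + l1_weight * mean absolute coordinate error.

    This is an illustrative combination, not the paper's definition.
    """
    l1 = sum(abs(p - t) for p, t in zip(pred, target)) / 4.0
    return (1.0 - box_iou(pred, target)) + l1_weight * l1


if __name__ == "__main__":
    gt = (0.0, 0.0, 1.0, 1.0)
    near = (2.0, 2.0, 3.0, 3.0)   # no overlap with gt
    far = (5.0, 5.0, 6.0, 6.0)    # no overlap with gt, farther away
    print(l1_iou_loss_sketch(near, gt), l1_iou_loss_sketch(far, gt))
```

One motivation for a combination of this kind is that an IoU term alone is constant for all non-overlapping boxes, whereas the L1 term still separates a nearby candidate from a distant one, as the two printed values show.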