
An Enhanced Algorithm for Object Detection Based on Generative Adversarial Structure

The performance of an object detection network is often limited by the depth of its feature extraction network: adding a large number of parameters yields only a small gain in detection performance and requires many additional design details, all of which make training harder. This paper proposes an object detection method based on generative adversarial training, whose training objective is to reduce the EM distance (Wasserstein distance) between feature distributions. Specifically, we decouple the detection network from the overall architecture and apply in-depth adversarial training to the feature extraction network. Experiments show that the proposed architecture further improves the network's feature extraction capability without increasing the number of parameters. On the MS COCO 2017 dataset, it raises the performance of the ResNet101-based CenterNet from 36.1% mAP to 37.2% mAP, and the Hourglass-104-based mAP from 42.2% to 43.0%.
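For reference, the EM (Wasserstein-1) distance minimized here can be written in the standard primal and Kantorovich-Rubinstein dual forms used in WGAN-style training; the notation P_r for the enhanced-feature distribution and P_g for the original-feature distribution is ours, not the paper's:

\[
W(P_r, P_g) \;=\; \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x,y)\sim\gamma}\big[\lVert x - y \rVert\big]
\;=\; \sup_{\lVert f \rVert_L \le 1} \Big( \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_g}[f(x)] \Big)
\]

The dual form is what a 1-Lipschitz critic network estimates in practice.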
An Enhanced Algorithm for Object Detection Based on Generative Adversarial Structure
The performance of object detection networks is often limited by the depth of the feature extraction network. Increasing the number of network parameters yields only limited improvements in detection performance, and the additional careful design of network details that it requires can significantly increase training difficulty. In this paper, generative adversarial networks (GANs) are used to further enhance the feature extraction capability of the network. In the usual setting, a GAN can approximate the target distribution of a given task: it seeks a "near-correct answer" by iteratively optimizing a non-convex game with continuous, high-dimensional parameters, in which the generator and discriminator strive to reach a Nash equilibrium, yielding an effective solution for the task at hand; gradient descent is commonly employed to optimize the losses on both the generator and discriminator sides. This paper experimentally demonstrates that feature-highlighted images have feature distributions similar to those of unprocessed images, and that these distributions change continuously and learnably as the degree of feature highlighting varies. We therefore introduce a new object detection method based on generative adversarial training, which exploits the ability of GANs to fit feature distributions in order to strengthen the object detection network. Our approach minimizes the EM distance (Wasserstein distance) between feature distributions, using features extracted from the processed (feature-highlighted) images as the benchmark that defines the target distribution of the GAN. Features obtained from the original images are treated as fake samples during adversarial training, and the adversarial process continuously improves the feature extraction network so that it produces more realistic features, thereby improving detection. At the same time, because the image features are enhanced, adversarial training targets a feature distribution that exceeds that of the original dataset, which makes additional gains easier to obtain than with ordinary training methods. A new loss term is also added during adversarial training to ensure steady improvement of the detector by continuously monitoring the network's detection performance. A comparative experiment against the original CenterNet network on MS COCO (Microsoft Common Objects in COntext) 2017 shows that the generative adversarial training method significantly improves average precision for most of the examined backbone networks while adding no inference-time complexity. For the four backbone networks used in the experiments, the improvement in AP (Average Precision) ranged from 0.3 to 0.9, achieved with minimal training effort, and none of the four backbones gained any parameters at inference time. The experimental results indicate that the proposed architecture effectively enhances the network's feature extraction capability without compromising inference speed.
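As a concrete illustration of the kind of training loop the abstract describes, the following is a minimal, hypothetical PyTorch sketch: a critic estimates the Wasserstein distance between backbone features of feature-highlighted images (treated as real) and of the original images (treated as fake), and the backbone is updated against the critic alongside a detection-style loss. All names here (TinyBackbone, Critic, enhance, the dummy detection term) are assumptions for illustration, not the authors' implementation.

# Hypothetical sketch of WGAN-style feature alignment for a detector backbone.
# TinyBackbone, Critic, enhance and the dummy detection loss are illustrative only.
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Stand-in for the detector's feature extractor (e.g., ResNet or Hourglass)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
    def forward(self, x):
        return self.net(x)                       # (B, 64) pooled feature vector

class Critic(nn.Module):
    """WGAN critic scoring how 'enhanced-like' a feature vector looks."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))
    def forward(self, f):
        return self.net(f)

backbone, critic = TinyBackbone(), Critic()
opt_b = torch.optim.Adam(backbone.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-4)

def enhance(images):
    """Placeholder for the feature-highlighting preprocessing described in the paper."""
    return torch.clamp(images * 1.2, 0.0, 1.0)

for step in range(100):
    images = torch.rand(8, 3, 128, 128)          # dummy batch standing in for COCO images

    # Critic update: enhanced-image features are "real", original-image features are "fake".
    with torch.no_grad():
        f_real = backbone(enhance(images))
        f_fake = backbone(images)
    loss_c = critic(f_fake).mean() - critic(f_real).mean()   # negative Wasserstein estimate
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()
    for p in critic.parameters():                # weight clipping keeps the critic ~1-Lipschitz
        p.data.clamp_(-0.01, 0.01)

    # Backbone update: pull original-image features toward the enhanced distribution,
    # while an (illustrative) detection-style loss keeps detector performance from drifting.
    f_feat = backbone(images)
    det_loss = f_feat.pow(2).mean()              # stand-in for the real detection loss
    loss_b = -critic(f_feat).mean() + det_loss
    opt_b.zero_grad(); loss_b.backward(); opt_b.step()

    if step % 20 == 0:
        print(f"step {step}: critic {loss_c.item():.3f}, backbone {loss_b.item():.3f}")

Weight clipping is used here only as the simplest way to keep the critic approximately 1-Lipschitz; a gradient penalty would serve the same purpose, and at inference time only the backbone is kept, so no parameters are added to the detector.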

computer vision; object detection; generative adversarial training; feature extraction; classification prediction

张昀、黄橙、施健、张玉瑶、黄经纬、于舒娟、黄丽亚


College of Electronic and Optical Engineering & College of Flexible Electronics (Future Technology), Nanjing University of Posts and Telecommunications, Nanjing 210000


National Natural Science Foundation of China

61977039

2024

Chinese Journal of Computers
China Computer Federation; Institute of Computing Technology, Chinese Academy of Sciences


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 3.18
ISSN: 0254-4164
Year, Volume (Issue): 2024, 47(3)