Rotating target detection network that combines key points and guide vectors
Objective Optical remote sensing images objectively and accurately record surface features and are widely used in the investigation, detection, analysis, and forecasting of resources, environment, disasters, regions, and cities. The primary task of optical remote sensing image object detection is to locate and classify objects in the input images, which has important research and application value in the field of Earth observation. Traditional remote sensing object detection algorithms require manually designed features. However, such features are limited: designing them consumes considerable human and material resources, yet they generalize poorly and their accuracy is difficult to improve. With the rapid development of deep learning in recent years, remote sensing object detection algorithms based on deep learning have achieved good results in optical image object detection. In contrast with objects in natural scenes, objects in optical remote sensing images are rigid, and most of them carry key information such as direction. The horizontal rectangular detection frames used in natural scenes cause problems in optical remote sensing object detection, such as excessive background area, overlapping adjacent detection frames, and loss of object motion information. To achieve more accurate object detection in optical remote sensing images, a rotating rectangular frame that fits the object contour is a more suitable choice. Detecting rotating remote sensing objects by locating key points is one of the current mainstream approaches. However, because remote sensing objects are densely arranged, key point-based object detection algorithms tend to suffer from overlapping adjacent key points and inaccurate key point detection. To solve these key point regression problems, this study proposes an improved rotating elliptical Gaussian kernel together with a vector-guided point pair matching module, which achieves
high-precision rotating object detection through the accurate prediction and matching of object center points and head vertices.
Method Unlike general feature extraction networks, an hourglass network fuses high-level features that are rich in semantic information with low-level features that are rich in spatial information. The resulting high-resolution feature map allows key points to be located precisely. The circular Gaussian kernel used to regress key points in natural scenes has two problems in remote sensing image object detection: the radius of the Gaussian kernel is uncertain, and the Gaussian kernels of densely arranged objects overlap. The rotating elliptical Gaussian kernel proposed in this study solves these problems. Specifically, the long and short axes of the elliptical Gaussian kernel are determined by the length and width of the object's rotating rectangular box, and the angle of the long axis of the ellipse equals the angle of the object. This rotated elliptical Gaussian kernel fits the shape of the object more closely and therefore regresses key points more accurately. This study models each object by two key points (i.e., the center point and the head vertex) and proposes a point pair matching module that uses guide vectors to pair the center point and the head vertex of the same object exactly.
Result Our model is evaluated on the HRSC2016 and UCAS-AOD public datasets. The HRSC2016 dataset has 436 training images, 181 validation images, and 444 test images, with image sizes ranging from 300 × 300 to 1 500 × 900. The UCAS-AOD dataset has an image size of 1 280 × 659, with 1 000 aircraft images and 510 vehicle images, including 7 482 aircraft objects and 7 114 vehicle objects. The annotations in the HRSC2016 dataset contain the head vertices. The annotations of the aircraft category in the UCAS-AOD dataset contain the specific orientation
angles of the objects, and thus the head vertices of aircraft can be calculated. During the experiments, images of various sizes were cropped and resized to 640 × 640 resolution before being input into the network model. Four NVIDIA RTX 2080 Ti graphics cards were used, with a batch size of eight images per card and an initial learning rate of 0.01. The optimizer was stochastic gradient descent with a momentum factor of 0.9. Before training, the dataset was augmented through flipping and rotation. Recall, precision, and average precision are used as the evaluation metrics of the model. The experimental results on the HRSC2016 dataset, which contains ship objects with large aspect ratios, show that the proposed algorithm achieves better detection results than other mainstream object detection algorithms, with an average precision of 90.78% (VOC 2007) and 97.85% (VOC 2012), and its precision-recall curves are also better than those of the other object detection algorithms.
Conclusion Our experimental results show that the rotating object detection model that combines key points and guide vectors detects rotating objects accurately and compares favorably with mainstream methods. The rotating elliptical Gaussian kernel achieves more accurate key point regression, and the point pair matching module based on guide vectors accurately matches center points and head vertices, improving the detection of rotating objects.
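
The two components described in the Method part can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the function names `rotated_gaussian` and `match_heads` are hypothetical, the ratio between box size and Gaussian standard deviation (`scale`) is an arbitrary choice, and the matching rule (pair each center with the detected head vertex nearest to the point its guide vector indicates) is one plausible reading of "vector-guided" matching that the abstract does not spell out.

```python
import math


def rotated_gaussian(dx, dy, length, width, angle, scale=6.0):
    """Value of a rotated elliptical Gaussian at offset (dx, dy) from a key point.

    length, width: sides of the object's rotated box (long and short axes).
    angle: orientation of the long axis in radians.
    scale: assumed ratio between box side and standard deviation (hypothetical).
    """
    c, s = math.cos(angle), math.sin(angle)
    # Express the offset in the object's frame: u along the long axis, v along the short axis.
    u = dx * c + dy * s
    v = -dx * s + dy * c
    sigma_u = length / scale
    sigma_v = width / scale
    return math.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2))


def match_heads(centers, guide_vectors, heads):
    """Pair each center with a detected head vertex (assumed matching rule).

    centers: list of (x, y) center points.
    guide_vectors: list of (dx, dy) predicted offsets from center to head vertex.
    heads: list of (x, y) detected head vertices.
    Returns, for each center, the index of the head vertex closest to center + guide vector.
    """
    matches = []
    for (cx, cy), (gx, gy) in zip(centers, guide_vectors):
        px, py = cx + gx, cy + gy  # where the guide vector says the head vertex should be
        best = min(
            range(len(heads)),
            key=lambda i: math.hypot(heads[i][0] - px, heads[i][1] - py),
        )
        matches.append(best)
    return matches
```

Because the kernel is elongated along the object's long axis, its response falls off faster across the object than along it, which is what limits overlap between the kernels of objects packed side by side. A production version would also need to handle conflicts in which two centers select the same head vertex, which this sketch ignores.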