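Summary
Waterbird monitoring is the basis for understanding the dynamics of waterbird populations and distributions and for carrying out waterbird and wetland conservation, but it is time-consuming and labor-intensive. In recent years, advances in unmanned aerial vehicle (UAV) remote sensing have made it possible to obtain high-resolution remote sensing images of waterbirds with small UAVs; at the same time, convolutional neural networks offer a fast way to identify birds in UAV remote sensing images. We attempted to combine the two technologies, using the convolutional neural networks Mask R-CNN and YOLOv3 to identify large waterbirds in UAV remote sensing images of the West Dongting Lake National Nature Reserve in Hunan, with good results: for the Anas ducks captured, including the Green-winged Teal (Anas crecca) and the Falcated Duck (A. falcata), detection reached an average precision of 0.93, a precision of 90.83%, and a recall of 93%; for the Tundra Swan (Cygnus columbianus), detection reached an average precision of 0.91, a precision of 84.38%, and a recall of 84.00%. The results show that combining UAV remote sensing with convolutional neural networks makes it possible to count waterbirds quickly and has potential for application in population monitoring.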
Abstract
[Objectives] Waterbird monitoring plays a crucial role in understanding population dynamics and guiding conservation efforts, but it has traditionally been time-consuming and labor-intensive. In this study, we aim to integrate unmanned aerial vehicle (UAV) remote sensing with convolutional neural networks (CNNs) to estimate waterbird populations rapidly and accurately. [Methods] We used a DJI Mavic 2 Zoom UAV to capture high-resolution remote sensing images in the West Dongting Lake National Nature Reserve in Hunan. The UAV was flown at an altitude of 75 m with its camera pointing vertically downward. The images have a ground resolution of 1.2 cm/pixel; Table 1 lists the waterbirds captured in them. We selected 503 images to construct a dataset with two categories, Anas crecca/A. falcata and Cygnus columbianus, with 3 778 and 395 samples, respectively. The dataset comprises several training sets of different sizes (Table 2) and a validation set of 3 032 samples. For each training set, we trained Mask R-CNN and YOLOv3 models independently and evaluated their performance on the validation set. The evaluation metrics were average precision, recall, precision, and F1-score. [Results] When identifying A. crecca/A. falcata, the Mask R-CNN model achieved a recall of 93.00% and a precision of 90.83% (Table 4, Fig. 4), while the YOLOv3 model achieved a recall of 93.00% and a precision of 88.79% (Table 5, Fig. 5). Once the training set contained 178 ind. of A. crecca/A. falcata, adding more samples did not significantly improve the performance of either model. When identifying C. columbianus, the performance of both models improved as the training set grew. The Mask R-CNN model achieved a recall of 84.00% and a precision of 84.38% (Table 6, Fig. 6), while the YOLOv3 model achieved a recall of 90.00% and a precision of 81.69% (Table 7, Fig. 7). The Mask R-CNN model processed approximately 12 images/s, while the YOLOv3 model processed 20-30 images/s. [Conclusion] Our study proposes a potential solution for efficient and accurate waterbird population monitoring in natural habitats. Our models identified A. crecca/A. falcata with high accuracy, and the difference in recognition accuracy between Mask R-CNN and YOLOv3 was minimal. By integrating UAV remote sensing with CNNs, our approach shows that efficient and accurate waterbird identification models can potentially be trained with little annotated data, perhaps fewer than 250 ind. per waterbird species, as our results suggest.
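The evaluation metrics named above can be made concrete with a small sketch. The Python below computes precision and recall from detection counts and then derives the F1-score from the precision and recall values reported in this abstract. The function names, the `reported` dictionary, and the example counts are illustrative assumptions and not part of the study's code; the derived F1 values are plain arithmetic on the rounded figures above and may differ slightly from those in the paper's tables.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative detection counts.

    precision = TP / (TP + FP); recall = TP / (TP + FN)
    """
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("precision or recall is undefined for these counts")
    return tp / (tp + fp), tp / (tp + fn)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract, as fractions.
reported = {
    ("Mask R-CNN", "A. crecca/A. falcata"): (0.9083, 0.9300),
    ("YOLOv3", "A. crecca/A. falcata"): (0.8879, 0.9300),
    ("Mask R-CNN", "C. columbianus"): (0.8438, 0.8400),
    ("YOLOv3", "C. columbianus"): (0.8169, 0.9000),
}

# F1 derived from the reported values (an illustration, not a reported result).
derived_f1 = {key: f1_score(p, r) for key, (p, r) in reported.items()}

if __name__ == "__main__":
    # Hypothetical counts, used only to show the call shape.
    p, r = precision_recall(tp=90, fp=10, fn=30)
    print(f"example counts: precision = {p:.2f}, recall = {r:.2f}")

    for (model, taxon), f1 in derived_f1.items():
        print(f"{model:<10} {taxon:<22} derived F1 = {f1:.3f}")
```

Because F1 is a harmonic mean, it lies between precision and recall and sits closer to the lower of the two, which makes it a convenient single number for comparing the two models when one favors recall and the other precision.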