Transactions of the Chinese Society of Agricultural Engineering, 2024, Vol. 40, Issue 6: 248-257. DOI: 10.11975/j.issn.1002-6819.202312068

Improved Oriented R-CNN-based model for oriented wheat ears detection and counting

Yu Junwei 1, Chen Weiwei 2, Guo Yuansen 2, Mu Yashuang 1, Fan Chao 1

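Author information

  • 1. Key Laboratory of Grain Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001; Henan Key Laboratory of Grain Photoelectric Detection and Control, Henan University of Technology, Zhengzhou 450001; College of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou 450001
  • 2. Key Laboratory of Grain Information Processing and Control, Ministry of Education, Henan University of Technology, Zhengzhou 450001; Henan Key Laboratory of Grain Photoelectric Detection and Control, Henan University of Technology, Zhengzhou 450001; College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001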

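Summary

To locate and count wheat ears accurately in complex field environments with interference and occlusion, this study proposes an improved Oriented R-CNN method for rotated-box wheat ear detection and counting. First, a spatial pyramid pooling cross stage partial networks (SPPCSPC) module is introduced into the backbone network to enlarge the model's receptive field and strengthen the network's perception ability. Second, a path aggregation network (PANet) and a hybrid attention mechanism, the efficient two convolutional block attention module (E2CBAM), are combined in the neck network to enrich the feature information contained in the feature maps. Finally, the soft non-maximum suppression (Soft-NMS) algorithm is used to optimize the screening of prediction boxes. The experimental results show that the improved model detects wheat ears well in complex environments. Compared with the original model, the mean average precision (mAP) increased by 2.02 percentage points; compared with the mainstream rotated object detection models Gliding vertex, R3det, Rotated Faster R-CNN, S2anet, and Rotated Retinanet, the mAP increased by 4.99, 2.49, 3.94, 2.25, and 4.12 percentage points, respectively. The method uses rotated boxes to locate wheat ears accurately, which greatly reduces the background area inside each box and provides an effective way to observe wheat ear growth and count ears in practice.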
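
The Soft-NMS screening step named above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Gaussian score-decay variant of Soft-NMS, and for brevity it uses axis-aligned boxes `(x1, y1, x2, y2)`, whereas the paper works with rotated boxes, which would need a rotated-polygon IoU in place of `iou`. The names `soft_nms`, `count_ears`, `sigma`, `score_thresh`, and `conf_thresh` are hypothetical.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes (x1, y1, x2, y2).

    Assumption: a rotated-box pipeline would replace this with a
    rotated-polygon IoU; the rest of the procedure stays the same.
    """
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of deleting boxes.

    Returns a list of (box_index, rescored_score) in the order boxes are kept.
    """
    candidates = [[i, s] for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the candidate with the highest current score.
        best = max(range(len(candidates)), key=lambda j: candidates[j][1])
        best_idx, best_score = candidates.pop(best)
        kept.append((best_idx, best_score))
        survivors = []
        for i, s in candidates:
            overlap = iou(boxes[best_idx], boxes[i])
            s *= math.exp(-(overlap ** 2) / sigma)  # Gaussian penalty
            if s >= score_thresh:  # drop boxes whose score has decayed away
                survivors.append([i, s])
        candidates = survivors
    return kept


def count_ears(kept, conf_thresh=0.5):
    """Hypothetical counting rule: count boxes whose rescored confidence passes a threshold."""
    return sum(1 for _, s in kept if s >= conf_thresh)
```

Unlike hard NMS, a heavily overlapped box is kept with a reduced score rather than discarded outright, which is the property that matters for densely packed, mutually occluding ears.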

Abstract

Accurate detection of wheat ears in field environments is of great practical value. Traditional object detection models with horizontal bounding boxes cannot accurately detect densely distributed wheat ears, particularly when ears and stalks occlude each other significantly. Because the prediction bounding boxes overlap, missed detections are frequent under varying illumination, dense distribution, and small target scales. Oriented bounding boxes that enclose less background noise are therefore needed for high detection performance. In this study, an improved Oriented Region-based Convolutional Neural Network (Oriented R-CNN) model was proposed to detect and count rotated wheat ears. Firstly, the spatial pyramid pooling cross-stage partial networks (SPPCSPC) module was added to the backbone network to generate the last layer of the output feature map, which enlarged the receptive field and enhanced the perceptual ability of the network. Secondly, a feature aggregation network and the efficient two convolutional block attention module (E2CBAM), a hybrid attention mechanism, were introduced into the neck network to enrich the feature information in the feature map. Finally, soft non-maximum suppression (Soft-NMS) was used to optimize the screening of the predicted bounding boxes. The E2CBAM module was obtained by replacing the channel attention module (CAM) of the convolutional block attention module (CBAM) with the E2CA module. The E2CA module was composed of two parallel ECA branches, one with maximum pooling and one with average pooling. The outputs of the two adaptive convolution kernels were summed, and the channels were then weighted so that important channel information was emphasized and key features were captured, improving the detection performance of the model. To verify the E2CBAM hybrid attention module, the path aggregation network (PANet) was first introduced into the neck network to enrich the semantic and target location information in the feature map, which improved the detection accuracy of the model by 0.19 percentage points. With the CBAM and E2CBAM hybrid attention modules, the detection accuracy was further improved by 0.16 and 0.31 percentage points, respectively, whereas the number of parameters increased by 0.24 and 0.20 M, respectively; the floating-point computation remained unchanged. Compared with the CBAM, the E2CBAM hybrid attention module improved the detection accuracy of the model by 0.15 percentage points while reducing the number of parameters by 0.04 M, with unchanged computation. The experimental results show that the improved Oriented R-CNN model accurately represented the head direction of wheat ears and achieved better detection performance. The mean average precision (mAP) was 2.02 percentage points higher than that of the original model. Moreover, the mAP was improved by 4.99, 2.49, 3.94, 2.25, and 4.12 percentage points compared with the mainstream rotated object detection models Gliding vertex, R3det, Rotated Faster R-CNN, S2anet, and Rotated Retinanet, respectively. Because the oriented boxes follow the head direction of the wheat ears, the background area inside the prediction bounding boxes was reduced and the detection results were more visually appealing. The findings can provide an effective way to observe the growth status of wheat ears and to count ears in practice.
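
The E2CA channel attention described above (two parallel ECA branches, one on max-pooled and one on average-pooled channel descriptors, summed and used to weight the channels) can be sketched in plain Python. This is a hedged reading of the abstract, not the authors' code: it assumes that the two branch outputs are summed before a sigmoid gate, and that the kernel size follows the ECA-Net rule with `gamma=2` and `b=1`. The convolution weights `w_avg` and `w_max` would be learned in practice and are passed in here as hypothetical parameters; the spatial attention half of E2CBAM is omitted.

```python
import math


def adaptive_kernel_size(channels, gamma=2, b=1):
    """Odd 1-D kernel size derived from the channel count (ECA-Net rule, assumed)."""
    k = int(abs((math.log2(channels) + b) / gamma))
    return k if k % 2 == 1 else k + 1


def conv1d_same(values, weights):
    """1-D convolution across the channel axis with zero padding (same length out)."""
    pad = len(weights) // 2
    padded = [0.0] * pad + list(values) + [0.0] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(weights))
        for i in range(len(values))
    ]


def e2ca(feature_map, w_avg, w_max):
    """Apply E2CA-style channel attention.

    feature_map: list of C channels, each a list of H rows of W floats.
    w_avg, w_max: 1-D kernels for the average- and max-pooling branches.
    Returns (reweighted feature map, per-channel gates).
    """
    # Global average and max pooling give one descriptor per channel.
    avg_desc = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    max_desc = [max(map(max, ch)) for ch in feature_map]
    # Each branch convolves its descriptor across neighbouring channels.
    branch_avg = conv1d_same(avg_desc, w_avg)
    branch_max = conv1d_same(max_desc, w_max)
    # Assumed fusion: sum the two branches, then squash to (0, 1).
    gates = [1.0 / (1.0 + math.exp(-(a + m))) for a, m in zip(branch_avg, branch_max)]
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(feature_map, gates)]
    return scaled, gates
```

Because each branch is only a short 1-D convolution over channel descriptors, it adds very few parameters, which is in line with the small parameter counts reported for E2CBAM relative to CBAM.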

Key words

image recognition/crops/attention mechanism/wheat ear/Oriented R-CNN

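Funding

National Natural Science Foundation of China, Young Scientists Fund (62006071)

Henan Province Science and Technology Research Program (2021) (212102210152)

Open Project of the Grain Information Processing Center, Henan University of Technology (KFJJ2023004)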

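Publication year

2024
Transactions of the Chinese Society of Agricultural Engineering
Chinese Society of Agricultural Engineering

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 2.529
ISSN: 1002-6819
Cited by: 1
References: 14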