3D information detection method for facility greenhouse tomato based on improved YOLOv5l
To address inaccurate fruit recognition and positioning caused by occlusion and complex lighting conditions in greenhouse environments, this study combines a deep learning object detection algorithm with the Intel RealSense D435i depth camera and proposes a method for obtaining the coordinates of tomatoes in three-dimensional space, enabling a greenhouse picking robot to perform tomato positioning and picking tasks. Based on the YOLOv5 network, we replace the CSP structure in the original network with Ghost convolution and adopt the multi-scale connection method of BiFPN to make full use of the tomato feature information extracted by different feature layers, improving the accuracy of bounding box regression. We compared different attention mechanisms and selected the CBAM attention mechanism for insertion into the model's feature extraction network. The model then obtains the center point of each tomato detected in the two-dimensional video stream from the RGB-D camera and calculates the tomato's spatial coordinates in the camera coordinate system. To minimize the impact of the complex greenhouse environment on target recognition and the final picking performance, we filter out all depth data beyond 1.5 meters so that the vision algorithm focuses only on recognizing and detecting targets within a 1.5-meter range. The mean average precision for red and green tomatoes was 82.4% and 82.2%, respectively. Finally, this article presents a method for combining a depth camera with an object detection network to obtain the depth of tomato targets, providing theoretical support for the vision system of tomato picking robots.
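As an illustration of the depth-localization step described above, the following Python sketch shows how the center pixel of a detected bounding box and an aligned depth frame can be deprojected into a 3D point in the camera coordinate system, discarding targets beyond the 1.5 m working range. It assumes the pyrealsense2 and numpy packages; the helper name detection_to_camera_xyz, the 640x480 stream configuration, and the placeholder detector call are hypothetical and not part of the original work.

import numpy as np
import pyrealsense2 as rs

MAX_RANGE_M = 1.5  # only keep detections within 1.5 m, as described in the abstract

def detection_to_camera_xyz(depth_frame, intrinsics, bbox):
    # bbox = (x1, y1, x2, y2) in pixel coordinates from the detector
    x1, y1, x2, y2 = bbox
    u = int((x1 + x2) / 2)                     # bounding-box center, x (pixels)
    v = int((y1 + y2) / 2)                     # bounding-box center, y (pixels)
    depth_m = depth_frame.get_distance(u, v)   # depth at the center pixel, in meters
    if depth_m <= 0 or depth_m > MAX_RANGE_M:
        return None                            # invalid depth or target beyond 1.5 m
    # Deproject pixel + depth into a 3D point (X, Y, Z) in camera coordinates
    return rs.rs2_deproject_pixel_to_point(intrinsics, [u, v], depth_m)

# Typical acquisition loop (sketch): align depth to color, run the detector on the
# color image, then localize each detection in 3D.
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
align = rs.align(rs.stream.color)

try:
    frames = align.process(pipeline.wait_for_frames())
    depth_frame = frames.get_depth_frame()
    color_frame = frames.get_color_frame()
    intrinsics = depth_frame.profile.as_video_stream_profile().intrinsics
    color_image = np.asanyarray(color_frame.get_data())
    # boxes = detector(color_image)  # placeholder for the improved YOLOv5l inference
    # for bbox in boxes:
    #     xyz = detection_to_camera_xyz(depth_frame, intrinsics, bbox)
finally:
    pipeline.stop()

Aligning the depth stream to the color stream before deprojection keeps the detected pixel coordinates and the sampled depth values in the same image frame, which is the usual choice when the detector runs on the RGB image.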