3D Object Detection Method with Image Semantic Feature Guidance and Cross-Modal Fusion of Point Cloud
Due to the complexity of scenes and the influence of object scale changes, occlusions, etc., 3D object detection still faces many challenges. Cross-modal fusion of image and LiDAR point cloud features can effectively improve 3D object detection performance, but both the fusion quality and the detection accuracy still need to be improved. Therefore, this paper first designs an image semantic feature learning network that computes position and channel self-attention in two parallel branches, achieving global semantic enhancement and reducing target misclassification. Second, a local semantic fusion module guided by image semantic features is proposed, which uses element-level splicing to guide the fusion of point cloud data with the retrieved local image semantic features, better solving the semantic alignment problem in cross-modal information fusion. A multi-scale re-fusion network is then proposed, with an interaction module between the fused features and the LiDAR features designed to learn multi-scale connections within the fused features and to re-fuse features of different resolutions, thereby improving detection performance. Finally, four task losses are adopted to build an anchor-free 3D multi-object detector. Compared with other methods on the KITTI and nuScenes datasets, the detection accuracy for 3D objects is 87.15%, and the experimental results show that the proposed method outperforms the comparison methods and achieves better 3D detection performance.
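The abstract does not give implementation details for the dual-branch self-attention, so the following is only a minimal PyTorch sketch of one common form of parallel position- and channel-wise self-attention (in the style of DANet-like dual attention), assuming an input feature map of shape (B, C, H, W); all module and variable names are illustrative, not the paper's API.

```python
import torch
import torch.nn as nn

class PositionAttention(nn.Module):
    """Position branch: every pixel attends to all other pixels."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)        # (B, HW, C/8)
        k = self.key(x).flatten(2)                          # (B, C/8, HW)
        attn = torch.softmax(q @ k, dim=-1)                 # (B, HW, HW)
        v = self.value(x).flatten(2)                        # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)   # aggregate per pixel
        return self.gamma * out + x

class ChannelAttention(nn.Module):
    """Channel branch: every channel attends to all channels."""
    def __init__(self):
        super().__init__()
        self.beta = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        f = x.flatten(2)                                     # (B, C, HW)
        attn = torch.softmax(f @ f.transpose(1, 2), dim=-1)  # (B, C, C)
        out = (attn @ f).view(b, c, h, w)
        return self.beta * out + x

class DualBranchAttention(nn.Module):
    """Compute both branches in parallel and fuse by element-wise sum."""
    def __init__(self, channels):
        super().__init__()
        self.pam = PositionAttention(channels)
        self.cam = ChannelAttention()

    def forward(self, x):
        return self.pam(x) + self.cam(x)

# usage: globally enhance a 256-channel backbone feature map
feats = torch.randn(2, 256, 48, 160)
enhanced = DualBranchAttention(256)(feats)  # same shape as input
```

Summing the two branch outputs is one plausible parallel-fusion choice; concatenation followed by a 1x1 convolution would be an equally reasonable variant.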
3D object detection; cross-modal; semantic feature; point cloud; anchor-free
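The local semantic fusion step is described only as element-level splicing of point cloud data with retrieved local image semantic features. The sketch below shows one plausible reading under stated assumptions: LiDAR points are projected into the image with a known camera matrix, the enhanced semantic map is sampled bilinearly at the projected pixels, and the sampled vectors are concatenated channel-wise with the per-point features. The projection model, `project_to_image`, and all names here are hypothetical, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def project_to_image(points, proj_mat):
    """Project (N, 3) LiDAR points to pixel coords with a (3, 4) camera matrix.
    Assumes a KITTI-style pinhole projection P @ [x y z 1]^T."""
    n = points.shape[0]
    homo = torch.cat([points, points.new_ones(n, 1)], dim=1)   # (N, 4)
    cam = homo @ proj_mat.T                                    # (N, 3)
    return cam[:, :2] / cam[:, 2:3].clamp(min=1e-6)            # (N, 2) pixels

def fuse_point_image_features(points, point_feats, sem_map, proj_mat, img_hw):
    """Element-level splicing of sampled image semantics onto point features.

    points:      (N, 3) LiDAR xyz
    point_feats: (N, Cp) per-point features
    sem_map:     (1, Ci, H, W) semantically enhanced image feature map
    returns:     (N, Cp + Ci) fused per-point features
    """
    uv = project_to_image(points, proj_mat)                    # (N, 2)
    h, w = img_hw
    # normalize pixel coords to [-1, 1] as required by grid_sample
    grid = torch.stack([uv[:, 0] / (w - 1), uv[:, 1] / (h - 1)], dim=1)
    grid = (grid * 2 - 1).view(1, 1, -1, 2)                    # (1, 1, N, 2)
    sampled = F.grid_sample(sem_map, grid, align_corners=True) # (1, Ci, 1, N)
    img_feats = sampled.squeeze(0).squeeze(1).T                # (N, Ci)
    return torch.cat([point_feats, img_feats], dim=1)          # splice channels
```

Bilinear sampling keeps the fusion differentiable, so gradients from the detection losses can flow back into the image semantic branch; points projecting outside the image would need masking in a full implementation.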