Exploring millimeter-wave radar data as a complement to RGB images for improving 3D object detection has become an emerging trend in autonomous driving systems. However, existing radar-camera fusion methods depend heavily on prior camera detection results, rendering the overall performance unsatisfactory. In this paper, we propose a bidirectional fusion scheme in the bird's-eye view (BEV-Radar) that is independent of prior camera detection results. Leveraging features from both modalities, our method adopts a bidirectional attention-based fusion strategy. Specifically, following BEV-based 3D detection methods, our method employs a bidirectional transformer to embed information from both modalities and enforces the local spatial relationship through subsequent convolution blocks. After embedding, the BEV features are decoded by the 3D object prediction head. We evaluate our method on the nuScenes dataset, achieving 48.2 mAP and 57.6 NDS. The results show considerable improvements over the camera-only baseline, especially in terms of velocity prediction. The code is available at https://github.com/Etah0409/BEV-Radar.
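To make the fusion idea concrete, the following is a minimal PyTorch sketch of bidirectional attention-based fusion of camera and radar BEV features, not the authors' released implementation (see the linked repository for that). The module name `BidirectionalBEVFusion`, the feature dimension, the single-layer symmetric cross-attention layout, and the trailing convolution block are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class BidirectionalBEVFusion(nn.Module):
    """Illustrative sketch: each modality's BEV tokens cross-attend to the
    other modality, then a convolution block enforces local spatial
    relationships on the fused BEV map. Hyperparameters are assumptions."""

    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        # Camera tokens query radar tokens, and vice versa (bidirectional).
        self.cam_from_radar = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.radar_from_cam = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_cam = nn.LayerNorm(dim)
        self.norm_radar = nn.LayerNorm(dim)
        # Convolution block restoring local spatial structure after attention.
        self.conv = nn.Sequential(
            nn.Conv2d(2 * dim, dim, kernel_size=3, padding=1),
            nn.BatchNorm2d(dim),
            nn.ReLU(inplace=True),
        )

    def forward(self, cam_bev, radar_bev):
        # cam_bev, radar_bev: (B, C, H, W) BEV feature maps in a shared frame.
        b, c, h, w = cam_bev.shape
        cam = cam_bev.flatten(2).transpose(1, 2)      # (B, H*W, C)
        radar = radar_bev.flatten(2).transpose(1, 2)  # (B, H*W, C)
        # Symmetric bidirectional cross-attention with residual connections.
        cam_out = self.norm_cam(cam + self.cam_from_radar(cam, radar, radar)[0])
        radar_out = self.norm_radar(radar + self.radar_from_cam(radar, cam, cam)[0])
        # Reshape tokens back to BEV maps and fuse with the conv block.
        cam_out = cam_out.transpose(1, 2).reshape(b, c, h, w)
        radar_out = radar_out.transpose(1, 2).reshape(b, c, h, w)
        return self.conv(torch.cat([cam_out, radar_out], dim=1))  # (B, C, H, W)
```

Under these assumptions, feeding two aligned (B, 256, 128, 128) BEV maps returns a fused map of the same shape, which would then be passed to the 3D object prediction head described above.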
Keywords
3D object detection / sensor fusion / millimeter wave radar