Task-Feature-Decoupled Model for Joint Visual Perception in Autonomous Driving
This paper proposes a task-feature-decoupled joint visual perception model for autonomous driving (TFDJP) to address internal competition within the detection task and coarse edge segmentation. These problems arise because existing joint perception algorithms with coupled decoding networks ignore the distinct feature requirements of each subtask. For object detection decoding, we design a hierarchical semantic enhancement module and a spatial information refinement module. These modules aggregate features at different semantic levels and separate and encode the gradient flows of the classification and localization subtasks, reducing conflicts between them; an IoU-aware prediction branch is added to the localization head to strengthen the correlation between the two subtasks and improve localization accuracy. For drivable-area segmentation and lane detection decoding, we construct a dual-resolution decoupled branch network that models the low-frequency body pixels and high-frequency boundary pixels of the target separately. A boundary loss guides training from local to global, progressively optimizing the target's body and edges and thereby improving overall performance. Experiments on the BDD100K dataset show that, compared with YOLOP, the proposed TFDJP model improves mean object detection accuracy by 2.7 percentage points, mean intersection-over-union (mIoU) of drivable-area segmentation by 1.3 percentage points, and lane detection accuracy by 10.6 percentage points. Compared with other multi-task models, TFDJP achieves an effective balance between accuracy and real-time performance.
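To make the decoupled detection decoding concrete, the sketch below shows one way a head with separate classification and localization stacks and an attached IoU-aware branch could look. This is not the authors' implementation: the class name, channel sizes, and activation choices are illustrative assumptions; only the overall structure (separated gradient flows plus an IoU prediction fused with the class score) follows the description above.

```python
import torch
import torch.nn as nn

class DecoupledDetectionHead(nn.Module):
    """Illustrative decoupled head: classification and localization use
    separate convolution stacks so their gradient flows do not interfere;
    an IoU-aware branch is attached to the localization path."""

    def __init__(self, in_ch: int = 256, num_classes: int = 10):
        super().__init__()
        # Separate encoders: each subtask's gradients stay in its own stack.
        self.cls_stack = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
        )
        self.loc_stack = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
        )
        self.cls_pred = nn.Conv2d(in_ch, num_classes, 1)  # per-class logits
        self.box_pred = nn.Conv2d(in_ch, 4, 1)            # box offsets
        # IoU-aware branch: predicts localization quality, fused with the
        # classification score to link the two subtasks at inference time.
        self.iou_pred = nn.Conv2d(in_ch, 1, 1)

    def forward(self, feat: torch.Tensor):
        cls_feat = self.cls_stack(feat)
        loc_feat = self.loc_stack(feat)
        cls_logits = self.cls_pred(cls_feat)
        box = self.box_pred(loc_feat)
        iou = torch.sigmoid(self.iou_pred(loc_feat))
        # Weighting the class score by predicted IoU rewards well-localized
        # boxes, which is the role of the IoU-aware branch described above.
        fused_score = torch.sigmoid(cls_logits) * iou
        return box, cls_logits, iou, fused_score


if __name__ == "__main__":
    head = DecoupledDetectionHead(in_ch=256, num_classes=10)
    x = torch.randn(1, 256, 40, 40)  # one feature-pyramid level, batch of 1
    box, cls_logits, iou, score = head(x)
    print(box.shape, cls_logits.shape, iou.shape, score.shape)
```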
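Similarly, the body/boundary decoupling for segmentation can be sketched as a whole-mask loss plus a loss restricted to a boundary band. The exact construction below (extracting the band with max-pool dilation minus erosion, and using BCE for both terms) is an assumption for illustration; the paper's own boundary loss may differ.

```python
import torch
import torch.nn.functional as F

def boundary_mask(gt: torch.Tensor, width: int = 3) -> torch.Tensor:
    """Extract a band of high-frequency boundary pixels from a binary
    ground-truth mask of shape (N, 1, H, W) via max-pool dilation/erosion."""
    pad = width // 2
    dilated = F.max_pool2d(gt, width, stride=1, padding=pad)
    eroded = 1.0 - F.max_pool2d(1.0 - gt, width, stride=1, padding=pad)
    return dilated - eroded  # 1 on the boundary band, 0 elsewhere

def body_and_boundary_loss(logits, gt, boundary_weight=1.0):
    """Total loss = body (whole-mask) BCE + BCE restricted to the boundary
    band, so the low-frequency body and high-frequency edges are
    optimized explicitly and separately."""
    body = F.binary_cross_entropy_with_logits(logits, gt)
    bmask = boundary_mask(gt)
    per_pixel = F.binary_cross_entropy_with_logits(logits, gt, reduction="none")
    # Average only over boundary pixels (clamp avoids division by zero).
    edge = (per_pixel * bmask).sum() / bmask.sum().clamp(min=1.0)
    return body + boundary_weight * edge

if __name__ == "__main__":
    gt = torch.zeros(1, 1, 64, 64)
    gt[:, :, 16:48, 16:48] = 1.0        # square stand-in for a drivable area
    logits = torch.randn(1, 1, 64, 64)  # raw segmentation logits
    print(body_and_boundary_loss(logits, gt).item())
```

Raising `boundary_weight` over the course of training is one way to realize the local-to-global schedule mentioned above, shifting emphasis from the target's body to its edges.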