LiDAR MOT-DETR: A LiDAR-based Two-Stage Transformer for 3D Multiple Object Tracking

Abstract: Multi-object tracking from LiDAR point clouds presents unique challenges due to the sparse and irregular nature of the data, compounded by the need for temporal coherence across frames. Traditional tracking systems often rely on fewer features, motion models and Kalman filtering, which can struggle to maintain consistent object identities in crowded or fast-moving scenes. We present a LiDAR-based, two-stage, DETR-inspired transformer consisting of a smoother and a tracker. First, we rely on existing LiDAR object detections across a moving time window and train a transformer regression model that takes the role of a smoother, estimating bounding boxes per frame. This smoother acts as a preprocessing step for filtering out relevant bounding boxes that are missing due to short-term occlusions and ambiguity, using temporally bi-directional inputs with respect to the key-frame. On top of that, we replace the heuristic method of tracking with a DETR transformer architecture in which tracks are propagated to subsequent frames and associated via attention mechanisms to ensure consistency. The model is pre-trained on a large-scale autonomous driving dataset and fine-tuned on nuScenes, demonstrating strong performance across metrics such as ID-switch and multiple object tracking accuracy (MOTA). Numerical results indicate that our method outperforms baseline models on the nuScenes dataset, with an aMOTA and aMOTP of 0.752 and 0.489 on the validation set and 0.722 and 0.445 on the test set, respectively. This two-stage transformer-based framework establishes a foundation for more advanced MOT systems in LiDAR 3D perception, paving the way for enhanced autonomy in applications ranging from self-driving cars to robotic navigation.

Publication year: 2025

Research Disclosure

ISSN: 0374-4353

Year, volume (issue): 2025.(734)