Traffic anomaly event analysis method for highway scenes based on multimodal large language models
To address the limitations of current traffic anomaly detection systems, which lack deep incident perception capabilities, and to reduce the high cost of manually reviewing alarmed incidents, a highway traffic anomaly analysis method based on multimodal large language models (MLLMs) was investigated. Three MLLM-based tasks were designed and validated: first, automatically generating detailed work order descriptions for anomalous events, which deepens event perception; second, reviewing alarm events with an MLLM, which reduces false alarms and improves detection accuracy; and third, generating descriptive narratives for anomaly event videos, which enhances event interpretability. Experimental results demonstrated that the MLLM-based work order description method improved the completeness and accuracy of work order information through the construction of visual instruction-tuning datasets and model fine-tuning. In the review of alarm events, the MLLM effectively filtered out false alarms caused by poor image quality, false detections, and misclassifications, thus reducing manual review costs. Furthermore, the MLLM-based video description method enabled efficient anomaly analysis by sampling and describing event video frames, thus improving event explainability. Although open-source models were slightly inferior to closed-source models in specific scenarios, both types were able to identify the various kinds of false alarms, confirming the potential of MLLMs for anomaly event review. This study provides a novel solution for intelligent traffic monitoring systems, enhancing the automation and practicality of anomaly event handling.
multimodal large language models; surveillance video; anomaly event detection; video understanding; work order description; traffic event review
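The abstract states that the video description task samples frames from each event video before describing them, but it does not say how the frames are chosen. The following is a minimal sketch of one common option, uniform sampling, and should not be read as the authors' procedure. The function name `sample_frame_indices` and its parameters are hypothetical and are introduced here only for illustration.

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Pick frame indices spread evenly across a clip.

    Assumption: uniform sampling. The abstract does not specify
    the sampling strategy actually used.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # The clip is shorter than the budget: keep every frame.
        return list(range(total_frames))
    # Split the clip into num_samples equal-width segments and
    # take the frame at the midpoint of each segment.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # Indices for a 100-frame clip with a budget of 4 frames.
    print(sample_frame_indices(100, 4))
```

The selected frames would then be passed to the model together with a text prompt. Taking segment midpoints rather than segment starts avoids always picking the first frame, which in alarm clips may precede the event itself.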