Current traffic anomaly detection systems lack deep incident perception capabilities, and manually reviewing alarmed incidents is costly. To address both problems, this study investigates a highway traffic anomaly analysis method based on multimodal large language models (MLLMs). Three MLLM-based tasks were designed and validated: first, automatically generating detailed work order descriptions for anomalous events, which deepens event perception; second, reviewing alarm events with an MLLM, which reduces false alarms and improves detection accuracy; and third, generating descriptive narratives for anomaly event videos, which improves the interpretability of events. Experimental results showed that, by constructing a visual instruction-tuning dataset and fine-tuning the model, the MLLM-based work order description method improved the completeness and accuracy of work order information. In the review of alarm events, the MLLM effectively filtered out false alarms caused by poor image quality, false detections, and misclassifications, thereby reducing manual review costs. Furthermore, the MLLM-based video description method enabled efficient anomaly analysis by sampling event video frames and describing them, which improved event explainability. Although open-source models were slightly inferior to closed-source models in specific scenarios, both types were able to review a range of false alarm issues, confirming the potential of MLLMs for anomaly event review. This study provides a novel solution for intelligent traffic monitoring systems, enhancing the automation and practicality of anomaly event handling.
Key words
multimodal large language models / surveillance video / anomaly event detection / video understanding / work order description / traffic anomaly event review