Recognition and Evaluation of Expressway Traffic Accident Information from Multimodal Data

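To extract emergency response information from natural-language descriptions of traffic accidents, a named entity recognition method for traffic accidents is proposed based on pre-trained models and BiLSTM-CRF. First, using multimodal traffic accident data from expressways in Shaanxi Province between June 2021 and August 2022, the recognition performance and training time of three deep learning models are compared. Second, a traffic accident corpus from official Weibo accounts is used as an out-of-bag test set to check the robustness of the entity recognition models. Then, evaluation indicators for the information content of multimodal traffic accident data, covering text and structured data, are constructed along two dimensions: consistency and richness. Finally, traffic accident information is recognized on the test set, the correlation between the number of emergency response entities and accident duration is analyzed, and the evaluation indicators are computed and discussed. The results show that BERT-BiLSTM-CRF achieves weighted F1 scores of 97.0294% and 69.1555% on the test set and the out-of-bag test set, respectively, giving the best overall performance across model accuracy, training efficiency, and robustness. The correlation coefficients between accident duration and the number of entities for disposal agency, disposal equipment, not yet handled, being handled, and disposal outcome are 0.309, 0.151, 0.137, 0.220, and 0.178, respectively, all positive. The information content consistency for weather, road property loss, traffic diversion, accident type, and casualties is 7.06%, 45.79%, 1.59%, 67.65%, and 47.59%, respectively; emergency response accounts for 36% of the content, and the variability is 1.305. This indicates that the text carries rich emergency response information, but the consistency between text and structured data for the same accident still needs improvement. The findings can serve as a reference for improving the quality and effectiveness of traffic accident information collection.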
Recognition and Evaluation on Expressway Traffic Accident Information Using Multimodal Data
In order to extract emergency response information from natural language descriptions of traffic accidents, a named entity recognition method is proposed based on pre-trained models and BiLSTM-CRF. Multimodal expressway traffic accident data from Shaanxi Province, covering June 2021 to August 2022, serve as the data source. First, three deep learning models are compared on entity recognition performance and training time. Second, a traffic accident corpus from official microblog accounts is used as an out-of-bag test set to assess robustness. Third, evaluation indicators are constructed along the dimensions of consistency and richness to quantify the information content of text data and structured data. Finally, traffic accident information recognition is carried out on the test dataset. The results show that the weighted F1 values of BERT-BiLSTM-CRF on the test dataset and the out-of-bag dataset are 97.0294% and 69.1555%, respectively, the best overall performance in terms of model accuracy, training efficiency, and robustness. The number of emergency disposal entities is positively correlated with accident duration: the correlation coefficients for disposal agency, disposal equipment, not yet disposed, disposal in progress, and disposal effect are 0.309, 0.151, 0.137, 0.220, and 0.178, respectively. The content consistency of weather, road property loss, traffic diversion, accident type, and casualties is 7.06%, 45.79%, 1.59%, 67.65%, and 47.59%, respectively. The proportion of emergency response content is 36%, and the variability is 1.305. These results indicate that text data contain rich emergency disposal information, but the content consistency between text data and structured data for the same traffic accident should be improved. The results can serve as a reference for improving the quality and effectiveness of traffic accident information collection.
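As an illustration of the weighted F1 metric reported in the abstract, the following is a minimal sketch of a support-weighted F1 computed over token-level labels. It is not the authors' evaluation code: the function name `weighted_f1` and the example labels (`AGENCY`, `EQUIP`, `O`) are hypothetical, and the abstract does not state whether scoring was done per token or per entity span, so token-level scoring is an assumption here.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-label F1 scores (token level).

    Hypothetical helper for illustration; label names are assumptions.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    support = Counter(y_true)  # number of true instances per label
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each label contributes in proportion to its share of true labels.
        score += f1 * n / total
    return score


if __name__ == "__main__":
    gold = ["AGENCY", "AGENCY", "EQUIP", "O"]
    pred = ["AGENCY", "EQUIP", "EQUIP", "O"]
    print(weighted_f1(gold, pred))
```

Because each label's F1 is weighted by its frequency in the gold labels, frequent entity types dominate the score, which is worth keeping in mind when comparing the test-set and out-of-bag figures.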

ITS; traffic accident; multimodal data; pre-trained model; bi-directional long and short term memory

CHEN Jiaona (陈娇娜), TAO Weijun (陶伟俊), JIN Yinli (靳引利)

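School of Electronic Engineering, Xi'an Shiyou University, Xi'an, Shaanxi 710065, China

School of Electronics and Control Engineering, Chang'an University, Xi'an, Shaanxi 710061, China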

intelligent transportation; traffic accident; multimodal data; pre-trained model; bidirectional long short-term memory

National Natural Science Foundation of China (52002315); National Key R&D Program of China (2019YFB1600700)

2024

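Journal of Highway and Transportation Research and Development (公路交通科技)
Research Institute of Highway, Ministry of Transport (交通运输部公路科学研究院)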

CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.007
ISSN: 1002-0268
Year, volume (issue): 2024, 41(4)