Construction of an early detection model for acute respiratory infectious diseases based on natural language processing and deep learning
张忆汝 1, 汤永 2, 朱敏 3, 谢杏 4, 魏宏名 5, 刘运喜 6, 马慧 7
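Author information
- 1. Naval Clinical College, Fifth Clinical Medical College of Anhui Medical University, Hefei, Anhui 230032
- 2. Medical Innovation Research Division, Chinese PLA General Hospital, Beijing 100853
- 3. Department of Disease Prevention and Control, Sixth Medical Center, Chinese PLA General Hospital, Beijing 100038
- 4. School of Medicine, South China University of Technology, Guangzhou, Guangdong 510006
- 5. Department of Laboratory Medicine, First Medical Center, Chinese PLA General Hospital, Beijing 100853
- 6. Department of Disease Prevention and Control, First Medical Center, Chinese PLA General Hospital, Beijing 100853
- 7. Naval Clinical College, Fifth Clinical Medical College of Anhui Medical University, Hefei, Anhui 230032; Department of Nursing, Chinese PLA General Hospital, Beijing 100853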
Abstract
OBJECTIVE To construct an early detection model for acute respiratory infectious diseases using deep learning algorithms, in order to support the early identification of respiratory infectious diseases in medical institutions. METHODS Medical record text data of 6 683 patients with acute respiratory infections, treated at a large tertiary (grade A) medical institution in Beijing between January 2012 and March 2023, were collected. Bidirectional encoder representations from transformers (BERT), a natural language processing (NLP) technique, was used to train word vectors, which were combined with a convolutional neural network (CNN) and a bidirectional long short-term memory network (BiLSTM) to build the early detection model BERT_MCB. Model performance was evaluated using the receiver operating characteristic (ROC) curve, accuracy, precision, recall, and F1 score. RESULTS The BERT_MCB model outperformed the four baseline models (random forest, BERT, BERT_CNN, and BERT_RNN) overall: accuracy improved by 1.20%-15.80%, precision by 1.66%-23.69%, recall by 0.25%-26.75%, and F1 score by 0.66%-27.25%. CONCLUSION The early detection model established in this study identifies acute respiratory infectious diseases with reasonable accuracy, indicating that deep learning algorithms hold promise for the early identification of acute respiratory infectious diseases.
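The evaluation metrics named in the abstract (accuracy, precision, recall, and F1) can be illustrated with a minimal sketch. It assumes a binary labelling (1 = acute respiratory infectious disease, 0 = other acute respiratory infection), which the abstract does not specify. The function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> BinaryMetrics:
    """Compute accuracy, precision, recall and F1 for one positive class.

    Assumption: labels are binary; `positive` marks the class of interest.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1          # correctly flagged case
        elif p == positive:
            fp += 1          # false alarm
        elif t == positive:
            fn += 1          # missed case
        else:
            tn += 1          # correctly ignored case

    total = tp + fp + fn + tn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels for illustration only; these are not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In an early-detection setting, recall reflects how many true infectious cases are caught, while precision reflects how many alerts are real. F1 is the harmonic mean of the two, which is presumably why the abstract reports all three alongside accuracy.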
Key words
Acute respiratory infectious disease/Symptom surveillance/Electronic medical record/Deep learning/Natural language processing/Early identification model/Infectious disease surveillance and early warning
Publication year
2024