Journal of Medical Informatics, 2024, Vol. 45, Issue 2: 52-58. DOI: 10.3969/j.issn.1673-6036.2024.02.009

Study on the Application of Named Entity Recognition in Electronic Medical Records for Lymphedema Disease

Tang Haocheng 1, Su Wanchun 2, Ji Xiuyuan 1, Xin Jianfeng 2, Xia Song 2, Sun Yuguang 2, Xu Yi 1, Shen Wenbin 2

Author information

  • 1. Institute of Automation, Chinese Academy of Sciences, Beijing 100190
  • 2. Beijing Shijitan Hospital, Capital Medical University, Beijing 100038

Abstract

Purpose/Significance The paper explores the application of artificial intelligence to key entity recognition in the unstructured text of electronic medical records (EMRs) of lymphedema patients. Method/Process It describes a fine-tuning approach to model training when labeled samples are scarce. A total of 594 patients previously admitted to the Department of Lymphatic Surgery of Beijing Shijitan Hospital, Capital Medical University, were selected as research subjects. Based on 15 key entity categories annotated by clinicians, the prediction layer of the GlobalPointer model was fine-tuned, and its global pointer was used to identify both nested and non-nested key entities. The accuracy of the experimental results and the feasibility of clinical application were analyzed. Result/Conclusion After fine-tuning, the model's overall mean precision, recall and Macro_F1 were 0.795, 0.641 and 0.697, respectively, laying a foundation for accurate mining of lymphedema EMR data.
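The reported Macro_F1 (0.697) is lower than the harmonic mean of the reported mean precision and recall (about 0.710). This is expected if F1 is computed per entity category and then averaged, as the name Macro_F1 suggests. The following is a minimal sketch of that calculation, not the authors' evaluation code. It assumes exact-match scoring on span boundaries and label, and the labels `symptom` and `surgery` and all spans are hypothetical illustration data, not taken from the paper.

```python
def macro_prf(gold, pred):
    """Macro-averaged precision, recall and F1 over entity labels.

    gold, pred: sets of (doc_id, start, end, label) tuples.
    Assumption: a predicted span counts as correct only if its document,
    boundaries and label all match a gold span exactly.
    """
    labels = sorted({span[3] for span in gold | pred})
    if not labels:
        return (0.0, 0.0, 0.0), {}
    per_label = {}
    for label in labels:
        g = {s for s in gold if s[3] == label}
        p = {s for s in pred if s[3] == label}
        tp = len(g & p)  # true positives for this label
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_label[label] = (precision, recall, f1)
    n = len(labels)
    # Macro average: unweighted mean of each metric across labels.
    macro = tuple(sum(v[i] for v in per_label.values()) / n for i in range(3))
    return macro, per_label


# Hypothetical gold annotations and model predictions.
gold = {
    (0, 0, 2, "symptom"), (0, 5, 9, "symptom"),
    (0, 12, 15, "surgery"), (1, 3, 6, "surgery"),
}
pred = {
    (0, 0, 2, "symptom"),
    (0, 12, 15, "surgery"), (0, 20, 22, "surgery"),
}

macro, per_label = macro_prf(gold, pred)
macro_p, macro_r, macro_f1 = macro
# Harmonic mean of the macro precision and macro recall, for comparison.
harmonic = 2 * macro_p * macro_r / (macro_p + macro_r)
print(f"macro P={macro_p:.3f} R={macro_r:.3f} F1={macro_f1:.3f}")
print(f"harmonic mean of macro P and R={harmonic:.3f}")
```

In this toy data the macro F1 differs from the harmonic mean of the macro precision and recall, which is the same pattern seen in the abstract's figures.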

Key words

lymphedema/electronic medical records/named entity recognition/natural language processing/medicine

Funding

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0105005)

Beijing Municipal Science and Technology Commission Project (Z191100007619049)

Publication year

2024
Journal of Medical Informatics
Chinese Academy of Medical Sciences


CSTPCD
Impact factor: 1.348
ISSN:1673-6036
Number of references: 12