
Study on the Application of Named Entity Recognition in Electronic Medical Records for Lymphedema Disease

Purpose/Significance To explore the application of artificial intelligence to key entity recognition in the unstructured text of electronic medical records (EMR) of lymphedema patients. Method/Process The paper describes a model fine-tuning approach for settings where samples are scarce. A total of 594 patients previously admitted to the Department of Lymphatic Surgery of Beijing Shijitan Hospital, Capital Medical University, were selected as research subjects. Based on 15 key entity categories annotated by clinicians, the prediction layer of the GlobalPointer model was fine-tuned, and its global pointer was used to identify both nested and non-nested key entities. The accuracy of the experimental results and the feasibility of clinical application were analyzed. Result/Conclusion After fine-tuning, the model's overall mean precision, recall and Macro_F1 were 0.795, 0.641 and 0.697, respectively, laying a foundation for precise mining of lymphedema EMR data.
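To make the evaluation terms concrete, the following is a minimal sketch of how a GlobalPointer-style output can be decoded and scored. It is not the authors' code. It assumes that the model emits one token-by-token score matrix per entity category and that every cell with start <= end and a score above 0 is read as a predicted span, which is what allows nested spans to coexist. The labels `SITE` and `SYMPTOM`, the toy scores, and the helper names `decode_spans` and `macro_prf` are hypothetical and do not come from the paper's 15 categories.

```python
from typing import Dict, List, Set, Tuple

# (entity category, start token index, end token index), end inclusive
Span = Tuple[str, int, int]


def decode_spans(scores: Dict[str, List[List[float]]],
                 threshold: float = 0.0) -> Set[Span]:
    """Keep every upper-triangle cell (start <= end) whose score exceeds the threshold."""
    spans: Set[Span] = set()
    for label, matrix in scores.items():
        for start, row in enumerate(matrix):
            for end in range(start, len(row)):
                if row[end] > threshold:
                    spans.add((label, start, end))
    return spans


def macro_prf(gold: Set[Span], pred: Set[Span],
              labels: List[str]) -> Tuple[float, float, float]:
    """Average per-category precision, recall and F1, one equal vote per category."""
    p_sum = r_sum = f_sum = 0.0
    for label in labels:
        g = {s for s in gold if s[0] == label}
        p = {s for s in pred if s[0] == label}
        tp = len(g & p)  # exact-match spans only
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        p_sum += precision
        r_sum += recall
        f_sum += f1
    n = len(labels)
    return p_sum / n, r_sum / n, f_sum / n


def blank(n: int) -> List[List[float]]:
    """An n x n score matrix in which every span is rejected."""
    return [[-1.0] * n for _ in range(n)]


# Toy sentence of 4 tokens: ["left", "lower", "limb", "swelling"] (hypothetical).
scores = {"SITE": blank(4), "SYMPTOM": blank(4)}
scores["SITE"][0][2] = 2.3     # "left lower limb"
scores["SYMPTOM"][0][3] = 1.7  # "left lower limb swelling", contains the SITE span
scores["SYMPTOM"][3][3] = 0.4  # a spurious extra prediction
scores["SITE"][2][0] = 5.0     # lower triangle (start > end), ignored by decoding

pred = decode_spans(scores)
gold = {("SITE", 0, 2), ("SYMPTOM", 0, 3)}
precision, recall, macro_f1 = macro_prf(gold, pred, ["SITE", "SYMPTOM"])
print(sorted(pred))
print(round(precision, 3), round(recall, 3), round(macro_f1, 3))
```

In this toy case the macro F1 is the mean of the per-category F1 values, not the harmonic mean of the macro precision and macro recall. The reported figures show the same pattern: the harmonic mean of 0.795 and 0.641 is about 0.710, not 0.697. That would be consistent with per-category averaging, although the abstract does not state the averaging scheme.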

Keywords: lymphedema; electronic medical records; named entity recognition; natural language processing; medicine

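Tang Haocheng (汤昊宬), Su Wanchun (苏万春), Ji Xiuyuan (冀秀元), Xin Jianfeng (信建峰), Xia Song (夏松), Sun Yuguang (孙宇光), Xu Yi (徐毅), Shen Wenbin (沈文彬)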


Institute of Automation, Chinese Academy of Sciences, Beijing 100190

Beijing Shijitan Hospital, Capital Medical University, Beijing 100038


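Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0105005); Beijing Municipal Science and Technology Commission project (Z191100007619049)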

2024

Journal of Medical Informatics (医学信息学杂志)
Chinese Academy of Medical Sciences


CSTPCD
Impact factor: 1.348
ISSN:1673-6036
Year, volume (issue): 2024, 45(2)