
Research on Entity Recognition of Chinese Obstetrics and Gynecology Electronic Medical Records Based on K-BERT

When pre-trained models are applied to named entity recognition (NER) in Chinese obstetrics and gynecology electronic medical records, BERT lacks medical domain knowledge, which degrades its recognition performance. To address this problem, a named entity recognition model built on the knowledge-graph-based pre-trained model K-BERT, called K-BERT-BiLSTM-CRF, is proposed. The K-BERT pre-trained model produces semantic feature vectors that contain medical background knowledge; a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF) then extract context-dependent features and address the label offset problem to complete entity recognition. Trained on a real obstetrics and gynecology electronic medical record dataset, the K-BERT-BiLSTM-CRF model reaches an F1 score of 90.04%. Experiments show that, compared with models based on standard BERT, the K-BERT-BiLSTM-CRF model recognizes entities more accurately in Chinese obstetrics and gynecology electronic medical records.
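The role of the CRF layer described above can be illustrated with a small decoding sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the tag set (`B-SYM`, `I-SYM`), the emission scores, and the function names are hypothetical, and the transition scores are hand-set here, whereas a trained CRF learns them from data. The sketch shows how transition scores let Viterbi decoding reject a tag sequence that per-token argmax would produce, such as an `I-` tag directly following `O`.

```python
# Illustrative sketch only: tags, scores, and names are assumptions,
# not taken from the paper.
TAGS = ["O", "B-SYM", "I-SYM"]
NEG = -1e4  # large penalty standing in for a forbidden transition


def build_transitions(tags):
    """Return trans[i][j], the score for moving from tags[i] to tags[j]."""
    trans = [[0.0] * len(tags) for _ in tags]
    for i, prev in enumerate(tags):
        for j, cur in enumerate(tags):
            # An I- tag may only follow a B- or I- tag of the same type.
            if cur.startswith("I-") and prev[2:] != cur[2:]:
                trans[i][j] = NEG
    return trans


def greedy_decode(emissions, tags):
    """Pick the best tag for each token independently (no CRF)."""
    return [tags[max(range(len(tags)), key=lambda j: e[j])] for e in emissions]


def viterbi_decode(emissions, trans, tags):
    """Find the highest-scoring tag sequence under emission + transition scores."""
    n = len(tags)
    score = list(emissions[0])
    for j, tag in enumerate(tags):
        if tag.startswith("I-"):  # a sentence cannot start inside an entity
            score[j] += NEG
    backpointers = []
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + trans[i][j])
            new_score.append(score[best_i] + trans[best_i][j] + emit[j])
            ptr.append(best_i)
        score = new_score
        backpointers.append(ptr)
    # Trace the best path backwards from the best final tag.
    path = [max(range(n), key=lambda j: score[j])]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    path.reverse()
    return [tags[i] for i in path]


# Hypothetical per-token scores (columns follow TAGS), as a BiLSTM might emit.
EMISSIONS = [
    [2.0, 1.5, 0.0],  # token 0: "O" narrowly beats "B-SYM"
    [0.0, 0.5, 2.0],  # token 1: "I-SYM"
    [0.0, 0.0, 2.0],  # token 2: "I-SYM"
]

print(greedy_decode(EMISSIONS, TAGS))
# ['O', 'I-SYM', 'I-SYM']  (invalid: I- follows O)
print(viterbi_decode(EMISSIONS, build_transitions(TAGS), TAGS))
# ['B-SYM', 'I-SYM', 'I-SYM']
```

The per-token decoder emits an entity that has no beginning tag, while the sequence-level decoder trades a slightly lower score on the first token for a consistent entity span.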

Keywords: K-BERT; bidirectional long short-term memory; conditional random fields; obstetrics and gynecology electronic medical records; named entity recognition

Zhang You, Li Fang

School of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 201306

2024

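Medical Information (医学信息)
Informatization Management Leading Group of the National Ministry of Health; Chinese Medical Informatics Branch of the Chinese Institute of Electronics; Shaanxi Wenbo Institute of Bioinformation Engineering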

Impact factor: 0.161
ISSN: 1006-1959
Year, volume (issue): 2024, 37(1)