
Research on Named Entity Recognition in Chinese Electronic Medical Records Based on the RoBERTa-WWM Model

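Text analysis of Chinese electronic medical records faces challenges such as polysemy and incomplete entity recognition. To address them, a deep learning framework is constructed that combines the RoBERTa-WWM model with a BiLSTM-CRF module. First, the pre-trained RoBERTa-WWM language model is deeply fused with the semantic features produced by the Transformer layers to capture the complex contextual information of the text. Next, the fused semantic representation is fed into the BiLSTM and CRF modules, which further refine the boundaries and accuracy of the recognized entities. Finally, an empirical evaluation on the CCKS2019 dataset yields an F1 score of 82.94%. This result supports the strong performance of the RoBERTa-WWM-BiLSTM-CRF model in named entity recognition for Chinese electronic medical records.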
Research on named entity recognition of Chinese electronic medical records
When dealing with the text analysis of Chinese electronic medical records, we face the challenges of polysemy and incomplete recognition. Therefore, a deep learning framework combining the RoBERTa-WWM model and a BiLSTM-CRF module is constructed. First, the pre-trained RoBERTa-WWM language model is deeply integrated with the semantic features generated by the Transformer layer to capture the complex contextual information of the text. Then, the fused semantic representation is fed into the BiLSTM and CRF modules to further refine the identification range and accuracy of entities. Finally, an empirical analysis was carried out on the CCKS2019 dataset, and the F1 score reached 82.94%. This result supports the strong performance of the RoBERTa-WWM-BiLSTM-CRF model in recognizing named entities in Chinese electronic medical records.
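
As a rough illustration of how an F1 figure like the one reported above is typically computed for NER, the sketch below scores predictions at the entity level. It assumes BIO tagging and exact-match scoring (an entity counts only if its boundaries and type both match), neither of which the abstract specifies. The tag names and the helpers `extract_entities` and `entity_f1` are hypothetical and are not taken from the paper.

```python
def extract_entities(tags):
    """Collect (start, end, type) spans from a BIO tag sequence (end is exclusive)."""
    entities = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity that runs to the end of the sequence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold_tags, pred_tags):
    """Exact-match entity-level F1 between gold and predicted BIO sequences."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative tags only: the second entity is predicted with a truncated
# boundary, i.e. the "incomplete recognition" problem the abstract mentions.
gold = ["B-DISEASE", "I-DISEASE", "O", "B-DRUG", "I-DRUG", "O"]
pred = ["B-DISEASE", "I-DISEASE", "O", "B-DRUG", "O", "O"]
print(entity_f1(gold, pred))  # 0.5
```

Under exact matching, a partially recognized entity counts as both a false positive and a false negative, which is why boundary refinement of the kind attributed to the BiLSTM-CRF module can move the score noticeably.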

RoBERTa-WWM model; Chinese electronic medical records; entity recognition

Liu Huimin (刘慧敏), Huang Xia (黄霞), Xiong Fei (熊菲), Wang Guoqing (王国庆)


Haiyuan College, Kunming Medical University, Kunming, Yunnan 650000

RoBERTa-WWM model; Chinese electronic medical records; entity recognition

Scientific Research Fund of Haiyuan College, Kunming Medical University

2022HY014

2024

Changjiang Information & Communications
Hubei Communication Services Company


Impact factor: 0.338
ISSN:2096-9759
Year, volume (issue): 2024, 37(3)