
Research on Entity Recognition in Traditional Chinese Medicine Case Records Based on Rules, Dictionaries, and Conditional Random Fields

Objective: To address the unclear boundaries and easily confused categories of entities in traditional Chinese medicine (TCM) case records, an entity recognition model combining rules, dictionaries, and conditional random fields (CRF) is proposed. Methods: A TCM terminology dictionary was constructed, the textual rules of the case records were analyzed, and feature functions were built. The case records were segmented with the jieba tool, and five categories of entities were manually annotated to form the training and validation sets for CRF-based entity recognition. The model was then evaluated using precision, recall, and F1 score to examine how the dictionary, the entity category, and the text features affect recognition results. Results: The model reached an F1 score of 83.5%, indicating good recognition performance. Adding the dictionary significantly improved entity recognition. Contextual features had the greatest influence on recognition performance. Results differed considerably across entity categories: "prescription" was recognized best, followed by "treatment method" and "physical sign", while "syndrome type" and "symptom" were recognized worst. Conclusion: This study provides an effective entity recognition model that can substantially improve the accuracy of entity recognition in TCM case records and offers a useful reference for future research.
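The pipeline summarized in the abstract (dictionary lookup plus contextual features feeding a CRF, scored by precision, recall, and F1) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and not the authors' implementation: the three-entry `TCM_DICT`, the feature names, the window size, and the BIO tagging scheme are hypothetical, and the tokens are assumed to be already segmented (for example by jieba). The per-token feature dictionaries are in the shape accepted by CRF toolkits such as sklearn-crfsuite.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical mini-dictionary; the paper's actual TCM terminology
# dictionary is not reproduced here.
TCM_DICT: Dict[str, str] = {
    "桂枝汤": "prescription",
    "头痛": "symptom",
    "脉浮": "sign",
}

def token_features(tokens: List[str], i: int, window: int = 1) -> Dict[str, str]:
    """Features for tokens[i]: the word, its dictionary label, and context words."""
    tok = tokens[i]
    feats = {
        "word": tok,
        "length": str(len(tok)),
        "dict_label": TCM_DICT.get(tok, "O"),  # dictionary feature
    }
    # Contextual features: neighbouring tokens within the window.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[f"word[{off:+d}]"] = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
    return feats

def sentence_features(tokens: List[str]) -> List[Dict[str, str]]:
    """One feature dict per token, the usual input shape for a CRF toolkit."""
    return [token_features(tokens, i) for i in range(len(tokens))]

Entity = Tuple[int, int, str]  # (start index, end index exclusive, entity type)

def bio_to_entities(tags: List[str]) -> Set[Entity]:
    """Collect entity spans from a BIO tag sequence (BIO scheme is assumed)."""
    entities: Set[Entity] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" closes the last span
        continues = tag.startswith("I-") and tag[2:] == etype
        if not continues:
            if start is not None:
                entities.add((start, i, etype))
            # Only a B- tag opens a new span; stray I- tags are ignored.
            start, etype = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return entities

def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 with exact span and type match."""
    g, p = bio_to_entities(gold), bio_to_entities(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

tokens = ["患者", "头痛", "脉浮", "予", "桂枝汤"]
features = sentence_features(tokens)
gold = ["O", "B-symptom", "B-sign", "O", "B-prescription"]
pred = ["O", "B-symptom", "O", "O", "B-prescription"]
scores = precision_recall_f1(gold, pred)
```

In this toy case the prediction misses one of three gold entities, so precision is 1.0, recall is 2/3, and F1 is 0.8. Removing `dict_label` or the `word[-1]`/`word[+1]` entries from the feature dictionaries is one simple way to run the kind of ablation the abstract describes.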

Keywords: traditional Chinese medicine case records; named entity recognition; traditional Chinese medicine terminology dictionary; conditional random field; feature functions; intelligent traditional Chinese medicine

Tan Shiyu (谭世雨), Du Zhihui (杜志慧), Yu Jiangwei (余江维)


School of Basic Medicine, Guizhou University of Traditional Chinese Medicine, Guiyang, Guizhou 550025, China


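Funding: Traditional Chinese Medicine and Ethnic Medicine Science and Technology Research Project of the Guizhou Provincial Administration of Traditional Chinese Medicine (No. QZYY-2022-003)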

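Guiding Journal of Traditional Chinese Medicine and Pharmacy (中医药导报)
Hunan Association of Traditional Chinese Medicine (湖南省中医药学会); Hunan Provincial Administration of Traditional Chinese Medicine (湖南省中医管理局)

CSTPCD
Impact factor: 0.952
ISSN: 1672-951X
Year, volume (issue): 2024, 30(6)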