Microcomputer Applications, 2024, Vol. 40, Issue 4: 59-63.

Application of CRF Mechanism Combined with LDA in the Structured System of Medical Record Documents

Wen Yu (温煜)¹, Lai Shuting (赖舒婷)¹, Zeng Feifei (曾菲菲)¹, Lei Jiayu (雷佳雨)¹

Author information

  • 1. Meizhou People's Hospital, Meizhou, Guangdong 514000

Abstract

To improve the accuracy of structured classification of medical record documents, a conditional random field (CRF) semi-supervised dictionary word segmentation algorithm is combined with a latent Dirichlet allocation (LDA) medical record text classification algorithm to build a post-structuring system for medical record documents based on the CRF mechanism and LDA. The results show that with 40 topics the LDA topic model reaches its minimum perplexity of -6.97, a 9.76% decrease from the initial perplexity. The topic coherence value is lowest, 0.361, with 3 topics and highest, 0.442, with 40 topics, a 22.44% increase over the lowest value. These results indicate that the proposed post-structuring system combining the CRF mechanism with LDA performs well in application.
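The coherence figures in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the helper `relative_change_percent` and the dictionary `reported_coherence` are hypothetical names introduced here, and the only data used are the two coherence values the abstract reports (0.361 at 3 topics, 0.442 at 40 topics).

```python
def relative_change_percent(baseline: float, value: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / abs(baseline) * 100.0


# Coherence values reported in the abstract, keyed by number of topics.
# Only these two points are given; no other values are assumed.
reported_coherence = {3: 0.361, 40: 0.442}

# Pick the topic counts with the highest and lowest reported coherence.
best_k = max(reported_coherence, key=reported_coherence.get)
worst_k = min(reported_coherence, key=reported_coherence.get)

# Relative gain of the best coherence over the worst one.
gain = relative_change_percent(
    reported_coherence[worst_k], reported_coherence[best_k]
)

print(f"best topic count: {best_k}, coherence gain: {gain:.2f}%")
```

The computed gain matches the 22.44% stated in the abstract. The perplexity decrease of 9.76% cannot be checked the same way, because the abstract does not report the initial perplexity.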

Keywords

conditional random field/semi-supervised dictionary/latent Dirichlet allocation/medical record document/text classification

Funding

Meizhou Social Development Science and Technology Program (2022) (2022B22)

Publication year

2024
Microcomputer Applications
Shanghai Microcomputer Applications Society

CSTPCD
Impact factor: 0.359
ISSN: 1007-757X
Number of references: 11