
Exploration and Practice of Medical Knowledge Mining Based on Natural Language Processing (NLP)

Objective To mine medical and health knowledge and to provide practical experience for building the health science popularization knowledge that artificial intelligence and related applications rely on. Methods Natural language processing (NLP) techniques, including structure splitting, reading comprehension, and entity recognition, were applied to the popular science articles accumulated by the Xuhui District Center for Disease Control and Prevention from January 2010 to January 2021. The processing pipeline comprised document preprocessing, feature extraction, paragraph screening, reading comprehension, answer ranking, review, and release. Results Direct document structure splitting yielded 5 395 question-answer pairs, reading comprehension yielded 857, and numerical question-answer extraction yielded 1 668, forming a preliminary medical and health knowledge base in Q&A form. Conclusion NLP technology offers an effective method for producing the large volume of corpus material that artificial intelligence technology requires.
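The abstract does not describe how direct document structure splitting was implemented. As a rough illustration only, the sketch below assumes that a line ending in a question mark is a question heading and that the lines following it, up to the next such heading, form its answer. The function name `split_into_qa`, the heading rule, and the sample text are all hypothetical and are not taken from the paper.

```python
import re
from typing import List, Tuple

# Assumption: a question heading is any line that ends with an ASCII or
# full-width question mark. The paper does not state its actual rule.
QUESTION_RE = re.compile(r".+[??]\s*$")


def split_into_qa(text: str) -> List[Tuple[str, str]]:
    """Split an article into (question, answer) pairs by its heading structure."""
    pairs: List[Tuple[str, str]] = []
    question = None
    answer_lines: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if QUESTION_RE.match(line):
            # A new question heading closes the previous pair, if it has an answer.
            if question and answer_lines:
                pairs.append((question, " ".join(answer_lines)))
            question, answer_lines = line, []
        elif question is not None:
            # Body text belongs to the most recent question heading.
            answer_lines.append(line)
    if question and answer_lines:
        pairs.append((question, " ".join(answer_lines)))
    return pairs


if __name__ == "__main__":
    # Made-up sample article, for illustration only.
    sample = (
        "What is influenza?\n"
        "Influenza is an acute respiratory infection.\n"
        "\n"
        "How can it be prevented?\n"
        "Annual vaccination is recommended.\n"
    )
    for q, a in split_into_qa(sample):
        print(q, "->", a)
```

Text that appears before the first question heading, and headings with no body text, are dropped in this sketch. A real pipeline would presumably pass the resulting pairs on to the review step the abstract lists before release.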

Keywords: Natural language processing; Medical knowledge; Corpus; Artificial intelligence

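Authors: Shen Hong (沈红), Cui Ziyi (崔子禕), Zeng Shujun (曾淑君), Jin Xiaolei (金小蕾), Sheng Yu (盛妤), Zhu Siyan (朱思燕), Zhang Ying (张莹), Wu Jiaqian (吴佳倩)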


Xuhui District Center for Disease Control and Prevention, Shanghai 200237, China

上海智臻智能网络科技股份有限公司 (Shanghai Zhizhen Intelligent Network Technology Co., Ltd.), Shanghai 201803, China


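Funding: Shanghai Xuhui District Medical Research Project; Xuhui District Three-Year Action Plan for Public Health System Construction (2023-2025)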

SHXH201959

2024

Journal: Health Education and Health Promotion (健康教育与健康促进)
Publisher: Shanghai Health Promotion Center (上海市健康促进中心)


ISSN:1673-6192
Year, volume (issue): 2024, 19(2)