情报工程 (Technology Intelligence Engineering), 2024, Vol. 10, Issue 4: 3-13. DOI: 10.3772/j.issn.2095-915x.2024.04.001

Research on the Construction of a Disease Knowledge Graph Based on Multi-source Data

Sun Siyu (孙丝雨)¹, Hou Yuefang (侯跃芳)², Ding Jingda (丁敬达)¹, Mei Jiayue (梅佳月)², Sun Jia (孙佳)²

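Author information

  • 1. School of Cultural Heritage and Information Management, Shanghai University, Shanghai 200444
  • 2. School of Health Management, China Medical University, Shenyang 110122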

Abstract

[Objective/Significance] This study designs a scheme for constructing a disease knowledge graph from multi-source data in medical databases such as PubMed and OMIM, to provide a reference and basis for biological experimental research on diseases and for their diagnosis and treatment. [Methods/Process] First, the semantic analysis tool SemRep is used to extract SPO (subject-predicate-object) triples, and knowledge fusion is performed through data processing methods such as entity alignment and relation mapping. The Neo4j graph database is then used for knowledge storage and visualization. An empirical test and analysis using polycystic ovary syndrome as an example yielded 61,589 SPO triples, 34,697 entities, and 27 semantic relations, from which 7 semantic patterns were summarized. [Limitations] Data processing involves manual review, and because of the large data volume, the review may contain some errors. [Results/Conclusions] This study improves on existing knowledge fusion methods and verifies the feasibility of the proposed disease knowledge graph construction scheme, laying a foundation for subsequent knowledge discovery in the medical field based on disease knowledge graphs.
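
The fusion and storage steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the alias table, the relation map, the sample triples, and the function names (`align_entity`, `fuse_triples`, `to_cypher`) are all hypothetical and chosen only for illustration. The sketch assumes that SPO triples have already been extracted (for example by SemRep) and are available as plain `(subject, predicate, object)` tuples.

```python
from collections import Counter

# Hypothetical alias table (assumption): surface form -> canonical entity name.
ENTITY_ALIASES = {
    "pcos": "Polycystic Ovary Syndrome",
    "polycystic ovarian syndrome": "Polycystic Ovary Syndrome",
    "polycystic ovary syndrome": "Polycystic Ovary Syndrome",
    "metformin hydrochloride": "Metformin",
}

# Hypothetical relation map (assumption): source predicate -> unified predicate.
RELATION_MAP = {
    "TREATS": "TREATS",
    "therapeutic_for": "TREATS",
    "ASSOCIATED_WITH": "ASSOCIATED_WITH",
    "gene_associated_with_disease": "ASSOCIATED_WITH",
}


def align_entity(name):
    """Entity alignment: map a surface form to its canonical name if known."""
    key = " ".join(name.lower().split())  # lowercase and collapse whitespace
    return ENTITY_ALIASES.get(key, name.strip())


def fuse_triples(triples):
    """Fuse raw triples: align entities, map relations, merge duplicates.

    Returns a Counter mapping each fused (subject, predicate, object) to the
    number of raw triples supporting it. Triples whose predicate is not in
    RELATION_MAP are dropped.
    """
    fused = Counter()
    for subj, pred, obj in triples:
        unified = RELATION_MAP.get(pred)
        if unified is None:
            continue
        fused[(align_entity(subj), unified, align_entity(obj))] += 1
    return fused


def to_cypher(triple, support):
    """Build a parameterized Cypher MERGE statement for one fused triple.

    The relationship type is interpolated into the query text because it is
    drawn from the fixed values of RELATION_MAP; entity names are passed as
    parameters.
    """
    subj, pred, obj = triple
    query = (
        "MERGE (a:Entity {name: $subj}) "
        "MERGE (b:Entity {name: $obj}) "
        f"MERGE (a)-[r:{pred}]->(b) "
        "SET r.support = $support"
    )
    return query, {"subj": subj, "obj": obj, "support": support}


if __name__ == "__main__":
    # Toy input for illustration only; not data from the study.
    raw = [
        ("Metformin", "TREATS", "PCOS"),
        ("metformin hydrochloride", "therapeutic_for", "Polycystic ovarian syndrome"),
        ("INSR", "gene_associated_with_disease", "polycystic ovary syndrome"),
        ("INSR", "unknown_rel", "PCOS"),
    ]
    for triple, support in fuse_triples(raw).items():
        print(to_cypher(triple, support))
```

Each `(query, params)` pair could then be passed to a Neo4j session for execution. Using `MERGE` rather than `CREATE` keeps the load idempotent, so re-running it does not duplicate nodes or relationships.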

Key words

Disease Knowledge Graph / Subject-Predicate-Object (SPO) Triples / Knowledge Fusion / Semantic Analysis

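Funding

Scientific Research Fund Project of the Education Department of Liaoning Province (Basic Research Project in Humanities and Social Sciences) (JCRW2020005)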

Publication year

2024
Journal: 情报工程 (indexed in CSTPCD, CHSSCD)