The Integration Platform of Big Data Based on Mixed Architecture of Hadoop and MPP Database

Faced with growing volumes of discrete, multi-source, heterogeneous healthcare big data, traditional integration platform architectures are limited by small data processing capacity, low efficiency, poor flexibility, and difficulty storing and analyzing unstructured data. To address these issues, a healthcare big data integration platform is constructed on a hybrid architecture of Hadoop and Massively Parallel Processing (MPP) databases. The approach combines the technical strengths of both architectures: the MPP relational architecture handles logical processing of structured data, such as complex queries, multi-table joins, and self-service analysis, while the Hadoop distributed architecture performs parallel computation over large-scale unstructured data. The platform is built with a strategy of logical layering and physical partitioning, achieving centralized collection, classified storage, and effective integration of healthcare big data. This ensures data governance quality and processing efficiency, providing an efficient data support platform for clinical and research work.

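Keywords: healthcare big data; Hadoop architecture; MPP database; hybrid architecture; integration platform

Zhang Yanjiao (张艳姣), Ren Xiaoyang (任晓阳)

Clinical Medical Big Data Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China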

2024

Journal of Information Engineering University (信息工程大学学报)
Scientific Research Department, PLA Information Engineering University

Impact factor: 0.276
ISSN: 1671-0673
Year, volume (issue): 2024, 25(4)