Optimization Exploration of Pulmonary Nodule Follow-up System Based on Big Data Platform

Hadoop is a widely recognized industry-standard open-source software for big data. Owing to its capacity for processing massive data in distributed environments, it is currently widely used in lung nodule follow-up systems. However, the Hadoop Distributed File System (HDFS) was originally designed for storing and computing over large files, so storing and retrieving huge numbers of small files suffers from low performance and high memory usage on the master NameNode. To address this, an improved HDFS data-layout storage scheme, HFS, is constructed: a file-processing recognition module added to the NameNode migrates small-file metadata to the SecondaryNameNode and the DataNode cluster, and an algorithm for data flow between DataNodes is designed, effectively reducing the processing pressure on the NameNode. The lung nodule follow-up system was tested on both HFS and plain HDFS, and the experimental results show that the HFS-based system has clear advantages in NameNode memory occupancy and overall data analysis time.
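The paper does not publish the HFS code, so the following Java sketch only illustrates the small-file problem the abstract targets: it packs many small follow-up files into a single SequenceFile so that the NameNode keeps one metadata entry instead of thousands. The class name, the 4 MB threshold, and the directory paths are assumptions for illustration; HFS itself instead migrates small-file metadata to the SecondaryNameNode and DataNodes, which this sketch does not attempt.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

import java.io.InputStream;

public class SmallFileConsolidator {
    // Files below this size are treated as "small"; 4 MB is an illustrative threshold,
    // far below a typical HDFS block size.
    private static final long SMALL_FILE_THRESHOLD = 4L * 1024 * 1024;

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path srcDir = new Path(args[0]);  // e.g. a directory of per-patient follow-up records
        Path packed = new Path(args[1]);  // e.g. an output SequenceFile path

        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(packed),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {

            for (FileStatus status : fs.listStatus(srcDir)) {
                if (status.isFile() && status.getLen() < SMALL_FILE_THRESHOLD) {
                    byte[] buf = new byte[(int) status.getLen()];
                    try (InputStream in = fs.open(status.getPath())) {
                        IOUtils.readFully(in, buf, 0, buf.length);
                    }
                    // Key = original file name, value = raw bytes; one SequenceFile
                    // replaces many NameNode entries with a single large file.
                    writer.append(new Text(status.getPath().getName()),
                                  new BytesWritable(buf));
                }
            }
        }
    }
}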

Keywords: HFS; Hadoop; lung nodule follow-up system; big data

Zhang Guohua, Xu Jianjun

Taizhou College of Nanjing Normal University, Taizhou 225300, Jiangsu, China

Funding: National Natural Science Foundation of China Youth Fund Project (51708265); Natural Science Research General Project of Jiangsu Higher Education Institutions (19KJD520008); Teaching Reform Research Project of Taizhou College of Nanjing Normal University (2023JG12021)

2024

Software Guide (Ruan Jian Dao Kan)
Hubei Information Society

Impact factor: 0.524
ISSN: 1672-7800
Year, Volume (Issue): 2024, 23(4)