Optimization Exploration of Pulmonary Nodule Follow-up System Based on Big Data Platform
Hadoop is a widely recognized, industry-standard open-source framework for big data. Owing to its capacity for processing massive datasets in distributed environments, it is now widely used in pulmonary nodule follow-up systems. However, the Hadoop Distributed File System (HDFS) was originally designed for storing and computing over large files, so storing and retrieving large numbers of small files results in poor performance and high memory usage on the master NameNode. To address this, an HFS file storage scheme is constructed by adding a file-processing recognition module to the NameNode, which migrates small-file metadata to the SecondaryNameNode and the DataNode cluster. At the same time, an algorithm governing data flow between DataNodes is designed, which effectively reduces the processing pressure on the NameNode. The pulmonary nodule follow-up system was tested on both HFS and plain HDFS, and the experimental results show that the HFS-based system has significant advantages in NameNode memory occupancy and overall data-analysis time.
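The core idea behind consolidating small files can be illustrated with a minimal sketch, independent of the paper's actual HFS implementation (which is not shown here): many small files are packed into one container blob with a lightweight index, so the NameNode tracks metadata for a single large object instead of one entry per tiny file. The function names and the in-memory container below are illustrative assumptions, not part of HFS or the HDFS API.

```python
import io

def merge_small_files(files):
    """Pack many small files into one container blob plus an index.

    `files` maps a file name to its bytes. Returns (blob, index), where
    `index` maps each name to an (offset, length) pair inside the blob.
    Storing one container instead of N tiny files means the master node
    keeps one metadata entry rather than N, which is the essence of the
    small-file optimizations used on HDFS (e.g. SequenceFile / HAR).
    """
    buf = io.BytesIO()
    index = {}
    for name, data in files.items():
        offset = buf.tell()      # record where this file starts
        buf.write(data)
        index[name] = (offset, len(data))
    return buf.getvalue(), index

def read_small_file(blob, index, name):
    """Retrieve one original small file from the merged container."""
    offset, length = index[name]
    return blob[offset:offset + length]
```

For example, merging two scan slices and reading one back:

```python
blob, idx = merge_small_files({"slice_a.dcm": b"AAA", "slice_b.dcm": b"BBBB"})
assert read_small_file(blob, idx, "slice_a.dcm") == b"AAA"
```

In a real deployment the index itself would live on the SecondaryNameNode or DataNodes, as the abstract describes, rather than alongside the blob.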