Heart Disease Prediction Scheme Based on the Naive Bayes Algorithm on the Hadoop Platform

This paper presents a heart disease prediction scheme built on distributed storage, distributed computing, and the naive Bayes algorithm. A fully distributed Hadoop platform is set up first, and a naive Bayes classifier is then constructed using Python and the MapReduce programming framework. The algorithm is parallelized with MapReduce to improve analysis efficiency. With the 2020 US CDC dataset as the heart disease dataset, the algorithm reaches an accuracy of 88.52% on the test set, and the scheme has been verified to successfully predict whether an individual has heart disease in practical use. The scheme offers relatively high accuracy together with high scalability, distributed storage and computing, and fault tolerance, making it a reliable, efficient, and low-cost solution.
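The abstract does not give implementation details, so the following is only a minimal sketch of how naive Bayes training can be expressed as a MapReduce counting job in Python. It runs the map, shuffle, and reduce steps locally rather than on Hadoop. The function names, the Laplace smoothing parameter `alpha`, the column names, and the toy records are all assumptions made for illustration; they are not taken from the paper or its dataset.

```python
import math
from collections import defaultdict
from itertools import groupby


def mapper(record, label_key):
    """Emit a count of 1 for the class and for each (feature, value, class)."""
    label = record[label_key]
    yield ("class", label), 1
    for feature, value in record.items():
        if feature != label_key:
            yield ("feat", feature, value, label), 1


def reducer(key, counts):
    """Sum all counts that share the same key."""
    return key, sum(counts)


def run_job(records, label_key):
    """Local stand-in for the map -> shuffle/sort -> reduce pipeline."""
    pairs = [kv for record in records for kv in mapper(record, label_key)]
    pairs.sort(key=lambda kv: kv[0])  # shuffle/sort phase
    grouped = groupby(pairs, key=lambda kv: kv[0])
    return dict(reducer(key, [c for _, c in group]) for key, group in grouped)


def predict(counts, record, alpha=1.0):
    """Pick the class with the highest smoothed log-posterior."""
    class_counts = {k[1]: n for k, n in counts.items() if k[0] == "class"}
    total = sum(class_counts.values())
    # Distinct values seen per feature, needed for Laplace smoothing.
    values = defaultdict(set)
    for key in counts:
        if key[0] == "feat":
            values[key[1]].add(key[2])
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)
        for feature, value in record.items():
            n = counts.get(("feat", feature, value, label), 0)
            denom = n_label + alpha * len(values[feature])
            score += math.log((n + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy records, invented for this sketch only.
TOY = [
    {"Smoking": "Yes", "DiffWalking": "Yes", "HeartDisease": "Yes"},
    {"Smoking": "Yes", "DiffWalking": "No", "HeartDisease": "Yes"},
    {"Smoking": "No", "DiffWalking": "No", "HeartDisease": "No"},
    {"Smoking": "No", "DiffWalking": "No", "HeartDisease": "No"},
    {"Smoking": "No", "DiffWalking": "Yes", "HeartDisease": "No"},
]

if __name__ == "__main__":
    model = run_job(TOY, "HeartDisease")
    print(predict(model, {"Smoking": "Yes", "DiffWalking": "Yes"}))
```

Because the mapper only emits counts and the reducer only sums them, the same two functions could in principle be adapted to read from standard input and write to standard output for a Hadoop Streaming job, which is one common way to combine Python with MapReduce.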

Hadoop; MapReduce; data mining; Naive Bayes

Wang Ziqiang (王自强), Shang Zhihui (尚志会), Shi Yonghua (石永华), Wang Yi (王毅), Fu Xuan (符萱), Yang Shengzheng (杨生正)

School of Medical Information Engineering, Zunyi Medical University, Zunyi 563000

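Guizhou Provincial Department of Education University Humanities and Social Sciences Research Project (23RWJD162)
Guizhou Provincial Health Commission Science and Technology Fund Project (gzwkj2022-524)
College Students' Innovation and Entrepreneurship Training Program Project (ZYDC202301099)
Guizhou Provincial Science and Technology Program Project (黔科平台人才[2020]-030)
Guizhou Provincial Higher Education Teaching Content and Curriculum System Reform Project (SJJG2022-02-172)
Zunyi Science and Technology Program Project (遵市科合HZ字[2023]191号)
Zunyi Medical University 2021 Special Project for Cultivating Academic New Talent and Innovative Exploration (黔科平台人才[2021]1350-027)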

2024

Modern Computer (现代计算机)
中大控股

Impact factor: 0.292
ISSN: 1007-1423
Year, volume (issue): 2024, 30(11)