
An Intelligent Diagnosis Method for FUO Based on Bi-random Forest

In machine learning prediction models, imbalanced datasets reduce prediction accuracy for minority classes. To address the imbalance of the fever of unknown origin (FUO) dataset, this paper proposes a bi-random forest etiology prediction method based on K-Means clustering undersampling. First, a balanced dataset is constructed through K-Means clustering undersampling, and a random forest prediction model based on the CART voting mechanism is built on it. Then, a second random forest prediction model is built in the same way on the initial dataset. Finally, the two random forest models are combined, and the CART trees of both vote together to produce the prediction. The proposed method increases the number of CART trees and raises the voting weight of the minority classes while preserving the characteristics of the original dataset. Experiments on the FUO dataset show that the proposed method not only improves prediction performance for the minority classes but also improves it, to a certain extent, for the other classes.
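
The joint voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `count_votes`, `joint_predict`, and `constant_tree` are hypothetical, each "tree" is a stand-in callable rather than a trained CART, and breaking ties by first-counted label is an assumption the abstract does not specify. The sketch only shows how pooling the trees of a forest trained on the balanced dataset with those of a forest trained on the original dataset changes the vote tally.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

# Assumption: a "tree" is any callable that maps a feature vector to a class
# label. In practice this would be a trained CART; here it is a stand-in.
Tree = Callable[[Sequence[float]], Hashable]


def count_votes(trees: List[Tree], x: Sequence[float]) -> Counter:
    """Tally one vote per tree for the sample x."""
    return Counter(tree(x) for tree in trees)


def joint_predict(
    balanced_forest: List[Tree],
    original_forest: List[Tree],
    x: Sequence[float],
) -> Tuple[Hashable, Counter]:
    """Pool the votes of both forests and return the winning label and tally."""
    if not balanced_forest and not original_forest:
        raise ValueError("at least one forest must contain a tree")
    # Every tree from either forest contributes exactly one vote.
    votes = count_votes(balanced_forest, x) + count_votes(original_forest, x)
    # Assumption: ties go to the label that was counted first.
    label, _ = votes.most_common(1)[0]
    return label, votes


def constant_tree(label: Hashable) -> Tree:
    """Hypothetical stand-in for a trained tree that always predicts `label`."""
    return lambda x: label


# Toy example: trees trained on the original (imbalanced) data mostly favour
# the majority class, while trees trained on the balanced data favour the
# minority class for this sample.
original_forest = [constant_tree("common"), constant_tree("common"), constant_tree("rare")]
balanced_forest = [constant_tree("rare") for _ in range(3)]

label, votes = joint_predict(balanced_forest, original_forest, [38.5, 1.0])
print(label, dict(votes))
```

In this toy case the forest built on the original data alone would predict `common` (2 votes to 1), whereas the pooled tally is 4 votes for `rare` against 2 for `common`, which is the sense in which the combined model gives minority classes more voting weight.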

Intelligent Diagnosis; Machine Learning; Fever of Unknown Origin; Random Forest; Imbalanced Dataset

杜建超、丁俊瑶、赵梦楠、连建奇、陈天艳、WU Yuan、周云、石磊


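School of Telecommunications Engineering, Xidian University, Xi'an, Shaanxi 710071

The Second Affiliated Hospital of Air Force Medical University, Xi'an, Shaanxi 710038

The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi 710061

Duke University Health System, Durham, NC 27710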


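Funding: Frontier Interdisciplinary Research Project of the Second Affiliated Hospital of Air Force Medical University (No. 2021QYJC-005)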

2024

Progress in Biomedical Engineering (生物医学工程学进展)
Shanghai Society of Biomedical Engineering


Impact factor: 0.504
ISSN: 1674-1242
Year, volume (issue): 2024, 45(3)