Laboratory Medicine (检验医学) 2024, Vol. 39, Issue 12: 1190-1195. DOI: 10.3969/j.issn.1673-8640.2024.12.010

Stroke recurrence prediction model based on machine learning algorithms using routine blood test

Shen Zhan¹, Bian Xiaobo², Huang Ying¹, Wang Siyang¹, Shen Tingting¹, Zhang Xian¹, Song Yunxiao², Xie Lianhong¹

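Author information

  • 1. Department of Geriatrics, Shanghai Xuhui District Central Hospital, Shanghai 200237
  • 2. Department of Clinical Laboratory, Shanghai Xuhui District Central Hospital, Shanghai 200237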

Abstract

Objective To make a preliminary construction of a stroke recurrence prediction model based on routine laboratory tests using machine learning algorithms. Methods A total of 437 stroke patients treated at Shanghai Xuhui District Central Hospital from January 2010 to December 2023 were enrolled and followed up retrospectively. Patients who had another stroke during follow-up were assigned to the recurrence group, and those who did not were assigned to the non-recurrence group. The patients were randomly divided into a training set and a validation set at a ratio of 7:3. Blood lipid and routine blood test results at the time of the initial stroke were collected for all patients. Prediction models were built in the training set with 5-fold cross-validation, using the random forest (RF), XGBoost, AdaBoost, K-nearest neighbors (KNN) and logistic regression (LR) algorithms. The performance of the models in identifying stroke recurrence was evaluated with receiver operating characteristic (ROC) curves and precision-recall (PR) curves. Results The mean follow-up time of the 437 patients was 6.2 years, during which 184 patients had a recurrent stroke. In the training set, red blood cell (RBC) count, hemoglobin (Hb), mean corpuscular volume (MCV), absolute lymphocyte count (LYMPH#), total cholesterol (TC) and triglyceride (TG) were higher in the recurrence group than in the non-recurrence group (P<0.05); the other parameters did not differ significantly between the 2 groups (P>0.05). In the validation set, RBC count, Hb, MCV, TC and TG were higher in the recurrence group than in the non-recurrence group (P<0.05); the other parameters did not differ significantly between the 2 groups (P>0.05). In the training set, XGBoost had a higher area under the ROC curve (AUC) and a higher area under the precision-recall curve (PRAUC) than RF, AdaBoost, KNN and LR. In the validation set, the prediction model built with XGBoost achieved an AUC of 0.86 and a PRAUC of 0.82. Conclusions The stroke recurrence prediction model based on blood lipid and routine blood test parameters shows good clinical application value.
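
The evaluation workflow described in the abstract (a 7:3 split, 5-fold cross-validation inside the training set, then AUC and PRAUC) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' code: the abstract does not say whether the split was stratified, how fold assignment was done, or how PRAUC was computed, so the stratified split, the rounding rule, and the use of average precision as the PRAUC estimate are all assumptions. Model fitting (XGBoost and the other algorithms) is left out, and the toy scores are invented for demonstration and are not study data. All function names are hypothetical.

```python
import random

def stratified_split(labels, train_frac=0.7, seed=0):
    """Split indices into train/validation, shuffling each class separately (assumed scheme)."""
    rng = random.Random(seed)
    train, valid = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_frac)  # rounding rule is an assumption
        train += idx[:cut]
        valid += idx[cut:]
    return sorted(train), sorted(valid)

def k_fold(indices, k=5, seed=0):
    """Yield (fit_idx, held_out_idx) pairs for k-fold cross-validation."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]

def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve; assumes distinct scores."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / sum(y_true)

# Class counts taken from the abstract: 184 of 437 patients had a recurrence.
labels = [1] * 184 + [0] * 253
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))  # 306 131 under this rounding rule

folds = list(k_fold(train_idx))
print([len(held) for _, held in folds])

# Invented toy predictions, only to exercise the two metrics.
toy_y = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_auc(toy_y, toy_scores), average_precision(toy_y, toy_scores))
```

In practice each candidate algorithm would be fitted on `fit_idx` and scored on `held_out_idx` for every fold, and the selected model would then be scored once on the validation indices. The pairwise AUC above is quadratic in sample size, which is acceptable for a cohort of a few hundred patients.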

Key words

Blood lipid/Routine blood test/Machine learning/Prediction model/Stroke/Recurrence

Publication year

2024
Laboratory Medicine
Shanghai Center for Clinical Laboratory

CSTPCD
Impact factor: 1.715
ISSN: 1673-8640