Study of a risk prediction model for cryptogenic stroke in patients with right-to-left shunt

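Chinese abstract (translated): Objective To use machine learning to predict the risk of cryptogenic stroke (CS) in people with right-to-left shunt (RLS), and to provide a solution for accurate and efficient prediction of CS. Methods Clinical data were retrospectively analyzed for 289 RLS subjects with a positive contrast-enhanced transcranial Doppler (c-TCD) bubble test who were treated in the Department of Neurology, Laoshan Campus, the Affiliated Hospital of Qingdao University, from January 2018 to September 2023. The data included demographic information, medical history, laboratory test indicators, diagnosis, and treatment. The dataset was randomly divided into a training set and a test set at a ratio of 8:2 using the machine learning function train_test_split(). Risk prediction models for CS in the RLS population were built with Logistic regression, decision tree, random forest, extreme gradient boosting, artificial neural network, gradient boosting, extra trees, and adaptive boosting algorithms. Model performance was evaluated comprehensively using the receiver operating characteristic (ROC) curve and the area under the curve (AUC), confusion matrix, precision, recall, accuracy, F1 score, calibration curve, and decision curve. The best-performing model was analyzed for interpretability using feature importance and SHAP values. The t-test, Mann-Whitney U test, and χ² test were performed with SPSS 25.0. The DeLong test was used to compare the difference in AUC between two models. Results Of the 289 RLS subjects, 166 (57.5%) had CS and 123 (42.5%) did not. Statistical analysis showed that blood biochemical indicators such as D-dimer, mean platelet volume, and fibrinogen were higher in CS patients than in non-CS patients (all P<0.01); no variable differed significantly between the training and test sets (all P>0.05). For CS risk prediction on the test set, the random forest model achieved the highest AUC (0.885), precision (0.806), recall (0.879), accuracy (0.810), and F1 score (0.841). The calibration curve showed that the random forest model was closest to the reference line, and the decision curve showed that it had a greater net benefit. Interpretability analysis showed that high-risk factors included mean platelet volume, D-dimer, international normalized ratio, body mass index, and age. Conclusion The random forest-based prediction tool performed well and was highly accurate in predicting CS risk in the RLS population.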
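The workflow named in the abstract (a random 8:2 split with train_test_split(), a random forest classifier, and AUC, precision, recall, accuracy, and F1 on the test set) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic (generated by make_classification to mimic only the cohort size and class balance), and the feature count, hyperparameters, and random seeds are all assumed, since the abstract does not report them.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 289-subject cohort (NOT the study data).
# Class weights roughly mimic the reported 42.5% non-CS / 57.5% CS balance.
X, y = make_classification(n_samples=289, n_features=10, n_informative=5,
                           weights=[0.425, 0.575], random_state=0)

# Random 8:2 split, as described in the abstract (seed is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(len(X_train), len(X_test))  # 231 58

# Random forest with assumed, illustrative hyperparameters.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)              # hard class labels
y_prob = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# The test-set metrics listed in the abstract.
metrics = {
    "auc": roc_auc_score(y_test, y_prob),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "accuracy": accuracy_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Impurity-based feature importance, one of the two interpretability
# tools mentioned (SHAP values would need the separate `shap` package).
top5 = np.argsort(model.feature_importances_)[::-1][:5]
print("top-5 feature indices:", top5.tolist())
```

The numbers printed here describe the synthetic data only and say nothing about the study's results. Comparing several classifiers, as the abstract describes, would amount to repeating the fit-and-score step with a different estimator on the same split.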

English title: A study on the risk prediction model for cryptogenic stroke in patients with right-to-left shunt

English abstract: Objective To predict the risk of cryptogenic stroke (CS) in patients with right-to-left shunt (RLS) by machine learning, and to provide potential solutions for accurate and efficient prediction of CS. Methods A retrospective analysis was performed on clinical data from 289 subjects with RLS detected by a positive contrast-enhanced transcranial Doppler (c-TCD) test who were treated in the Department of Neurology at Laoshan Campus, the Affiliated Hospital of Qingdao University, from January 2018 to September 2023. The data included demographic information, medical history, laboratory test indicators, diagnosis, and treatment. The dataset was randomly divided into a training set and a testing set by the machine learning function train_test_split(), with a ratio of 8:2. Risk prediction models for CS in RLS subjects were constructed with Logistic regression, decision tree, random forest, extreme gradient boosting, artificial neural network, gradient boosting, extra trees, and adaptive boosting algorithms. Model performance was evaluated by receiver operating characteristic (ROC) curves, area under the curve (AUC), confusion matrix, precision, recall, accuracy, F1 score, calibration curves, and decision curve analysis. The optimal model was subjected to interpretability analysis using feature importance and SHAP values. The t-test, Mann-Whitney U test, and χ² test were performed with SPSS 25.0 software. The DeLong test was used to compare the difference in AUC between two models. Results Among the 289 RLS subjects, there were 166 cases of CS (57.5%) and 123 cases of non-CS (42.5%). Statistical analysis showed that blood biochemical indicators such as D-dimer, mean platelet volume, and fibrinogen were higher in CS patients than in non-CS patients (all P<0.01). There were no statistically significant differences in variables between the training and testing sets (all P>0.05). The random forest model achieved the highest AUC (0.885), precision (0.806), recall (0.879), accuracy (0.810), and F1 score (0.841) for CS risk prediction in the testing set. The calibration curve showed that the random forest model was closest to the reference line, and the decision curve analysis indicated that it had a greater net benefit. The interpretability analysis revealed that high-risk factors included mean platelet volume, D-dimer, international normalized ratio, body mass index, and age. Conclusion The random forest-based prediction tool exhibits excellent performance, demonstrating high accuracy in predicting CS risk in the RLS population.
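The reported F1 score can be cross-checked against the reported precision and recall, because F1 is by definition their harmonic mean. The short sketch below does that arithmetic with the abstract's figures; the helper name f1_from is hypothetical, and whether the authors computed F1 this way or read it from a library is not stated.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)

# Figures reported in the abstract for the random forest on the test set.
reported_precision = 0.806
reported_recall = 0.879
reported_f1 = 0.841

f1 = f1_from(reported_precision, reported_recall)
print(round(f1, 3))  # 0.841
```

The recomputed value agrees with the reported F1 to three decimal places.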

Keywords (English):

Cryptogenic stroke; Right-to-left shunt; Machine learning; Predictive model; Random forest model

Authors:

唐素娟, 吴庆文, 李玲儿, 李道静, 赵洪芹

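Author affiliations:

Department of Neurology, the Affiliated Hospital of Qingdao University, Qingdao 266035

Department of Critical Care Medicine, Affiliated Hospital of Jining Medical University, Jining 272030

Data Center, Affiliated Hospital of Jining Medical University, Jining 272030

Department of Neurology, Affiliated Hospital of Jining Medical University, Jining 272030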

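Keywords (translated from Chinese):

Cryptogenic stroke; Right-to-left shunt; Machine learning; Prediction model; Random forest model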

Funding:

National Natural Science Foundation of China; Jining Key Research and Development Program, Soft Science Project

Project numbers:

81901228; 2019SMNS002

Publication year:

2024

DOI:

10.3760/cma.j.cn371468-20231202-00281

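Chinese Journal of Behavioral Medicine and Brain Science (中华行为医学与脑科学杂志)

Chinese Medical Association; Jining Medical University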

CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.472

ISSN: 1674-6554

Year, volume (issue): 2024, 33(6)