Chinese Journal of Behavioral Medicine and Brain Science 2023, Vol. 32, Issue 9: 787-793. DOI: 10.3760/cma.j.cn371468-20230420-00183

Development of a predictive model of self-injurious behavior among college students and analysis of its influencing factors

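Cheng Nan (成楠) 1, Liao Runchao (廖润超) 2, Zhang Linyu (张琳钰) 1, Liu Yanli (刘艳丽) 2, Che Jiajun (车佳郡) 1, Li Xiaomin (李晓敏) 1, Liu Haining (刘海宁) 1, Feng Xuequan (冯学泉)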

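Author information

  • 1. Department of Psychology, Chengde Medical University, Chengde 067000
  • 2. Department of Biomedical Engineering, Chengde Medical University, Chengde 067000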

Abstract

Objective To build a predictive model of self-injurious behavior among college students using machine learning algorithms and to identify the high-risk factors for such behavior.

Methods From November to December 2022, 791 students at a university in Hebei Province were recruited by convenience sampling. The occurrence of self-injurious behavior was the outcome variable. Basic demographic data were collected, and participants completed the adolescent self-harm questionnaire, the learned helplessness scale, the Chinese version of the interpersonal needs questionnaire, the adolescent life events scale, and the childhood trauma questionnaire. The predictor variables were analyzed statistically with SPSS 26.0. Three machine learning models (random forest, support vector machine, and logistic regression) were used to predict self-injurious behavior, and their performance was evaluated by accuracy, F1 score, sensitivity, specificity, and AUC. The best-performing model was then used to analyze the high-risk factors for self-injurious behavior.

Results (1) Univariate analysis showed that the detection rate of self-injurious behavior among college students was 42.4% (335/791). The detection rate was significantly higher in male than in female students (χ²=14.139, P<0.05), and significantly higher in students with a lower-middle monthly household income (RMB 3,000-5,999) than in students from other income groups (P<0.05). (2) For the random forest, support vector machine, and logistic regression models, respectively, accuracy was 85.53%, 85.96%, and 68.86%; F1 score was 0.853, 0.864, and 0.676; sensitivity was 83.91%, 89.04%, and 64.91%; and AUC was 0.92, 0.89, and 0.73. (3) Based on the random forest algorithm, which showed the better predictive performance, the top ten feature variables among the high-risk factors were, in order: emotional abuse, thwarted belongingness, helplessness, the interpersonal relationship factor, hopelessness, emotional neglect, the academic stress factor, monthly household income, perceived burdensomeness, and the health adaptation factor.

Conclusions Random forest predicts self-injurious behavior among college students better than support vector machine and logistic regression. The factors influencing self-injury among college students stem mainly from environmental, individual, and interpersonal factors.
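The evaluation metrics named in the abstract (accuracy, F1 score, sensitivity, specificity, and AUC) can be illustrated with a minimal Python sketch. The confusion-matrix counts and risk scores below are hypothetical and are not the study's data, and the class and function names are illustrative assumptions rather than anything taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (positive class = self-injury reported)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def accuracy(cm: ConfusionMatrix) -> float:
    # Share of all cases that were classified correctly.
    return (cm.tp + cm.tn) / (cm.tp + cm.fp + cm.tn + cm.fn)


def sensitivity(cm: ConfusionMatrix) -> float:
    # Share of true positive cases that the model detected (recall).
    return cm.tp / (cm.tp + cm.fn)


def specificity(cm: ConfusionMatrix) -> float:
    # Share of true negative cases that the model correctly ruled out.
    return cm.tn / (cm.tn + cm.fp)


def f1_score(cm: ConfusionMatrix) -> float:
    # Harmonic mean of precision and recall, written in terms of counts.
    return 2 * cm.tp / (2 * cm.tp + cm.fp + cm.fn)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    # Probability that a random positive case gets a higher score than a
    # random negative case; tied scores count as half a win.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test-set results, for illustration only.
example = ConfusionMatrix(tp=40, fp=10, tn=45, fn=5)
print(f"accuracy    = {accuracy(example):.3f}")
print(f"sensitivity = {sensitivity(example):.3f}")
print(f"specificity = {specificity(example):.3f}")
print(f"F1 score    = {f1_score(example):.3f}")

# Hypothetical labels and predicted risk scores, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(f"AUC         = {auc(y_true, y_score):.3f}")
```

Accuracy, sensitivity, specificity, and F1 depend on a single classification threshold, whereas AUC summarizes ranking quality across all thresholds, so two models can be ordered differently by the two kinds of metric.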


Key words

Self-injury/Machine learning/Random forest/Prediction model/College students


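Funding

Hebei Provincial Education Science Planning Project (14th Five-Year Plan) (2203222)

Hebei Province High-Level Talent Project (B20221019)

Chengde Medical University College Students' Innovation and Entrepreneurship Training Program (2023010)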

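Publication year

2023

Chinese Journal of Behavioral Medicine and Brain Science
Chinese Medical Association; Jining Medical University

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 1.472
ISSN: 1674-6554
Number of references: 12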