Simultaneous Diagnosis of Multiple Mental Disorders: K Nearest Neighbor Multi-Label Learning Based on Label Correlation

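Abstract (translated from the Chinese): With the rapid development of technology, the economy, and society, people's quality of life has improved greatly. However, the fast pace of work and life also brings enormous pressure. Everyday distress such as tension, anxiety, and worry, together with the chemical imbalances in the body that it causes, can lead to a variety of mental disorders in patients, such as depression, bipolar disorder, and anxiety disorder. Mental health problems often involve several related mental disorders at the same time. To address this, the paper proposes a solution that uses machine learning and an improved multi-label K nearest neighbor (ML-KNN) model: a multi-label learning algorithm based on label correlation. The algorithm first uses the FP_growth algorithm to mine the correlations among the multiple labels of the samples and obtain the frequent itemsets of labels. It then builds a scoring model and a threshold model for the frequent itemsets and labels. The scoring model measures the degree of correlation between a sample and a frequent itemset or label, the threshold model determines the decision threshold for each frequent itemset or label, and the two are combined to predict the frequent itemsets of a sample. Next, taking label correlation into account, Gaussian weights are introduced to quantify the distance between instances. Finally, to handle the empty label sets that may arise in model prediction, the traditional K nearest neighbor (KNN) algorithm is used to make a second prediction for samples with empty labels, which further improves prediction accuracy. The algorithm is first compared with KNN on the public datasets reuters, scene, and emotions to verify its effectiveness. Experiments are then conducted on a clinical dataset, where the algorithm reaches an F1 score of 63.57%, accuracy of 72.73%, α accuracy of 73.17%, and a Rate index of 70.04%. Finally, compared with the results of each baseline method on this dataset, the average performance improvement is 2.8%, which shows that the algorithm makes reasonable use of label correlation and of Gaussian weights for quantifying the distance between instances.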
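The abstract describes two ideas that are easy to picture in code: weighting neighbours with a Gaussian function of distance, and falling back to plain KNN when the model would otherwise return an empty label set. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the `sigma` and `threshold` parameters, the form of the fallback (an unweighted majority vote among the same neighbours), and the toy data are all hypothetical, and the sketch omits the paper's frequent-itemset scoring and threshold models.

```python
import math
from collections import Counter

def gaussian_weight(distance, sigma=1.0):
    """Gaussian kernel: nearer neighbours get weights closer to 1."""
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))

def predict_labels(x, train_X, train_Y, k=3, sigma=1.0, threshold=0.5):
    """Gaussian-weighted multi-label vote with a plain-KNN fallback."""
    # Pair each training sample's distance to x with its label set,
    # then keep the k closest samples.
    pairs = [(math.dist(x, xi), yi) for xi, yi in zip(train_X, train_Y)]
    neighbours = sorted(pairs, key=lambda p: p[0])[:k]

    # Gaussian weights; if they all underflow to zero, use uniform weights.
    weights = [gaussian_weight(d, sigma) for d, _ in neighbours]
    total = sum(weights)
    if total == 0:
        weights = [1.0] * len(neighbours)
        total = float(len(neighbours))

    # Weighted share of the neighbourhood that carries each label.
    scores = Counter()
    for w, (_, labels) in zip(weights, neighbours):
        for label in labels:
            scores[label] += w / total
    predicted = {label for label, s in scores.items() if s >= threshold}

    # Fallback (assumed form): if no label clears the threshold, take the
    # label with the most unweighted votes among the same k neighbours.
    if not predicted:
        votes = Counter(label for _, labels in neighbours for label in labels)
        if votes:
            predicted = {votes.most_common(1)[0][0]}
    return predicted

# Hypothetical toy data: 2-D feature vectors with sets of diagnosis labels.
train_X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_Y = [{"depression", "anxiety"}, {"depression"},
           {"bipolar"}, {"bipolar", "anxiety"}]

print(predict_labels((0.0, 0.2), train_X, train_Y))
```

With these toy values the two nearby samples dominate the vote, so the distant "bipolar" sample contributes almost nothing. Setting `threshold` above 1 forces the empty-label branch, which makes the fallback easy to exercise.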

English title: Simultaneous Diagnosis of Multiple Mental Disorders: K Nearest Neighbor Multi-Label Learning Based on Label Correlation

English abstract: With the rapid development of technology, the economy, and society, people's quality of life has greatly improved. However, the fast pace of work and life also brings tremendous stress. Stress, anxiety, and worrying life events, coupled with chemical imbalances, can bring about a variety of psychological disorders in individual patients, such as depression, bipolar disorder, and anxiety disorder. To the best of our knowledge, traditional machine learning algorithms identify only one of these psychological disorders for each patient and do not achieve a comprehensive diagnosis of the major psychological disorders suffered by the patient. However, psychological problems often involve multiple dependent disorders at the same time. To address these problems, this paper proposes a solution that uses a machine learning approach and an improved multi-label K nearest neighbor (ML-KNN) model: a multi-label learning algorithm based on label correlation. The algorithm first uses the FP_growth algorithm to mine the correlation between the multiple labels of a sample and obtain the frequent itemsets of labels. It then constructs a scoring model and a threshold model for the frequent itemsets and labels. The former measures the degree of correlation between a sample and the frequent itemsets or labels, the latter solves for the discriminative thresholds corresponding to the frequent itemsets or labels, and the two are combined to predict the frequent itemsets of a sample. Second, considering label correlation, Gaussian weights are introduced to quantify the distance between different instances. Finally, to solve the problem of possible empty labels in model prediction, the paper also relies on the traditional K nearest neighbor (KNN) algorithm for a secondary prediction on samples with empty labels, which further improves prediction accuracy. The algorithm is first compared with KNN on the public datasets reuters, scene, and emotions to verify its effectiveness. Experiments are then conducted on a clinical dataset, where the algorithm achieves 63.57%, 72.73%, 73.17%, and 70.04% for F1 score, accuracy, α accuracy, and Rate index, respectively. Finally, compared with the experimental results of each benchmark method on this dataset, the average performance improvement is 2.8%, which shows that the algorithm reasonably uses the correlation between labels and Gaussian weights to quantify the distance between instances.
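To make the "frequent itemsets of labels" step concrete, here is a small sketch that counts label combinations across samples and keeps those whose support reaches a minimum. It deliberately swaps in brute-force enumeration for FP_growth: on a small label space both approaches should return the same frequent itemsets, but this is not the algorithm used in the paper. The function name, the `min_support` and `max_size` parameters, and the toy label sets are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def frequent_label_itemsets(label_sets, min_support=0.5, max_size=3):
    """Return {itemset: support} for label combinations with enough support.

    Brute-force stand-in for FP_growth: enumerate every combination of up
    to max_size labels in each sample and count how many samples contain it.
    """
    n = len(label_sets)
    counts = Counter()
    for labels in label_sets:
        ordered = sorted(labels)  # stable order so combinations are canonical
        for size in range(1, min(max_size, len(ordered)) + 1):
            for combo in combinations(ordered, size):
                counts[frozenset(combo)] += 1
    # Support is the fraction of samples that contain the whole itemset.
    return {items: c / n for items, c in counts.items() if c / n >= min_support}

# Hypothetical toy label sets, one per patient.
patients = [
    {"depression", "anxiety"},
    {"depression", "anxiety"},
    {"depression"},
    {"bipolar"},
]

for items, support in frequent_label_itemsets(patients).items():
    print(sorted(items), support)
```

In this toy data "depression" and "anxiety" co-occur in half of the samples, so the pair survives as a frequent itemset, while "bipolar" falls below the support threshold. An itemset like that pair is the kind of label correlation the abstract says the scoring and threshold models are built on.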

English keywords:

Mental illness; multi-label learning; K nearest neighbor; Gaussian weighting; label correlation

Authors:

Wang Tiantian (王甜甜), Zhang Xilin (张曦林), Xue Chuang (薛闯), Wei Guo (卫国), Xie Xiaoliang (谢小良)

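Author affiliations:

School of Science, Hunan University of Technology and Business, Changsha 410205

Hunan Provincial Key Laboratory of Statistical Learning and Intelligent Computation, Changsha 410205

School of Mathematical Sciences, University of Science and Technology of China, Hefei 230026

Affiliated Mental Health Center, Zhejiang University School of Medicine (Hangzhou Seventh People's Hospital), Hangzhou 310000

Department of Mathematics and Computer Science, University of North Carolina at Pembroke, North Carolina 28372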

Keywords:

Mental illness; multi-label learning; K nearest neighbor; Gaussian weight; label correlation

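Funding:

Key Project of the National Social Science Fund of China; Hunan Provincial Science and Technology Innovation Project

Project numbers:

22ATJ008; CX20221185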

Publication year:

2024

DOI:

10.12341/jssms23639

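Journal of Systems Science and Mathematical Sciences (系统科学与数学)

Academy of Mathematics and Systems Science, Chinese Academy of Sciences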

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.425

ISSN: 1000-0577

Year, volume (issue): 2024, 44(9)

Reference count: 1