Simultaneous Diagnosis of Multiple Mental Disorders: K Nearest Neighbor Multi-Label Learning Based on Label Correlation
With the rapid development of technology, the economy, and society, people's quality of life has greatly improved. However, the fast pace of work and life also brings tremendous stress. Stress, anxiety, and worrying life events, coupled with chemical imbalances, can give rise to a variety of psychological disorders in individual patients, such as depression, bipolar disorder, and anxiety disorder. To the best of our knowledge, traditional machine learning algorithms identify only one of these disorders for each patient and do not provide a comprehensive diagnosis of the major psychological disorders the patient suffers from. Yet psychological problems often involve multiple interdependent disorders at the same time. To address this, the paper proposes a machine learning solution built on an improved multi-label K nearest neighbor (ML-KNN) model: a multi-label learning algorithm based on label correlation. First, the algorithm mines the correlation between the multiple labels of a sample, using the FP-growth algorithm to obtain the frequent itemsets of labels. It then constructs a scoring model and a threshold model for the frequent itemsets and labels. The former measures the degree of correlation between a sample and each frequent itemset or label, the latter solves for the discriminative threshold corresponding to each frequent itemset or label, and the two are combined to predict the frequent itemsets of each sample. Second, in addition to label correlation, Gaussian weights are introduced to quantify the distance between different instances. Finally, to handle the empty label sets that the model may predict, the paper applies the traditional K nearest neighbors (KNN) algorithm as a secondary prediction step for samples with empty labels, which further improves prediction accuracy. The algorithm is first compared with KNN on the public datasets Reuters, scene, and emotions to verify its effectiveness. Experiments are then conducted on the clinical dataset, where the algorithm achieves 63.57%, 72.73%, 73.17%, and 70.04% for F1 score, accuracy, α accuracy, and Rate index, respectively. Compared with the experimental results of each benchmark method on this dataset, the average performance improvement is 2.8%, indicating that the algorithm makes reasonable use of the correlation between labels and of Gaussian weights to quantify the distance between instances.
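Two of the ideas summarized above, Gaussian weighting of neighbor distances and a KNN fallback for empty predictions, can be illustrated with a minimal sketch. This is not the paper's implementation: it omits the FP-growth step and the scoring and threshold models entirely, and the function names, the bandwidth `sigma`, the fixed `threshold`, and the majority-vote fallback rule are all assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def gaussian_weight(dist, sigma=1.0):
    """Gaussian kernel: closer neighbors get weights nearer to 1.

    sigma is an assumed bandwidth parameter, not a value from the paper.
    """
    return math.exp(-(dist ** 2) / (2 * sigma ** 2))


def predict_labels(x, train_X, train_Y, k=3, sigma=1.0, threshold=0.5):
    """Predict a label set for x from its k nearest training samples.

    Each label is scored by the normalized Gaussian weight of the
    neighbors that carry it; labels scoring at least `threshold` are kept.
    If no label passes, fall back to a plain (unweighted) KNN majority
    vote so the returned set is not empty. Both rules are illustrative
    assumptions.
    """
    # Sort by distance only, so label sets are never compared.
    neighbours = sorted(
        ((euclidean(x, xi), yi) for xi, yi in zip(train_X, train_Y)),
        key=lambda pair: pair[0],
    )[:k]

    weights = [gaussian_weight(d, sigma) for d, _ in neighbours]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: treat the neighbors as equally weighted.
        weights = [1.0] * len(neighbours)
        total = float(len(neighbours))

    scores = Counter()
    for w, (_, labels) in zip(weights, neighbours):
        for label in labels:
            scores[label] += w / total

    predicted = {label for label, s in scores.items() if s >= threshold}

    if not predicted:
        # Secondary prediction: unweighted vote among the same neighbors.
        votes = Counter(label for _, labels in neighbours for label in labels)
        if votes:
            top = max(votes.values())
            predicted = {label for label, v in votes.items() if v == top}
    return predicted


# Toy data: feature vectors and label sets are made up for illustration.
train_X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
train_Y = [
    {"depression", "anxiety"},
    {"depression"},
    {"bipolar"},
    {"bipolar", "anxiety"},
]

print(predict_labels((0.0, 0.2), train_X, train_Y, k=3))
```

In this toy example the two nearby samples dominate the weighted scores, while the distant third neighbor contributes almost nothing. Setting `threshold` above 1 forces the empty-prediction case and exercises the fallback vote.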