Thematic Apperception Test for Suicide Risk Identification: An Audio and Text-Based Machine Learning Study
Suicide is not only a personal tragedy but also has far-reaching effects. Identifying suicide risk is an important part of suicide prevention. Because traditional screening methods based on self-report scales have high rates of misreporting and underreporting, an objective and effective identification tool is needed. Previous studies that built suicide risk identification models from audio data have yielded good results, but the test materials they used lacked theoretical support and were time-consuming to administer. In addition, the lack of a standardized procedure made it difficult to collect enough data to train a model suitable for practical use. This study therefore adapts the widely used Thematic Apperception Test (TAT) in two steps: first, adapting the test materials into an online test and building a model from the resulting data; second, developing a WeChat app that collects high-quality audio data through a standardized procedure, and building a suicide risk model from those data.
Study 1 established a standardized procedure for administering the TAT online via Tencent Meeting. Audio from 64 subjects who completed the test (High Risk Group: 34; Low Risk Group: 30) was included in the analysis. After pre-processing, speech and text features were extracted, and four classifiers (SVM, LR, RF, KNN) were used to build the models. The findings were as follows. (1) Models built on three of the TAT pictures performed best. For Picture 5, for example, the LR model achieved an average ACC = .80 and an average AUC = .90. The best-performing classifiers were LR and SVM. (2) The analysis of narrative duration showed that subjects in the crisis group generally produced longer narratives. (3) Word frequency analysis of the full narrative texts using KH Coder found more words related to suicide, self-injury, and negative emotions in the narratives of the crisis group, and keyword co-occurrence network analysis found more themes of suicide and self-injury in those narratives. The results of Study 1 confirm the feasibility of administering the TAT online and of using speech and text features to identify suicide risk. However, the test is still time-consuming and requires an examiner to administer it, so experimenter bias is possible.
Study 2 was conducted to further standardize the procedure, shorten the test, and make it more convenient, and thereby improve the applicability of the adapted TAT. A WeChat app was designed and implemented, and two pictures from Study 1 (Picture 5 and Picture 10) were used as test materials, which the subjects administered themselves. Audio from 58 subjects was included in the analysis (High Risk Group: 29; Low Risk Group: 29). After feature extraction, four classifiers were trained and evaluated. The LR model trained on features extracted from the combined audio for Picture 5 and Picture 10 achieved the highest ACC of all models (mean ACC = .83, mean AUC = .89). These results suggest that models built on audio from a self-administered test can also perform satisfactorily. Compared with previous studies, the model achieved a better composite performance index while requiring less test time. The short administration time, ease of administration, and standardized procedure of the adapted TAT app also facilitate the collection of more high-quality samples, which supports building a model that generalizes better and can serve as an aid in identifying suicide risk.
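The two metrics reported above, ACC and AUC, can be illustrated with a short sketch. This is not the study's code: the labels, scores, 0.5 decision threshold, and the function names `accuracy` and `roc_auc` are hypothetical and chosen only for illustration. AUC is computed here as the probability that a randomly chosen high-risk case receives a higher score than a randomly chosen low-risk case, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A pair counts 1 if the positive case has the higher score,
    0.5 if the scores are tied, and 0 otherwise.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case from each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = high-risk group, 0 = low-risk group.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # made-up classifier scores

# Assumed decision threshold of 0.5 for turning scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"ACC = {acc:.2f}, AUC = {auc:.2f}")
```

ACC depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can show a higher AUC than ACC, as in the figures reported above.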