Prospective prediction of suicide attempts in community adolescents and young adults, using regression methods and machine learning
Miche, M., Studerus, E., Meyer, A.H., Gloster, A.T., Beesdo-Baum, K., Wittchen, H-U., & Lieb, R.
The use of machine learning (ML) algorithms to study suicidality has recently been recommended. Our aim was to explore whether ML approaches have the potential to improve the prediction of suicide attempt (SA) risk. Using the epidemiological multiwave prospective-longitudinal Early Developmental Stages of Psychopathology (EDSP) data set, we compared four algorithms (logistic regression, lasso, ridge, and random forest) in predicting a future SA in a community sample of adolescents and young adults.
The EDSP Study prospectively assessed, over the course of 10 years, adolescents and young adults aged 14-24 years at baseline. Of 3021 subjects, 2797 were eligible for prospective analyses because they participated in at least one of the three follow-up assessments. Sixteen baseline predictors, all selected a priori from the literature, were used to predict follow-up SAs. Model performance was assessed using repeated nested 10-fold cross-validation. The main measure of predictive performance was the area under the receiver operating characteristic curve (AUC).
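To make the evaluation scheme concrete, the following is a minimal sketch of nested 10-fold cross-validation comparing the four model types by AUC. It is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the EDSP predictors, and the function name `compare_models`, the regularization grid, the forest size, and the single repetition are all illustrative choices. Unpenalized logistic regression is approximated here by a very weak penalty.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    GridSearchCV,
    RepeatedStratifiedKFold,
    StratifiedKFold,
    cross_val_score,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(X, y, n_splits=10, n_repeats=1, seed=0):
    """Return the mean outer-fold AUC for each of the four model types."""
    # Inner folds tune hyperparameters; outer folds estimate performance.
    inner_cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    outer_cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    # Assumed grid of inverse regularization strengths (not from the paper).
    c_grid = {"logisticregression__C": [0.01, 0.1, 1.0]}

    def tuned_logistic(penalty):
        # Penalized logistic regression whose C is chosen in the inner loop.
        pipe = make_pipeline(
            StandardScaler(),
            LogisticRegression(penalty=penalty, solver="liblinear"),
        )
        return GridSearchCV(pipe, c_grid, scoring="roc_auc", cv=inner_cv)

    models = {
        # Very large C makes the default L2 penalty negligible.
        "logistic regression": make_pipeline(
            StandardScaler(), LogisticRegression(C=1e6, max_iter=1000)
        ),
        "lasso": tuned_logistic("l1"),
        "ridge": tuned_logistic("l2"),
        # Left untuned to keep the sketch short; a grid could be added as above.
        "random forest": RandomForestClassifier(n_estimators=50, random_state=seed),
    }
    return {
        name: float(np.mean(cross_val_score(model, X, y, scoring="roc_auc", cv=outer_cv)))
        for name, model in models.items()
    }


# Synthetic stand-in: 16 predictors and a rare binary outcome.
X, y = make_classification(
    n_samples=600,
    n_features=16,
    n_informative=5,
    n_redundant=0,
    n_clusters_per_class=1,
    weights=[0.9, 0.1],
    random_state=0,
)
results = compare_models(X, y)
for name, auc in results.items():
    print(f"{name}: mean AUC = {auc:.3f}")
```

Raising `n_repeats` repeats the outer split with different fold assignments, which is one way to read "repeated" in the description above. Because tuning happens only inside each outer training fold, the outer AUC estimates are not biased by the hyperparameter search.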
The mean AUCs of the four predictive models, logistic regression, lasso, ridge, and random forest, were 0.828, 0.826, 0.829, and 0.824, respectively.
Based on our comparison, all four algorithms performed equally well in distinguishing future SA cases from non-SA cases in community adolescents and young adults. Other considerations, such as ease of implementation, might nevertheless lead to one algorithm being prioritized over another in some instances. Further research and replication studies are required in this regard.