Predictive modelling of deliberate self‑harm and suicide attempts in young people accessing primary care: A machine learning analysis of a longitudinal study
McHugh, C.M., Ho, N., Iorfino, F., Crouse, J.J., Nichles, A., Zmicerevska, N., ... & Hickie, I.B.
Purpose: Machine learning (ML) has shown promise in modelling future self-harm but has yet to be applied to key questions facing clinical services. In a cohort of young people accessing primary mental health care, this study aimed to establish (1) the performance of models predicting deliberate self-harm (DSH) compared to suicide attempt (SA), (2) the performance of models predicting new-onset or repeat behaviour, and (3) the relative importance of factors predicting these outcomes.

Methods: A total of 802 young people aged 12–25 years attending primary mental health services had detailed social and clinical assessments at baseline, and 509 completed 12-month follow-up. Four ML algorithms, as well as logistic regression, were applied to build four distinct models.

Results: On average, models predicting SA (mean AUC: 0.82) performed better than models predicting DSH (mean AUC: 0.72), with mean positive predictive values (PPV) approximately twice the prevalence of each outcome (SA: prevalence 14%, PPV 0.32; DSH: prevalence 22%, PPV 0.40). All ML models outperformed standard logistic regression. The most frequently selected variable in both the SA and DSH models was a history of DSH via cutting.

Conclusion: History of DSH and clinical symptoms of common mental disorders, rather than social and demographic factors, were the most important variables in modelling future behaviour. The performance of models predicting outcomes in key sub-cohorts, those with new-onset or repetition of DSH or SA during follow-up, was poor. These findings suggest that the performance of models of future DSH or SA may depend on knowledge of the individual’s recent history of either behaviour.
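
The "approximately twice the prevalence" statement can be checked with simple arithmetic. The following is a minimal sketch that uses only the prevalence and mean PPV figures reported above; the helper name `ppv_lift` and the dictionary layout are illustrative assumptions and are not taken from the study.

```python
# Prevalence and mean PPV for each outcome, as reported in the abstract.
reported = {
    "SA": {"prevalence": 0.14, "ppv": 0.32},
    "DSH": {"prevalence": 0.22, "ppv": 0.40},
}


def ppv_lift(ppv: float, prevalence: float) -> float:
    """Return the ratio of PPV to prevalence (hypothetical helper)."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    if not 0 <= ppv <= 1:
        raise ValueError("ppv must be in [0, 1]")
    return ppv / prevalence


# Ratio of PPV to prevalence for each outcome.
lifts = {
    outcome: ppv_lift(values["ppv"], values["prevalence"])
    for outcome, values in reported.items()
}

for outcome, lift in lifts.items():
    print(f"{outcome}: PPV is {lift:.2f}x the prevalence")
# SA: PPV is 2.29x the prevalence
# DSH: PPV is 1.82x the prevalence
```

Read this way, a young person flagged by the SA models was roughly twice as likely to have the outcome as a randomly selected member of the cohort. A mean PPV of 0.32 still implies that about two thirds of positive flags were false positives.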