Year: 2022 Source: Suicide and Life-Threatening Behavior. (2022), 52(4), 782-791. SIEC No: 20221144

This study aimed to improve the accuracy of classification of deaths of undetermined intent and to examine racial differences in misclassification.
We used natural language processing and statistical text analysis on restricted-access case narratives of suicides, homicides, and undetermined deaths in 37 states collected from the National Violent Death Reporting System (NVDRS) (2017). We fit separate race-specific classification models to predict suicide among undetermined cases using data from known homicide cases (true negatives) and known suicide cases (true positives).
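The general idea of training on narratives with known intent and then scoring undetermined cases can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not name a model, so a small multinomial naive Bayes classifier stands in for it here. The group names, the helper functions (`train`, `prob_suicide`, `top_terms`), and the short keyword-style narratives are invented placeholders, not NVDRS data.

```python
import math
from collections import Counter

def tokenize(text):
    # Simplistic whitespace tokenizer; real narratives would need more care.
    return text.lower().split()

def train(narratives, labels):
    """Fit a multinomial naive Bayes model.
    labels: 1 = known suicide, 0 = known homicide (both classes must be present)."""
    counts = {0: Counter(), 1: Counter()}
    for text, y in zip(narratives, labels):
        counts[y].update(tokenize(text))
    vocab = set(counts[0]) | set(counts[1])
    return {"counts": counts, "n_docs": Counter(labels), "vocab": vocab}

def log_likelihood(model, token, y):
    # Laplace-smoothed log P(token | class y).
    c = model["counts"][y]
    return math.log((c[token] + 1) / (sum(c.values()) + len(model["vocab"])))

def prob_suicide(model, text):
    """Posterior probability that a narrative belongs to the suicide class."""
    total = sum(model["n_docs"].values())
    scores = {}
    for y in (0, 1):
        score = math.log(model["n_docs"][y] / total)
        for tok in tokenize(text):
            if tok in model["vocab"]:  # ignore tokens unseen in training
                score += log_likelihood(model, tok, y)
        scores[y] = score
    m = max(scores.values())
    e0, e1 = math.exp(scores[0] - m), math.exp(scores[1] - m)
    return e1 / (e0 + e1)

def top_terms(model, k=3):
    """Tokens with the largest log-odds in favour of the suicide class."""
    def log_odds(t):
        return log_likelihood(model, t, 1) - log_likelihood(model, t, 0)
    return sorted(model["vocab"], key=log_odds, reverse=True)[:k]

# Invented placeholder narratives, one labelled set per (hypothetical) group.
labelled = {
    "group_a": (["note found at scene", "history of depression note",
                 "argument with suspect", "suspect fled scene"], [1, 1, 0, 0]),
    "group_b": (["recent job loss reported", "prior attempt reported",
                 "dispute with suspect", "suspect arrested nearby"], [1, 1, 0, 0]),
}

# One model per group, mirroring the idea of group-specific classifiers.
models = {g: train(texts, ys) for g, (texts, ys) in labelled.items()}

# Score undetermined cases with the model for the matching group.
undetermined = [("group_a", "note found"), ("group_b", "prior attempt")]
for group, text in undetermined:
    print(group, round(prob_suicide(models[group], text), 2), top_terms(models[group]))
```

Fitting one model per group lets the most predictive terms differ between groups, which is the kind of difference in predictive phrases that the abstract reports. A pooled model would instead learn a single vocabulary weighting for everyone.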
A classifier trained on an all-race dataset predicts fewer than half of these cases as suicide. Importantly, our analysis yields an estimated suicide rate for the Black population comparable with the typical detection rate for the White population, indicating that misclassification excess is endemic for Black suicide. This problem may be mitigated by using race-specific data. Our findings, based on the statistical text analysis, also reveal systematic differences in the phrases identified as most predictive of suicide.
This study highlights the need to understand the reasons underlying differences in suicide rates and to further test strategies for reducing misclassification, particularly among people of color.