Year: 2023
Source: Humanities and Social Sciences Communications. (2023). 10, 895. https://doi.org/10.1057/s41599-023-02212-w
SIEC No: 20232439
Suicide is a leading cause of death in the US. Online posts on social media can reveal valuable information about individuals with suicidal ideation and help prevent tragic outcomes. However, studying suicidality through online posts is challenging, as people may not be willing to share their thoughts directly because of psychological and social barriers. Moreover, most previous studies have focused on evaluating machine learning techniques for detecting suicidal posts rather than exploring the contextual features present in them. This study aimed not only to classify posts based on sentiment analysis, but also to identify suicide-related psychiatric stressors, e.g., family problems or school stress, and to examine the contextual features of the posts, especially those that are misclassified. We used two techniques, random forest and Lasso generalized linear models, and found that they performed similarly. Our findings suggest that while machine learning algorithms can identify most potentially harmful posts, they can also introduce bias, and human intervention is needed to minimize that bias. We argue that some posts may be very difficult or impossible for algorithms alone to tag correctly, and that such posts require human understanding and empathy.
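The abstract names two model families, random forest and Lasso generalized linear models, without implementation detail. As a minimal, hypothetical sketch of how such a comparison might be set up (assuming scikit-learn, TF-IDF text features, and L1-penalized logistic regression as the binary-classification form of a Lasso GLM; the placeholder texts and labels below are illustrative, not data from the study):

```python
# Illustrative only: compare a random forest and a Lasso (L1) logistic
# regression on TF-IDF features, the two model families the abstract names.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Placeholder corpus; the actual study used labeled social media posts.
texts = [
    "placeholder post mentioning school stress",
    "placeholder post mentioning family problems",
    "placeholder neutral post about the weather",
    "placeholder neutral post about a hobby",
] * 5  # repeated so 5-fold cross-validation has enough samples
labels = [1, 1, 0, 0] * 5  # 1 = flagged as potentially harmful

models = {
    "random forest": make_pipeline(
        TfidfVectorizer(),
        RandomForestClassifier(n_estimators=200, random_state=0),
    ),
    # L1 penalty drives many feature weights to zero, the Lasso behavior;
    # liblinear is one of the solvers that supports an L1 penalty.
    "Lasso logistic regression": make_pipeline(
        TfidfVectorizer(),
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    ),
}

# Comparing the two techniques on the same folds, as the study's
# "performed similarly" finding implies some common evaluation.
for name, model in models.items():
    scores = cross_val_score(model, texts, labels, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.2f}")
```

On real data, inspecting the posts both models misclassify (rather than only the scores) would correspond to the abstract's emphasis on contextual features of misclassified posts.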