Digital Phenotypes for Mental Health using AI

Advised by: Daniel Rubin, MD, MS

Professor of Biomedical Data Science, Radiology, and Medicine (Biomedical Informatics) and (by courtesy) Computer Science and Ophthalmology, Stanford University

Table of contents

1. Project 1: Identifying subtypes of neuro-psychiatric disorders through linguistic phenotypes
2. Project 2: Suicidality prediction from large-scale social media postings

Clinical interactions with mental health patients provide only a small window into their condition over time, so opportunities for detection or intervention are missed. Traditional approaches to risk prediction show accuracy estimates of just 50%. Instead, we apply NLP and other AI techniques to narrative text to study psychological disorders at a large scale.

Project 1: Identifying subtypes of neuro-psychiatric disorders through linguistic phenotypes

Collaborators: Scott Fleming, Shaimaa Bakr, Imon Banerjee, Daniel Rubin

A complete understanding of neuro-psychiatric disorders, particularly their subtypes and phenotypic differences, is necessary for developing good diagnostic tests. In a conventional psychological setting, the primary method of diagnosis is to map symptoms observed over a few interviews to those delineated in the DSM-5 (Diagnostic and Statistical Manual of Mental Disorders). Unfortunately, studies show that mental health is much more nuanced and diverse than the current understanding captured in the DSM-5. Persisting in assigning the “best fitting” DSM-5 category to a patient often leads to misdiagnosis and failed treatments.

Interestingly, we see an overlap between DSM-5 criteria and what is discussed on social media, particularly in subreddits. In this work, we analyse whether depression subtypes can be extracted from the unstructured natural language of large-scale social media postings.

Figure 1: Comorbidities in Subreddits vs DSM-5

Project 2: Suicidality prediction from large-scale social media postings

Collaborators: Isha Rajput, Daniel Rubin

Progress in predicting suicide risk among vulnerable populations has been limited. Studies have shown that suicidality cannot be predicted effectively through the standard practice of clinicians asking people in person about suicidal thoughts: 80% of patients who were not already undergoing psychiatric treatment and who died by suicide had reported having no suicidal thoughts when asked by their GP. Social media, on the other hand, offers a lens into the patient’s mental state away from the clinic.

The challenge with this line of work is the lack of training labels. Previous work has resorted to using self-reported cases or clues from the text as labels, or has had psychologists label a small subset of the data. In this body of work, we develop methods to train a hierarchical LSTM model to identify suicide risk from large-scale unlabelled social media (Reddit) datasets by applying weak supervision techniques to a small, well-curated data subset hand-labelled by expert clinicians. Moreover, we featurize unconventional yet important time-related clinical risk indicators, such as difficulty sleeping, engagement, and withdrawal, to boost the performance of our model.
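To make the idea of weak supervision concrete, the following is a minimal sketch, not the project's actual pipeline. It assumes a few hypothetical heuristic "labeling functions" (the phrases, the late-night posting window, and all function and field names are illustrative assumptions, not taken from this work). Each function votes on a post or abstains, its accuracy is estimated on a small expert-labelled subset, and the accuracy-weighted votes produce noisy labels for unlabelled posts. Using raw accuracy as the vote weight is a deliberate simplification.

```python
# Illustrative weak-supervision sketch; all heuristics below are hypothetical.
ABSTAIN, LOW_RISK, HIGH_RISK = -1, 0, 1

def lf_hopeless_language(post):
    """Vote HIGH_RISK if the text contains a (hypothetical) hopelessness phrase."""
    text = post["text"].lower()
    return HIGH_RISK if any(p in text for p in ("no way out", "can't go on")) else ABSTAIN

def lf_late_night(post):
    """Vote HIGH_RISK for posts made between 01:00 and 04:59 (a crude sleep proxy)."""
    return HIGH_RISK if 1 <= post["hour"] < 5 else ABSTAIN

def lf_positive_plans(post):
    """Vote LOW_RISK if the text mentions (hypothetical) forward-looking phrases."""
    text = post["text"].lower()
    return LOW_RISK if any(p in text for p in ("looking forward", "excited about")) else ABSTAIN

LABELING_FUNCTIONS = [lf_hopeless_language, lf_late_night, lf_positive_plans]

def estimate_accuracies(labelled_posts, lfs):
    """Estimate each function's accuracy on the posts where it does not abstain."""
    accuracies = []
    for lf in lfs:
        fired = [(lf(p), p["label"]) for p in labelled_posts if lf(p) != ABSTAIN]
        if not fired:
            accuracies.append(0.5)  # no evidence: treat as a coin flip
        else:
            accuracies.append(sum(v == y for v, y in fired) / len(fired))
    return accuracies

def weak_label(post, lfs, weights):
    """Combine votes by accuracy-weighted majority; abstain on ties or no votes."""
    scores = {LOW_RISK: 0.0, HIGH_RISK: 0.0}
    for lf, weight in zip(lfs, weights):
        vote = lf(post)
        if vote != ABSTAIN:
            scores[vote] += weight
    if scores[LOW_RISK] == scores[HIGH_RISK]:
        return ABSTAIN
    return max(scores, key=scores.get)

# Tiny made-up stand-in for the expert-labelled subset.
expert_labelled = [
    {"text": "I feel there is no way out", "hour": 3, "label": HIGH_RISK},
    {"text": "Up late studying, excited about the results", "hour": 2, "label": LOW_RISK},
    {"text": "Looking forward to the weekend", "hour": 14, "label": LOW_RISK},
]
weights = estimate_accuracies(expert_labelled, LABELING_FUNCTIONS)

unlabelled_post = {"text": "Excited about my new job", "hour": 2}
noisy_label = weak_label(unlabelled_post, LABELING_FUNCTIONS, weights)
```

In a setup like this, the noisy labels would serve as training targets for the downstream classifier (a hierarchical LSTM in this project), and posts on which every function abstains would simply be left out of training.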
