For years, artificial intelligence has been touted as a potential game-changer for healthcare in the United States. Over a decade since the HITECH Act incentivized hospital systems to use electronic health records (EHR) for patient data management, there has been an explosion in the amount of healthcare data generated, stored, and available to drive insights and clinical decision-making.
The motivation to integrate AI into mental health services has grown during the pandemic. The Kaiser Family Foundation reported an increase in adults experiencing symptoms of anxiety and depression, from 1 in 10 adults pre-pandemic to 4 in 10 adults in early 2021. Coupled with a national shortage of mental health professionals as well as limited opportunities for in-person mental health support, AI-powered tools could be used as an entry point to care by automatically and remotely measuring and intervening to reduce mental health symptoms.
Many mental health startups are integrating AI within their product offerings. Woebot Health developed a chatbot that delivers on-demand therapy to users through natural language processing (NLP). Spring Health leverages machine learning powered by patients’ historical data to drive personalized treatment recommendations. Large technology companies are also beginning to dive into this space: Apple recently partnered with UCLA to develop algorithms that measure symptoms of depression using data collected on Apple devices.
Yet we’ve also seen that AI is far from perfect. There have been notable bumps in the road in other areas of medicine that reveal the limitations of AI and, in particular, of the machine learning models that power its decision-making. For example, Epic, one of the largest EHR software developers in the United States, deployed a sepsis prediction tool across hundreds of hospitals. Researchers found that the tool performed poorly across many of these hospital systems. In another case, a widely used algorithm for referring people to “high-risk care management” programs was less likely to refer black people than white people who were equally sick. As mental health AI products are launched, technologists and clinicians need to learn from past failures of AI tools in order to create more effective interventions and limit potential harms.
Our recent research describes three areas where AI-powered mental health technologies may underperform in practice.
- Understand individuals: First, it may be difficult for AI mental health measurement tools to contextualize the different ways individuals experience mental health changes. For example, some individuals sleep more when they experience a depressive episode, while others sleep less, and AI tools may not be able to understand these differences without additional human interpretation.
- Adapt over time: Second, AI technologies need to adapt to patients’ continued needs as they evolve. For example, during the COVID-19 pandemic, we were forced to adapt to new personal and professional norms. Similarly, AI-driven mental health measurement tools need to adapt to new behavioral routines, and treatment tools need to offer a new suite of options to accommodate users’ changing priorities.
- Collect uniform data: Third, AI tools may work differently across devices due to different data access policies created by device manufacturers. For example, many researchers and companies are developing AI mental health measures using data collected from technologies like smartphones. However, Apple does not allow developers to collect many data types available on Android, and many studies have created and validated AI mental health measures exclusively with Android devices, so those measures may not work the same way on iPhones.
With these focus areas in mind, we studied whether a smartphone-based AI tool could measure mental health across individuals experiencing different mental health symptoms and using different devices. While the tool was fairly accurate, the different symptoms and data types collected across devices limited what our tool could measure compared with tools evaluated on more homogeneous populations. As these systems are deployed across larger and more diverse populations, it will be more difficult to support different users’ needs.
Given these limitations, how do we responsibly develop AI tools that improve mental healthcare? As an overall mindset, technologists should not assume that AI tools will perform well when deployed, but instead continuously work with stakeholders to reevaluate solutions as they underperform or are misaligned with stakeholders’ needs.
For one, we should not assume that technology solutions are always welcomed. History proves this; it’s well established that the introduction of EHRs increased provider burnout, and EHRs are notoriously difficult to use. Similarly, we need to understand how AI mental health technologies may affect different stakeholders within the mental healthcare system.
For example, AI-powered therapy chatbots may be an adequate solution for patients experiencing mild mental health symptoms, but patients experiencing more severe symptoms will require additional support. How do we enable this hand-off from a chatbot to a care provider? As another example, continuous measurement tools may provide a remote and less arduous method to measure patients’ mental health. But who should be allowed to see these measures, and when should they be made available? Clinicians, already overburdened and experiencing data overload, may not have time to review this data outside of the appointment. Simultaneously, patients may feel that data collection and sharing violates their privacy.
Organizations deploying AI mental health technologies need to understand these complexities to be successful. The more technologists work with stakeholders to identify the different ways AI tools interface with and impact people giving and receiving care, the more likely they are to build solutions that improve patient mental health.
Dan Adler is a PhD student at Cornell Tech, where he works in the People-Aware Computing Lab building technology to improve mental health and wellbeing.