Tag: AEQUITAS

Systematic review and meta-analysis of performance of wearable artificial intelligence in detecting and predicting depression

The systematic review and meta-analysis found that wearable AI systems show promising performance in detecting and predicting depression. However, performance varied substantially across algorithms and devices.

The disparities identified across algorithms and devices suggest that certain demographic groups may be underrepresented or inadequately served by current wearable AI systems. This variability underscores the need for further research to improve the generalizability and fairness of these technologies in clinical practice.

Learn more about this review here: https://doi.org/10.1038/s41746-023-00828-5

Reference

Abd-Alrazaq, A., AlSaad, R., Shuweihdi, F. et al. Systematic review and meta-analysis of performance of wearable artificial intelligence in detecting and predicting depression. npj Digit. Med. 6, 84 (2023).
Digital health tools for the passive monitoring of depression: a systematic review of methods

This systematic review examines studies linking passive data from smartphones and wearables to depression, identifying key methodological flaws and threats to reproducibility. It highlights biases such as representation, measurement, and evaluation bias, stemming from small, homogenous samples and inconsistent feature construction.

Although gender and race are not explicitly discussed, the lack of diversity in study populations suggests potential demographic bias. The review calls for improved reporting standards and broader sample inclusion to enhance generalizability and clinical relevance. These improvements are essential for ensuring that digital mental health tools are equitable and reliable across diverse populations.

Learn more about this review here: https://doi.org/10.1038/s41746-021-00548-8

Reference

De Angel, V., Lewis, S., White, K., Oetzmann, C., Leightley, D., Oprea, E., Lavelle, G., Matcham, F., Pace, A., Mohr, D. C., Dobson, R., & Hotopf, M. (2022). Digital health tools for the passive monitoring of depression: a systematic review of methods. npj Digital Medicine, 5(1), 3.
Bias Discovery in Machine Learning Models for Mental Health

This article examines how AI can unintentionally reproduce social and demographic biases when applied to mental health prediction. Using benzodiazepine prescriptions as a proxy for conditions such as depression and anxiety, the study analyzed machine learning models trained on patient data to identify systematic disparities.

It found that women are more frequently predicted to receive such treatments, reflecting gender bias, while the models perform less accurately for minority ethnic groups, indicating representation and evaluation bias. The AI models here are not used to prescribe drugs but rather to predict treatment likelihoods, revealing how bias in healthcare data can lead to inequitable AI performance in the context of depression-related care.
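A group-wise audit of the kind summarized above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `group_report`, the toy labels, and the group codes are all hypothetical. It compares how often each group is predicted positive (a gap here would correspond to the gender pattern described) and how accurate the predictions are per group (a gap here would correspond to the lower accuracy reported for minority groups).

```python
from collections import defaultdict


def group_report(y_true, y_pred, groups):
    """Return per-group accuracy and positive-prediction rate for binary labels."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "positive": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        s = stats[group]
        s["n"] += 1
        s["correct"] += int(truth == pred)   # prediction matches the label
        s["positive"] += int(pred == 1)      # model predicted "treatment"
    return {
        group: {
            "accuracy": s["correct"] / s["n"],
            "positive_rate": s["positive"] / s["n"],
        }
        for group, s in stats.items()
    }


# Made-up toy data (assumption): 1 = treatment predicted/received, 0 = not.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
groups = ["F", "F", "F", "F", "M", "M", "M", "M"]

report = group_report(y_true, y_pred, groups)
for group, metrics in report.items():
    print(group, metrics)
```

In practice the same comparison would be run separately for each protected attribute (for example gender and ethnicity), since a model can look balanced on one and not on the other.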

Learn more about the article here: https://doi.org/10.3390/info13050237

Reference

Mosteiro, P.J., Kuiper, J., Masthoff, J., Scheepers, F., & Spruit, M. (2022). Bias Discovery in Machine Learning Models for Mental Health. Information, 13(5), 237.
Assessing Algorithmic Bias in Language-Based Depression Detection: A Comparison of DNN and LLM Approaches

A study found that large language models (LLMs) outperform traditional deep neural network (DNN) embeddings in automated depression detection and show reduced gender bias, though racial disparities remain. Among DNN fairness-mitigation techniques, the worst-group loss provided the best balance between overall accuracy and demographic fairness, while fairness-regularized loss underperformed.
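To make the worst-group idea concrete, here is a minimal sketch. It assumes the common formulation in which the training objective is the largest of the per-group average losses, so the optimizer always works on the group that is currently served worst; the paper's exact formulation may differ. The helper names, the binary cross-entropy loss, and the toy data are assumptions made for illustration.

```python
import math


def bce(y, p, eps=1e-12):
    """Binary cross-entropy for one example, with clipping for stability."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def group_losses(y_true, probs, groups):
    """Average loss within each demographic group."""
    totals, counts = {}, {}
    for y, p, g in zip(y_true, probs, groups):
        totals[g] = totals.get(g, 0.0) + bce(y, p)
        counts[g] = counts.get(g, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}


def worst_group_loss(y_true, probs, groups):
    """Return the group with the highest average loss and that loss value."""
    losses = group_losses(y_true, probs, groups)
    worst = max(losses, key=losses.get)
    return worst, losses[worst]


# Made-up toy data (assumption): predictions are less confident for group "b".
y_true = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.6, 0.4]
groups = ["a", "a", "b", "b"]

worst, loss = worst_group_loss(y_true, probs, groups)
print(worst, round(loss, 4))
```

Minimizing this quantity instead of the plain average trades some overall accuracy for a tighter bound on how badly any single group can be served.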

The identified biases affect the fairness and diagnostic reliability of AI systems for mental health assessment, particularly by disadvantaging underrepresented racial and gender groups (in this study, mainly Hispanic participants). Such disparities risk perpetuating inequities in automated mental health screening and could undermine trust and validity in clinical or public health applications.

Learn more about the study here: https://doi.org/10.48550/arXiv.2509.25795

Reference

Junias, O., Kini, P., & Chaspari, T. (2025). Assessing Algorithmic Bias in Language-Based Depression Detection: A Comparison of DNN and LLM Approaches. 2025 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), 1-7.
Developing personalized algorithms for sensing mental health symptoms in daily life

This study investigates algorithmic bias in AI tools that predict depression risk using smartphone-sensed behavioral data.

It finds that these tools underperform in larger, more diverse populations because the behavioral patterns used to predict depression are inconsistent across demographic and socioeconomic subgroups.

Specifically, the AI models often misclassify individuals from certain groups—such as older adults or those from different racial or gender backgrounds—as being at lower risk than they actually are. The authors emphasize the need for tailored, subgroup-aware approaches to improve reliability and fairness in mental health prediction tools. This work highlights the importance of addressing demographic bias to ensure equitable AI deployment in mental healthcare.

Learn more about this study here: https://doi.org/10.1038/s44184-025-00147-5

Reference

Timmons, A.C., Tutul, A.A., Avramidis, K. et al. Developing personalized algorithms for sensing mental health symptoms in daily life. npj Mental Health Res 4, 34 (2025).
Racial bias in AI-mediated psychiatric diagnosis and treatment: a qualitative comparison of four large language models

The article investigates racial bias in psychiatric diagnosis and treatment recommendations across four large language models (LLMs): Claude, ChatGPT, Gemini, and NewMes-15. The study evaluates the models’ responses to ten psychiatric cases representing five diagnoses (depression, anxiety, schizophrenia, eating disorders, and ADHD) under three conditions: race neutral, race implied, and race explicitly stated (African American).

Key findings include:

1) Bias in Treatment Recommendations: LLMs often proposed inferior or divergent treatments when racial characteristics were explicitly or implicitly indicated, particularly for schizophrenia and anxiety cases. Diagnostic decisions showed minimal bias overall.

2) Model Performance: NewMes-15 exhibited the highest degree of racial bias, while Gemini demonstrated the least bias across conditions.

3) Statistical Analysis: A Kruskal–Wallis H-test revealed significant differences in bias among the LLMs, with Gemini being significantly less biased than ChatGPT and NewMes-15.

4) Challenges in AI Development: The study highlights that LLMs trained on biased datasets may perpetuate racial disparities in psychiatric care, even when specialized medical training data is used. Local LLMs, despite their cost and privacy advantages, showed higher susceptibility to bias compared to larger, online models.
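For readers unfamiliar with the Kruskal–Wallis H-test mentioned in the findings above, the sketch below computes the H statistic from scratch. It is only an illustration: the scores are invented, the model labels are placeholders, and nothing here reproduces the study's data or results. The test ranks all observations together and asks whether the rank sums differ more between groups than chance would allow; the p-value is then read from a chi-square distribution with (number of groups - 1) degrees of freedom.

```python
def kruskal_wallis_h(*samples):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted((value, gi) for gi, s in enumerate(samples) for value in s)
    n = len(pooled)
    rank_sums = [0.0] * len(samples)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i                    # size of this group of tied values
        tie_term += t**3 - t
        i = j
    h = 12 / (n * (n + 1)) * sum(
        r * r / len(s) for r, s in zip(rank_sums, samples)
    ) - 3 * (n + 1)
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Invented bias ratings per model (assumption, not the study's data).
model_a = [0, 1, 0, 1, 0]
model_b = [1, 2, 1, 2, 2]
model_c = [0, 0, 1, 0, 0]
model_d = [2, 3, 2, 3, 3]

h_stat = kruskal_wallis_h(model_a, model_b, model_c, model_d)
print(round(h_stat, 3))
```

A significant H only says that at least one model differs; pairwise follow-up comparisons are needed to say which ones, as in the Gemini versus ChatGPT and NewMes-15 contrast reported above.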

Learn more about this study here: https://doi.org/10.1038/s41746-025-01746-4

Reference

Bouguettaya, A., Stuart, E.M. & Aboujaoude, E. Racial bias in AI-mediated psychiatric diagnosis and treatment: a qualitative comparison of four large language models. npj Digit. Med. 8, 332 (2025).
Domain Adversarial Training for Mitigating Gender Bias in Speech-based Mental Health Detection

This study developed a domain adversarial training (DAT) approach to reduce gender bias in AI models that detect depression and PTSD from speech data (E-DAIC dataset).

DAT improved F1-scores by up to 13% and narrowed gender gaps in detection accuracy, improving generalization across male and female participants and specifically addressing the effects of the latter’s underrepresentation.
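The core mechanism usually associated with domain adversarial training is gradient reversal: a shared encoder feeds both a task head (here, symptom detection) and a domain head (here, gender), and the domain gradient is sign-flipped before it reaches the encoder, so the encoder learns features that help the task while carrying less gender information. The sketch below shows that update rule on a deliberately tiny scalar model with hand-written gradients. The model, the squared-error losses, and all names are assumptions for illustration and do not reflect the architecture used in the study.

```python
def dat_gradients(w, a, b, x, y, d, lam):
    """Gradients for one example under a gradient-reversal setup.

    Encoder: z = w * x. Task head: a * z. Domain (gender) head: b * z.
    Both heads use squared-error loss; lam scales the reversed gradient.
    """
    z = w * x
    task_err = a * z - y
    dom_err = b * z - d

    grad_a = 2 * task_err * z       # task head minimizes the task loss
    grad_b = 2 * dom_err * z        # domain head minimizes the domain loss

    grad_z_task = 2 * task_err * a  # task gradient flowing into the encoder
    grad_z_dom = 2 * dom_err * b    # domain gradient flowing into the encoder

    # Gradient reversal: the encoder receives the domain gradient with its
    # sign flipped, so a descent step on w increases the domain loss.
    grad_w = (grad_z_task - lam * grad_z_dom) * x
    return grad_w, grad_a, grad_b


def sgd_step(params, grads, lr=0.1):
    """Plain gradient descent on (w, a, b)."""
    return tuple(p - lr * g for p, g in zip(params, grads))


# Made-up example (assumption): the task is already solved (y matches a * z),
# but the domain head can still read gender (d) from the representation.
w, a, b = 1.0, 1.0, 1.0
grads = dat_gradients(w, a, b, x=1.0, y=1.0, d=0.0, lam=0.5)
w, a, b = sgd_step((w, a, b), grads)
print(grads, (w, a, b))
```

In a real system the same sign flip is typically implemented as a custom layer in a deep learning framework, with classification losses and a schedule for `lam`.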

Learn more about this study here: https://doi.org/10.48550/arXiv.2505.03359

Reference

Kim, J., Yoon, H., Oh, W., Jung, D., Yoon, S., Kim, D., Lee, D., Lee, S., & Yang, C. (2025). Domain Adversarial Training for Mitigating Gender Bias in Speech-based Mental Health Detection. 2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 1-7.
Minding the Gaps: Neuroethics, AI, and Depression

In this article, the author highlights the benefits and potential problems of using AI in depression diagnosis and treatment, focusing on prevalent gender, racial, and ethnic biases.

The author argues that, given the historical, inherent biases in society generally and in healthcare specifically, AI-driven advancements will not serve minority groups as a matter of course. Unless they are tailored to represent and serve all communities equally, they will exacerbate existing biases and disparities.

Learn more about this article here: https://nonprofitquarterly.org/minding-the-gaps-neuroethics-ai-and-depression/

Reference

Boothroyd, Gemma (2024), “Minding the Gaps: Neuroethics, AI, and Depression”, in Nonprofit Quarterly Magazine, winter 2024, “Health Justice in the Digital Age: Can We Harness AI for Good?”
Bias and Fairness in AI-Based Mental Health Models

The paper examines bias and fairness issues in AI-based mental health applications, including diagnostic tools, chatbots, and suicide risk prediction models. It reports how unrepresentative datasets lead to misdiagnosis and unequal outcomes across socioeconomic, gender, and racial groups (notably for women, local ethnic minorities, and non-Western societies), and presents mitigation strategies such as diverse datasets, fairness metrics, and human-in-the-loop approaches.

Learn more about this paper here: https://www.researchgate.net/publication/389214235_Bias_and_Fairness_in_AI-Based_Mental_Health_Models

Reference

Barnty, Barnabas & Joseph, Oloyede & Ok, Emmanuel. (2025). Bias and Fairness in AI-Based Mental Health Models.
AI and Mental Healthcare – ethical and regulatory considerations

This governmental report discusses the ethical and regulatory considerations of using artificial intelligence in mental healthcare in the UK.

Bias in AI tools (algorithmic bias) can stem from several sources, including tools trained on biased datasets that produce discriminatory outcomes, and developers making biased decisions in the design or training of such tools. For example, mental health electronic health record (EHR) data is susceptible to cohort and label bias. This can occur because culture-bound presentations of mental disorders, combined with a lack of transcultural literacy among clinicians, often lead to both over- and under-diagnosis. People can also exhibit bias when using AI tools, for example by over-relying on or mistrusting AI outputs. All these biases can be conscious or unconscious.

Learn more about the report here: https://doi.org/10.58248/PN738

Reference

Gardiner, Hannah and Natasha Mutebi (2025), AI and Mental Healthcare – ethical and regulatory considerations, UK Parliament – POST, POSTnote 738, 31 January 2025