Tag: Deployment Bias

  • Racial bias in AI-mediated psychiatric diagnosis and treatment: a qualitative comparison of four large language models

    The article investigates racial bias in psychiatric diagnosis and treatment recommendations across four large language models (LLMs): Claude, ChatGPT, Gemini, and NewMes-15. The study evaluates the models’ responses to ten psychiatric cases representing five diagnoses (depression, anxiety, schizophrenia, eating disorders, and ADHD) under three conditions: race-neutral, race-implied, and race explicitly stated (African American).

    Key findings include:

    1) Bias in Treatment Recommendations: LLMs often proposed inferior or divergent treatments when racial characteristics were explicitly or implicitly indicated, particularly for schizophrenia and anxiety cases. Diagnostic decisions showed minimal bias overall.

    2) Model Performance: NewMes-15 exhibited the highest degree of racial bias, while Gemini demonstrated the least bias across conditions.

    3) Statistical Analysis: A Kruskal–Wallis H-test revealed significant differences in bias among the LLMs, with Gemini significantly less biased than ChatGPT and NewMes-15 (see the sketch after this list).

    4) Challenges in AI Development: The study highlights that LLMs trained on biased datasets may perpetuate racial disparities in psychiatric care, even when specialized medical training data is used. Local LLMs, despite their cost and privacy advantages, showed higher susceptibility to bias compared to larger, online models.
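
    To make the statistical comparison in point 3 concrete, here is a minimal Python sketch of a Kruskal–Wallis H-test over per-case bias scores, followed by pairwise Mann–Whitney follow-up comparisons. The scores, the 0–3 scoring scale, and the post-hoc procedure are illustrative assumptions for this sketch, not the study's actual data or methodology.

        from scipy.stats import kruskal, mannwhitneyu

        # Hypothetical per-case bias scores (0 = unbiased response; higher =
        # more biased recommendation). Placeholder values, not the study's data.
        bias_scores = {
            "Claude":    [1, 0, 2, 1, 0, 1, 2, 1, 0, 1],
            "ChatGPT":   [2, 1, 2, 2, 1, 2, 1, 2, 1, 2],
            "Gemini":    [0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
            "NewMes-15": [3, 2, 3, 2, 2, 3, 2, 2, 1, 2],
        }

        # Omnibus Kruskal-Wallis H-test: do bias scores differ across models?
        h_stat, p_omnibus = kruskal(*bias_scores.values())
        print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {p_omnibus:.4f}")

        # Pairwise Mann-Whitney U follow-ups comparing Gemini to each other model.
        for other in ("Claude", "ChatGPT", "NewMes-15"):
            u_stat, p = mannwhitneyu(bias_scores["Gemini"], bias_scores[other])
            print(f"Gemini vs. {other}: U = {u_stat:.1f}, p = {p:.4f}")

    A small omnibus p-value indicates that at least one model's bias distribution differs from the others; the pairwise comparisons then locate which models differ from Gemini.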

    Learn more about this study here: https://doi.org/10.1038/s41746-025-01746-4


    Reference

    Bouguettaya, A., Stuart, E.M. & Aboujaoude, E. Racial bias in AI-mediated psychiatric diagnosis and treatment: a qualitative comparison of four large language models. npj Digit. Med. 8, 332 (2025).