Assessing Algorithmic Bias in Language-Based Depression Detection: A Comparison of DNN and LLM Approaches

A study found that large language models (LLMs) outperform traditional deep neural network (DNN) embeddings in automated depression detection and exhibit reduced gender bias, though racial disparities remain. Among the DNN fairness-mitigation techniques evaluated, the worst-group loss provided the best balance between overall accuracy and demographic fairness, while the fairness-regularized loss underperformed.
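To make the "worst-group loss" idea concrete, here is a minimal sketch. It assumes the common group-DRO-style formulation (optimize the maximum per-group average loss) rather than the paper's exact implementation; the function name and example values are illustrative, not taken from the study.

```python
# Illustrative sketch of a worst-group loss (group-DRO-style formulation).
# This is an assumption about the general technique, not the paper's
# exact training objective.

def worst_group_loss(losses, groups):
    """Return the largest mean loss across demographic groups.

    losses: per-sample loss values (floats)
    groups: per-sample group labels (e.g. gender or race codes)
    """
    per_group = {}
    for loss, group in zip(losses, groups):
        per_group.setdefault(group, []).append(loss)
    # Training against this value pushes the model to improve on the
    # group it currently serves worst, rather than the average case.
    return max(sum(v) / len(v) for v in per_group.values())

# Example: group "b" has the higher average loss (0.75 vs. 0.375),
# so it determines the objective.
losses = [0.25, 0.5, 1.0, 0.5]
groups = ["a", "a", "b", "b"]
print(worst_group_loss(losses, groups))  # → 0.75
```

By contrast, a fairness-regularized loss typically adds a penalty term for between-group performance gaps to the average loss, which the study found gave a weaker accuracy/fairness trade-off.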

The identified biases affect the fairness and diagnostic reliability of AI systems for mental health assessment, particularly by disadvantaging underrepresented racial and gender groups; in this study, the disparities chiefly affected Hispanic participants. Such disparities risk perpetuating inequities in automated mental health screening and could undermine trust and validity in clinical or public health applications.

Learn more about the study here: https://doi.org/10.48550/arXiv.2509.25795


Reference

Junias, O., Kini, P., & Chaspari, T. (2025). Assessing Algorithmic Bias in Language-Based Depression Detection: A Comparison of DNN and LLM Approaches. 2025 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), 1-7.