This study examines classification parity across sex and finds that mental health disorders in female adolescents are systematically under-diagnosed: the model's accuracy was ~4% lower and its false negative rate ~9% higher for female patients than for male patients. The bias originates in the textual data: notes for male patients tended to be on average 500 words longer and showed distinct word usage. To mitigate this, the authors introduce a de-biasing method that neutralizes biased terms (gendered words and pronouns) and reduces sentences to essential clinical information. After this correction, diagnostic bias is reduced by up to 27%.
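To make the two parity measures concrete, here is a minimal sketch of how accuracy and false negative rate can be compared across groups. The function name `group_metrics`, the group labels, and the toy data are illustrative assumptions; this is not the authors' evaluation code, and the numbers are not the study's results.

```python
from collections import defaultdict


def group_metrics(y_true, y_pred, groups):
    """Per-group accuracy and false negative rate for binary labels (1 = disorder present)."""
    counts = defaultdict(lambda: {"correct": 0, "total": 0, "fn": 0, "pos": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["total"] += 1
        c["correct"] += int(t == p)
        if t == 1:
            # Only truly positive cases can become false negatives.
            c["pos"] += 1
            c["fn"] += int(p == 0)
    return {
        g: {
            "accuracy": c["correct"] / c["total"],
            "fnr": c["fn"] / c["pos"] if c["pos"] else float("nan"),
        }
        for g, c in counts.items()
    }


# Toy data (hypothetical): four male and four female patients.
y_true = [1, 1, 1, 0, 1, 1, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["M"] * 4 + ["F"] * 4

metrics = group_metrics(y_true, y_pred, groups)
accuracy_gap = metrics["M"]["accuracy"] - metrics["F"]["accuracy"]
fnr_gap = metrics["F"]["fnr"] - metrics["M"]["fnr"]
print(metrics, accuracy_gap, fnr_gap)
```

A higher false negative rate for one group means that more of its true cases are missed, which is the under-diagnosis pattern described above.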
These findings show how linguistically transmitted bias, arising from word choice and gendered language, consistently leads to the under-diagnosis of mental health disorders among female adolescents and so undermines the impartiality of medical diagnosis and treatment.
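As a rough illustration of the first de-biasing step, neutralizing gendered words and pronouns, the sketch below applies a small substitution table to a note. The table `NEUTRAL`, the function `neutralize`, and the sample sentence are hypothetical; the authors' actual term list and procedure may differ, and the second step (reducing sentences to essential clinical information) is not shown.

```python
import re

# Hypothetical substitution table; a real term list would be larger and clinically reviewed.
NEUTRAL = {
    "he": "they", "she": "they",
    "him": "them", "his": "their", "her": "their",  # "her" as an object pronoun is not handled
    "himself": "themself", "herself": "themself",
    "boy": "patient", "girl": "patient",
}

# Match whole words only, so "her" inside "mother" is left untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, NEUTRAL)) + r")\b", re.IGNORECASE)


def neutralize(text):
    """Replace gendered terms with neutral ones, preserving initial capitalization."""
    def _swap(match):
        word = match.group(0)
        repl = NEUTRAL[word.lower()]
        return repl.capitalize() if word[0].isupper() else repl
    return _PATTERN.sub(_swap, text)


note = "She said her mother is worried. The girl denies panic attacks."
neutral_note = neutralize(note)
print(neutral_note)
```

A plain substitution like this does not fix verb agreement ("she reports" becomes "they reports") and cannot tell possessive "her" from object "her", so a practical pipeline would need more linguistic handling than this sketch provides.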
Learn more about this study here: https://doi.org/10.48550/arXiv.2501.00129
Reference
Ive, J., Bondaronek, P., Yadav, V., Santel, D., Glauser, T., Cheng, T., Strawn, J.R., Agasthya, G., Tschida, J., Choo, S., Chandrashekar, M., Kapadia, A.J., & Pestian, J.P. (2024). A Data-Centric Approach to Detecting and Mitigating Demographic Bias in Pediatric Mental Health Text: A Case Study in Anxiety Detection. arXiv:2501.00129. https://doi.org/10.48550/arXiv.2501.00129
