Volume 39 Number 6 | December 2025
Summary
Artificial intelligence has advanced healthcare, but biased data and flawed algorithms can create inequities in diagnosis and treatment. Studies show AI tools can reinforce racial, gender, and testing-rate disparities, especially in laboratory medicine. This article urges careful development, monitoring, and transparent use of AI to ensure accuracy, fairness, and equitable patient care.
Deborah Blecker-Shelly MS, MLS(ASCP)SMCM, DLMCM, Patient Safety and Diagnostic Stewardship Committee Member

The Jetsons era forecast the future of medical advances: video medical appointments (check), heart tele-monitors (check), and “pill/capsule” video endoscopy (check). The Merriam-Webster Dictionary defines artificial intelligence (AI) as “the capability of computer systems and/or algorithms to imitate intelligent human behavior.” For about 10 years now, AI has been a part of our daily lives: think Alexa, Google, ChatGPT, facial recognition, and more.
In medicine, the use of AI, computer programming, and other data-driven technologies has led to great advances in healthcare. These tools can establish algorithms that refine predictions, standardize processes, and guide clinical decisions, all with the goal of providing better patient care and improving health outcomes.
However, there is a significant risk: AI is a powerful tool that must be developed with great care and requires appropriate management of limitations, parameters, and labels in order to be useful, equitable, and unbiased. Bias can enter at every phase of the AI model life cycle; common examples include implicit bias, selection bias, sampling bias, validation bias, and evaluation bias.
Let’s take a look at a few examples where the use of AI led to bias and inequity in diagnostic medicine.
Racial Bias in Medical AI
Obermeyer et al. describe evidence of racial bias in a widely used risk-prediction algorithm: black patients assigned the same level of clinical risk by the algorithm were sicker than white patients. This racial bias reduced the number of black patients flagged for extra health benefits by 50 percent. Because less money was spent on black patients with the same level of need, the algorithm falsely concluded that black patients were healthier than equally sick white patients. The bias arises because the algorithm uses health costs as a proxy for health needs; reformulating the algorithm so that cost is no longer the proxy for need largely removes the racial bias in predicting who needs extra care.
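For readers who build or evaluate such models, a minimal synthetic sketch in Python (hypothetical numbers, not data from the Obermeyer et al. study) shows how training a risk score on historical cost rather than on a measure of health need can reproduce the effect described above:

```python
# Toy illustration of "cost as a proxy for need" (hypothetical values only).
# Two patients have identical underlying health needs, but historically less
# money was spent on the second patient, so a cost-based score ranks them lower.

def risk_score_from_cost(historical_cost, max_cost=10_000):
    """Toy 'risk' score: historical spending scaled to a 0-1 range."""
    return historical_cost / max_cost

patients = [
    # (id, group, chronic_conditions, historical_cost_in_dollars)
    ("patient_1", "group A", 4, 8_000),  # same level of need...
    ("patient_2", "group B", 4, 5_000),  # ...but less was spent on this patient
]

for pid, group, conditions, cost in patients:
    cost_based = risk_score_from_cost(cost)
    # A need-based label (e.g., number of active chronic conditions) would
    # score these two patients identically.
    need_based = conditions / 10
    print(f"{pid} ({group}): cost-based risk = {cost_based:.2f}, "
          f"need-based risk = {need_based:.2f}")
```

In this toy example the two patients have identical need, yet the cost-based score ranks the patient with lower historical spending as lower risk, which is exactly the label-choice problem the study identifies.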
A study published in PLOS Global Public Health shows that emergency department testing rates are 4.5 percent higher for white patients than for black patients of the same age and sex, presenting with the same medical complaints and the same emergency department triage score. The researchers expressed concern that AI models built from these data to guide clinician decision-making could reinforce pre-existing testing biases and result in substandard care for black patients. For example, clinicians might assume that black patients are less likely to become ill, when in fact they are less likely to be tested or admitted to the hospital, which is not necessarily the same thing. Under-testing of patient subgroups can lead to their misrepresentation in AI models during development.
Additional Bias in Laboratory AI
Bias also arises in AI built on clinical laboratory data, where differences in testing rates are a widespread source of bias for healthcare AI models. Many AI models developed to predict clinical outcomes, such as sepsis, depend on laboratory test results, and untested patients are often assumed to have normal results. This assumption is typically operationalized by assigning untested patients a negative label during model training. If laboratory testing rates differ across races, however, AI models may perform disparately across racial subgroups. Such models could inappropriately underestimate risk for patients in racial groups less likely to receive laboratory tests, potentially amplifying inequities in clinical care.
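A minimal sketch (hypothetical records and field names, not drawn from any cited study) of how this “untested means normal” assumption is often operationalized when building training labels, and why unequal testing rates skew those labels:

```python
# Hypothetical emergency-department records:
# (patient_id, group, was_tested, abnormal_result)
records = [
    ("p1", "group A", True,  True),
    ("p2", "group A", True,  False),
    ("p3", "group B", False, None),   # never tested: result truly unknown
    ("p4", "group B", True,  True),
    ("p5", "group B", False, None),   # never tested: result truly unknown
]

def training_label(was_tested, abnormal_result):
    """Common shortcut: patients without a result are labeled negative (0)."""
    if not was_tested:
        return 0                      # the assumption discussed above
    return 1 if abnormal_result else 0

labels = {pid: training_label(tested, abnormal)
          for pid, group, tested, abnormal in records}
print(labels)  # {'p1': 1, 'p2': 0, 'p3': 0, 'p4': 1, 'p5': 0}
```

Because group B is tested less often in this toy data, more of its genuinely unknown cases are silently converted into “negative” training labels, which is the mechanism by which such a model can underestimate risk for the less-tested group.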
In a 2022 study from the United Kingdom, researchers examined state-of-the-art AI approaches used by hospitals worldwide and found a 70 percent success rate in predicting liver disease from blood tests; however, they uncovered a wide gender gap, with 44 percent of cases in women missed compared with 23 percent of cases in men. This is one of the first published reports on how bias affects AI-generated diagnosis from blood tests.
The bottom line is that healthcare organizations using AI to predict illness and guide treatment must ensure ample monitoring to eliminate bias and achieve equitable performance. Organizations must be selective when choosing AI tools, weighing the disclosed operating characteristics, biases, and recommended uses, and must exercise caution when applying these tools across the organization’s entire patient population.
In the evolving landscape of healthcare delivery, one increasingly influenced by AI technology, recognizing and mitigating bias is a priority. Addressing bias is essential not only for achieving accuracy and reliability in AI innovations but also for upholding the ethical standards of healthcare and ensuring a future where care delivery is fair and equitable.
In conclusion, The Jetsons, a highly recommended series available on YouTube, depicted a world we never thought would come to life, yet it has, with artificial intelligence and other futuristic developments. In the words of George Jetson, “Jane! Stop this crazy thing!”
References
- “Artificial intelligence.” Merriam-Webster.com Dictionary, Merriam-Webster, https://www.merriam-webster.com/dictionary/artificial%20intelligence. Accessed 30 Sep. 2025.
- https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2025.1492736/full
- International Journal of Life Sciences, Biotechnology and Pharma Research. 2024;13(8). doi: 10.69605/ijlbpr_13.8.2024.66.
- Chang T, Nuppnau M, He Y, Kocher KE, Valley TS, Sjoding MW, Wiens J. Racial differences in laboratory testing as a potential mechanism for bias in AI: A matched cohort analysis in emergency department visits. PLOS Glob Public Health. 2024 Oct 30; 4(10):e0003555. doi: 10.1371/journal.pgph.0003555. PMID: 39475953; PMCID: PMC11524489.
- Alsulimani A, Akhter N, Jameela F, Ashgar RI, Jawed A, Hassani MA, Dar SA. The Impact of Artificial Intelligence on Microbial Diagnosis. Microorganisms. 2024 May 23;12(6):1051. doi: 10.3390/microorganisms12061051. PMID: 38930432; PMCID: PMC11205376.
Deborah Blecker-Shelly is the Laboratory Manager of Microbiology and Molecular Diagnostics at Capital Health in Pennington, New Jersey.