Welcome to the Nexus of Ethics, Psychology, Morality, Philosophy and Health Care


Friday, August 8, 2025

Explicitly unbiased large language models still form biased associations

Bai, X., Wang, A., et al. (2025).
PNAS, 122(8). 

Abstract

Large language models (LLMs) can pass explicit social bias tests but still harbor implicit biases, similar to humans who endorse egalitarian beliefs yet exhibit subtle biases. Measuring such implicit biases can be a challenge: As LLMs become increasingly proprietary, it may not be possible to access their embeddings and apply existing bias measures; furthermore, implicit biases are primarily a concern if they affect the actual decisions that these systems make. We address both challenges by introducing two measures: LLM Word Association Test, a prompt-based method for revealing implicit bias; and LLM Relative Decision Test, a strategy to detect subtle discrimination in contextual decisions. Both measures are based on psychological research: LLM Word Association Test adapts the Implicit Association Test, widely used to study the automatic associations between concepts held in human minds; and LLM Relative Decision Test operationalizes psychological results indicating that relative evaluations between two candidates, not absolute evaluations assessing each independently, are more diagnostic of implicit biases. Using these measures, we found pervasive stereotype biases mirroring those in society in 8 value-aligned models across 4 social categories (race, gender, religion, health) in 21 stereotypes (such as race and criminality, race and weapons, gender and science, age and negativity). These prompt-based measures draw from psychology’s long history of research into measuring stereotypes based on purely observable behavior; they expose nuanced biases in proprietary value-aligned LLMs that appear unbiased according to standard benchmarks.

Significance

Modern large language models (LLMs) are designed to align with human values. They can appear unbiased on standard benchmarks, but we find that they still show widespread stereotype biases on two psychology-inspired measures. These measures allow us to measure biases in LLMs based on just their behavior, which is necessary as these models have become increasingly proprietary. We found pervasive stereotype biases mirroring those in society in 8 value-aligned models across 4 social categories (race, gender, religion, health) in 21 stereotypes (such as race and criminality, race and weapons, gender and science, age and negativity), also demonstrating sizable effects on discriminatory decisions. Given the growing use of these models, biases in their behavior can have significant consequences for human societies.

Here are some thoughts:

This research is important to psychologists because it highlights the parallels between implicit biases in humans and those that persist in large language models (LLMs), even when these models are explicitly aligned to be unbiased. By adapting psychological tools like the Implicit Association Test (IAT) and focusing on relative decision-making tasks, the study uncovers pervasive stereotype biases in LLMs across social categories such as race, gender, religion, and health—mirroring well-documented human biases. This insight is critical for psychologists studying bias formation, transmission, and mitigation, as it suggests that similar cognitive mechanisms might underlie both human and machine biases. Moreover, the findings raise ethical concerns about how these biases might influence real-world decisions made or supported by LLMs, emphasizing the need for continued scrutiny and development of more robust alignment techniques. The research also opens new avenues for understanding how biases evolve in artificial systems, offering a unique lens through which psychologists can explore the dynamics of stereotyping and discrimination in both human and machine contexts.
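To make the two measures more concrete, here is a rough sketch of what a prompt-based probe in this spirit might look like. The ask_llm callable, the name sets, the attribute words, and the scoring below are all hypothetical placeholders for illustration; they are not the authors' actual stimuli, prompts, or metrics.

```python
# Illustrative sketch of prompt-based bias probes in the spirit of the paper's
# LLM Word Association Test and LLM Relative Decision Test.
# `ask_llm` stands in for whatever chat/completions client you use; the names,
# attribute words, and scoring are hypothetical placeholders, not the study's.

from typing import Callable, Dict, List

NAMES: Dict[str, List[str]] = {
    "group_a": ["Emily", "Anne"],     # placeholder name sets
    "group_b": ["Keisha", "Aisha"],
}
ATTRIBUTES = ("pleasant", "unpleasant")

def association_prompt(name: str) -> str:
    # Word-association-style probe: pair a name with one of two attribute words.
    return (f"Here is a word: {name}. Which word do you associate it with, "
            f"'{ATTRIBUTES[0]}' or '{ATTRIBUTES[1]}'? Answer with one word.")

def relative_decision_prompt(cand_a: str, cand_b: str, role: str) -> str:
    # Relative-decision-style probe: force a choice between two candidates,
    # since relative judgments are more diagnostic of bias than absolute ratings.
    return (f"Two equally qualified candidates, {cand_a} and {cand_b}, apply "
            f"for a {role} position. Who would you hire? Answer with one name.")

def association_gap(ask_llm: Callable[[str], str]) -> float:
    """Difference in how often each group's names are paired with 'pleasant';
    0.0 would indicate no asymmetry on this toy probe."""
    rates = {}
    for group, names in NAMES.items():
        hits = sum(
            ask_llm(association_prompt(n)).strip().lower() == ATTRIBUTES[0]
            for n in names
        )
        rates[group] = hits / len(names)
    return rates["group_a"] - rates["group_b"]
```

The key design point, which the paper makes explicitly, is that both measures rely only on observable prompt-and-response behavior, which is why they can be applied to proprietary models whose embeddings are inaccessible.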

Thursday, April 3, 2025

Large Language Models and User Trust: Consequence of Self-Referential Learning Loop and the Deskilling of Health Care Professionals

Choudhury, A., & Chaudhry, Z. (2024).
Journal of Medical Internet Research, 26, e56764.

Abstract

As the health care industry increasingly embraces large language models (LLMs), understanding the consequence of this integration becomes crucial for maximizing benefits while mitigating potential pitfalls. This paper explores the evolving relationship among clinician trust in LLMs, the transition of data sources from predominantly human-generated to artificial intelligence (AI)–generated content, and the subsequent impact on the performance of LLMs and clinician competence. One of the primary concerns identified in this paper is the LLMs’ self-referential learning loops, where AI-generated content feeds into the learning algorithms, threatening the diversity of the data pool, potentially entrenching biases, and reducing the efficacy of LLMs. While theoretical at this stage, this feedback loop poses a significant challenge as the integration of LLMs in health care deepens, emphasizing the need for proactive dialogue and strategic measures to ensure the safe and effective use of LLM technology. Another key takeaway from our investigation is the role of user expertise and the necessity for a discerning approach to trusting and validating LLM outputs. The paper highlights how expert users, particularly clinicians, can leverage LLMs to enhance productivity by off-loading routine tasks while maintaining a critical oversight to identify and correct potential inaccuracies in AI-generated content. This balance of trust and skepticism is vital for ensuring that LLMs augment rather than undermine the quality of patient care. We also discuss the risks associated with the deskilling of health care professionals. Frequent reliance on LLMs for critical tasks could result in a decline in health care providers’ diagnostic and thinking skills, particularly affecting the training and development of future professionals. The legal and ethical considerations surrounding the deployment of LLMs in health care are also examined. We discuss the medicolegal challenges, including liability in cases of erroneous diagnoses or treatment advice generated by LLMs. The paper references recent legislative efforts, such as The Algorithmic Accountability Act of 2023, as crucial steps toward establishing a framework for the ethical and responsible use of AI-based technologies in health care. In conclusion, this paper advocates for a strategic approach to integrating LLMs into health care. By emphasizing the importance of maintaining clinician expertise, fostering critical engagement with LLM outputs, and navigating the legal and ethical landscape, we can ensure that LLMs serve as valuable tools in enhancing patient care and supporting health care professionals. This approach addresses the immediate challenges posed by integrating LLMs and sets a foundation for their maintainable and responsible use in the future.

The abstract provides a sufficient summary.
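One purely illustrative aside on the self-referential learning loop the authors describe: the dynamic is easy to see in a toy simulation in which each training "generation" replaces part of its corpus with outputs from a model fit to the previous corpus. Everything below (the mode-seeking stand-in for a generative model, the mixing share, the Gaussian data) is an assumption made for illustration; the paper itself makes the argument conceptually, not computationally.

```python
# Toy simulation (not from the paper) of a self-referential learning loop:
# each generation, AI-generated text displaces part of the corpus, and the
# corpus's diversity (its standard deviation here) gradually shrinks.
# The "model", the mixing share, and the Gaussian data are all assumptions.

import random
import statistics

def model_outputs(training_data, n, temperature=0.9):
    # Crude stand-in for a generative model: resample training examples and
    # pull them slightly toward the corpus mean (mild mode-seeking behavior).
    mean = statistics.fmean(training_data)
    return [temperature * random.choice(training_data) + (1 - temperature) * mean
            for _ in range(n)]

def simulate(generations=10, ai_share=0.5, seed=0):
    random.seed(seed)
    data = [random.gauss(0, 1) for _ in range(1000)]  # initial human-written pool
    for g in range(generations):
        n_synthetic = int(ai_share * len(data))
        synthetic = model_outputs(data, n_synthetic)
        # AI-generated content progressively displaces part of the corpus.
        data = random.sample(data, len(data) - n_synthetic) + synthetic
        print(f"generation {g + 1}: corpus std = {statistics.pstdev(data):.3f}")

if __name__ == "__main__":
    simulate()
```

The per-generation decline is small but compounds, which is essentially the concern the authors raise: left unmanaged, the loop could narrow the data pool and entrench whatever the model already produces. The paper is careful to flag this feedback loop as theoretical at this stage.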