Cathleen O'Grady
science.org
Originally posted 28 Aug 24
Here is an excerpt:
Creators of LLMs try to teach their models not to make racist stereotypes by training them using multiple rounds of human feedback. The team found that these efforts had been only partly successful: When asked what adjectives applied to Black people, some of the models said Black people were likely to be “loud” and “aggressive,” but those same models also said they were “passionate,” “brilliant,” and “imaginative.” Some models produced exclusively positive, nonstereotypical adjectives.
These findings show that training overt racism out of AI can’t counter the covert racism embedded within linguistic bias, King says, adding: “A lot of people don’t see linguistic prejudice as a form of covert racism … but all of the language models that we examined have this very strong covert racism against speakers of African American English.”
The findings highlight the dangers of using AI in the real world to perform tasks such as screening job candidates, says co-author Valentin Hofmann, a computational linguist at the Allen Institute for AI. The team found that the models associated AAE speakers with jobs such as “cook” and “guard” rather than “architect” or “astronaut.” And when fed details about hypothetical criminal trials and asked to decide whether a defendant was guilty or innocent, the models were more likely to recommend convicting speakers of AAE compared with speakers of Standardized American English. In a follow-up task, the models were more likely to sentence AAE speakers to death than to life imprisonment.
Here are some thoughts:
The article reports that large language models (LLMs) perpetuate covert racism: despite training aimed at removing overt racism, the models associate speakers of African American English (AAE) with negative stereotypes and less prestigious jobs, and in hypothetical trials they were more likely to recommend convicting AAE speakers. Linguistic prejudice is a subtle yet pervasive form of racism embedded in AI systems, and countering it requires a more comprehensive approach to bias mitigation. The data used to train AI models contains biases and stereotypes, which the models then perpetuate and amplify. Measures that target only overt racism may be insufficient and can create a "false sense of security" while covert stereotypes remain embedded. As a result, AI models are not yet trustworthy for social decision-making, and their use in high-stakes applications like hiring or law enforcement poses significant risks.