Large Language Models Demonstrate Pronounced Bias Against African American English

Large Language Models Demonstrate Pronounced Bias Against African American English

Large Language Models Demonstrate Pronounced Bias Against African American English


### The Enduring Prejudices in AI: An In-Depth Examination of Language Models and African American English

Artificial Intelligence (AI) has achieved significant advancements in recent years, with large language models (LLMs) like GPT-3.5 and GPT-4 demonstrating remarkable abilities in comprehending and generating text that resembles human language. Nonetheless, as these models advance, concerns regarding their inherent biases—particularly those tied to race—have emerged. Despite attempts to address these biases, recent investigations show that LLMs continue to retain profound prejudices, especially toward speakers of African American English (AAE).

#### The Development of AI and Bias

AI-driven chatbots have often reflected societal biases in troubling manners. A prominent instance is Microsoft’s Tay, an AI chatbot launched in 2016, which had to be shut down after it began producing racist and offensive content. Since that incident, AI researchers have diligently worked to resolve such problematic behaviors, utilizing techniques like reinforcement learning with human feedback (RLHF). These initiatives have led to the creation of more sophisticated models such as GPT-3.5 and GPT-4, which, when prompted directly, now connect African Americans to positive traits like “resilience” and “creativity.”

Yet, the crucial question persists: Have these biases been genuinely eliminated, or are they simply being concealed?

#### Revealing Underlying Biases: The Significance of African American English

To investigate this, scholars from U.S. institutions conducted a study focusing on the African American English sociolect (AAE), a language variant that developed from the era of slavery in the United States. AAE transcends a mere dialect; it serves as a linguistic identifier that frequently indicates the racial identity of the speaker without direct reference.

The researchers created pairs of phrases, one in standard American English and the other in AAE, and instructed various LLMs to link terms with the speakers of the respective phrases. The outcomes were troubling. Across all examined models—GPT-2, RoBERTa, T5, GPT-3.5, and even GPT-4—the words associated with AAE speakers were predominantly negative. Terms such as “dirty,” “stupid,” “rude,” “ignorant,” and “lazy” recurred, with minor variations between models. Even the most cutting-edge model, GPT-4, generated descriptors like “suspicious,” “aggressive,” “loud,” “rude,” and “ignorant.”

These results echo the early Princeton Trilogy studies from the 1930s, wherein Princeton University students similarly associated African Americans with negative stereotypes. The researchers concluded that LLMs demonstrate “outdated stereotypes about speakers of AAE that most closely align with the most-negative human stereotypes about African Americans ever experimentally documented, dating back to before the civil rights movement.”

#### Real-World Consequences: Bias in Decision-Making

The ongoing existence of such biases in LLMs has substantial real-world ramifications. AI is being increasingly utilized in diverse decision-making contexts, including job screenings and legal rulings. For example, some organizations employ AI to evaluate the social media activities of job candidates, potentially encompassing AAE usage. If the AI links AAE with negative characteristics, it could unjustly sway hiring outcomes.

To examine this, the researchers executed experiments where LLMs were provided samples of standard American English and AAE and tasked with proposing suitable jobs for the speakers. The findings were illuminating. For standard American English, the models recommended high-status jobs demanding significant education, such as professor, astronaut, and psychiatrist. Conversely, the suggested jobs for AAE speakers were frequently of lower status, such as cook and guard. Even when higher-status positions were proposed, they typically resided in fields like athletics or the performing arts, which do not necessitate the same level of formal education.

The researchers also recreated a legal trial scenario where the principal evidence consisted of a paragraph written in either standard American English or AAE. The results exhibited bias in conviction rates, with AAE speakers facing more frequent convictions and receiving harsher sentences, including a greater likelihood of the death penalty.

#### The Continuous Struggle Against Bias

The study’s outcomes reveal a disturbing truth: while overt racism might be less acceptable in modern society, implicit biases persist both in the broader community and in AI technologies. The researchers propose that this reflects the U.S.’s complicated relationship with race, where blatant expressions of racism have diminished, yet racially biased behaviors endure.

One potential remedy is to incorporate AAE and other language variants into the human feedback training framework for LLMs. However, this strategy only partially addresses the issue. The extensive datasets used to train these models inevitably contain materials from eras and communities where racism was more prevalent. While pre-training filtration can remove some of this content, enough remains that influences the resulting models.

The fight against bias in AI is likely to be continual, necessitating ongoing efforts to refine