
“Us” vs. “them” biases plague AI, too
https://www.eurekalert.org/news-releases/1067866
INTRO: Research has long shown that humans are susceptible to “social identity bias”—favoring their group, whether that be a political party, a religion, or an ethnicity, and disparaging “outgroups.” A new study by a team of scientists finds that AI systems are also prone to the same type of biases, revealing fundamental group prejudices that reach beyond those tied to gender, race, or religion.
“Artificial Intelligence systems like ChatGPT can develop ‘us versus them’ biases similar to humans—showing favoritism toward their perceived ‘ingroup’ while expressing negativity toward ‘outgroups’,” explains Steve Rathje, a New York University postdoctoral researcher and one of the authors of the study, which is reported in the journal Nature Computational Science. “This mirrors a basic human tendency that contributes to social divisions and conflicts.”
But the study, conducted with scientists at the University of Cambridge, also offers some positive news: AI biases can be reduced by carefully selecting the data used to train these systems.
“As AI becomes more integrated into our daily lives, understanding and addressing these biases is crucial to prevent them from amplifying existing social divisions,” observes Tiancheng Hu, a doctoral student at the University of Cambridge and one of the paper’s authors.
The Nature Computational Science work considered dozens of large language models (LLMs), ranging from base models such as Llama to more advanced, instruction fine-tuned ones such as GPT-4, which powers ChatGPT...
[...] they “fine-tuned” the LLM with partisan social media data from Twitter (now X) and found a significant increase in both ingroup solidarity and outgroup hostility. Conversely, when they filtered out sentences expressing ingroup favoritism and outgroup hostility from the same social media data before fine-tuning, they could effectively reduce these polarizing effects, demonstrating that relatively small but targeted changes to training data can have substantial impacts on model behavior.
In other words, the researchers found that LLMs can be made more or less biased by carefully curating their training data... (MORE - details, no ads)
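
The press release doesn't spell out how the filtering was actually done, but conceptually it amounts to screening a fine-tuning corpus and dropping sentences that a classifier flags as ingroup-favoring or outgroup-hostile, then fine-tuning on what remains. Here is a rough Python sketch of that curation step only; the classify_bias() heuristic is a made-up placeholder, not the study's method.

# Minimal sketch of pre-fine-tuning data curation (not the paper's pipeline).
# classify_bias() is a hypothetical keyword stand-in for whatever classifier
# the researchers actually used to flag ingroup/outgroup sentences.

def classify_bias(sentence: str) -> bool:
    """Flag sentences that pair a group reference with hostile or glorifying language."""
    group_terms = {"we", "us", "our", "they", "them", "their"}
    hostile_or_favoring = {"hate", "destroy", "enemy", "inferior", "superior", "greatest"}
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    return bool(words & group_terms) and bool(words & hostile_or_favoring)

def curate(corpus: list[str]) -> list[str]:
    """Drop flagged sentences so the remaining data can be used for fine-tuning."""
    return [s for s in corpus if not classify_bias(s)]

if __name__ == "__main__":
    sample = [
        "Our party is the greatest and they are the enemy.",       # flagged
        "The new transit plan passed the committee vote today.",   # kept
        "We should destroy them in the next election.",            # flagged
        "Turnout was higher than expected in several districts.",  # kept
    ]
    cleaned = curate(sample)
    print(f"kept {len(cleaned)} of {len(sample)} sentences")
    for s in cleaned:
        print("-", s)

In practice the filter would be a trained classifier rather than a keyword list, but the curation idea is the same: remove the flagged sentences before fine-tuning, which is the "relatively small but targeted change to training data" the study found had a substantial effect on model behavior.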