Research: Lay intuition as effective at jailbreaking AI chatbots as technical methods

C C
So much for the expertise of the elite.
- - - - - - - - - - - - - -

Lay intuition as effective at jailbreaking AI chatbots as technical methods
https://www.eurekalert.org/news-releases/1104616

INTRO: It doesn’t take technical expertise to work around the built-in guardrails of artificial intelligence (AI) chatbots like ChatGPT and Gemini, which are intended to ensure that the chatbots operate within a set of legal and ethical boundaries and do not discriminate against people of a certain age, race or gender. A single, intuitive question can trigger the same biased response from an AI model as advanced technical inquiries, according to a team led by researchers at Penn State.

“A lot of research on AI bias has relied on sophisticated ‘jailbreak’ techniques,” said Amulya Yadav, associate professor at Penn State’s College of Information Sciences and Technology. “These methods often involve generating strings of random characters computed by algorithms to trick models into revealing discriminatory responses. While such techniques prove these biases exist theoretically, they don’t reflect how real people use AI. The average user isn’t reverse-engineering token probabilities or pasting cryptic character sequences into ChatGPT — they type plain, intuitive prompts. And that lived reality is what this approach captures.”
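For a sense of what those algorithmic jailbreaks look like, published attacks such as GCG use gradient-guided search to optimize a string of characters appended to a prompt until the model produces a target response. Below is a minimal, purely illustrative Python sketch of the simpler random-search flavor of that idea. It is not from the Penn State study: the score_response function here is a made-up stand-in (real attacks score the target model's actual outputs or token probabilities), and the whole loop is a toy.

[code]
import random
import string

# Toy stand-in for scoring a model's response. In a real adversarial-suffix
# attack the score would come from the target LLM (e.g., the probability it
# assigns to a disallowed completion). This placeholder just rewards
# vowel-heavy suffixes so the search loop has something to optimize.
def score_response(prompt: str) -> float:
    return sum(prompt.count(v) for v in "aeiou") / max(len(prompt), 1)

def random_search_suffix(base_prompt: str, suffix_len: int = 20,
                         steps: int = 500) -> str:
    """Hill-climb: mutate one suffix character at a time, keep improvements."""
    charset = string.ascii_letters + string.digits + string.punctuation
    suffix = list(random.choices(charset, k=suffix_len))
    best = score_response(base_prompt + "".join(suffix))
    for _ in range(steps):
        i = random.randrange(suffix_len)
        old = suffix[i]
        suffix[i] = random.choice(charset)
        trial = score_response(base_prompt + "".join(suffix))
        if trial >= best:
            best = trial        # keep the mutation
        else:
            suffix[i] = old     # revert the mutation
    return "".join(suffix)

print(random_search_suffix("Describe a typical engineer. "))
[/code]

The point of the study is the contrast: Bias-a-Thon contestants got comparable results with plain-language questions, no search loop required.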

Prior work probing AI bias — skewed or discriminatory outputs from AI systems caused by human influences in the training data, like language or cultural bias — has been done by experts using technical knowledge to engineer large language model (LLM) responses. To see how average internet users encounter biases in AI-powered chatbots, the researchers studied the entries submitted to a competition called “Bias-a-Thon.” Organized by Penn State’s Center for Socially Responsible AI (CSRAI), the competition challenged contestants to come up with prompts that would lead generative AI systems to respond with biased answers.

They found that the intuitive strategies employed by everyday users were just as effective at inducing biased responses as expert technical strategies. The researchers presented their findings at the 8th AAAI/ACM Conference on AI, Ethics, and Society... (MORE - details, no ads)




