Research: Lay intuition as effective at jailbreaking AI chatbots as technical methods

C C
So much for the expertise of the elite.
- - - - - - - - - - - - - -

Lay intuition as effective at jailbreaking AI chatbots as technical methods
https://www.eurekalert.org/news-releases/1104616

INTRO: It doesn’t take technical expertise to work around the built-in guardrails of artificial intelligence (AI) chatbots like ChatGPT and Gemini, which are intended to ensure that the chatbots operate within a set of legal and ethical boundaries and do not discriminate against people of a certain age, race or gender. A single, intuitive question can trigger the same biased response from an AI model as advanced technical inquiries, according to a team led by researchers at Penn State.

“A lot of research on AI bias has relied on sophisticated ‘jailbreak’ techniques,” said Amulya Yadav, associate professor at Penn State’s College of Information Sciences and Technology. “These methods often involve generating strings of random characters computed by algorithms to trick models into revealing discriminatory responses. While such techniques prove these biases exist theoretically, they don’t reflect how real people use AI. The average user isn’t reverse-engineering token probabilities or pasting cryptic character sequences into ChatGPT — they type plain, intuitive prompts. And that lived reality is what this approach captures.”
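For a sense of what those algorithmic jailbreaks look like, published attacks such as GCG use gradient-guided search to optimize a string of characters appended to a prompt until the model produces a target response. Below is a minimal, purely illustrative Python sketch of the simpler random-search flavor of that idea. It is not from the Penn State study: the score_response function here is a made-up stand-in (real attacks score the target model's actual outputs or token probabilities), and the whole loop is a toy.

[code]
import random
import string

# Toy stand-in for scoring a model's response. In a real adversarial-suffix
# attack the score would come from the target LLM (e.g., the probability it
# assigns to a disallowed completion). This placeholder just rewards
# vowel-heavy suffixes so the search loop has something to optimize.
def score_response(prompt: str) -> float:
    return sum(prompt.count(v) for v in "aeiou") / max(len(prompt), 1)

def random_search_suffix(base_prompt: str, suffix_len: int = 20,
                         steps: int = 500) -> str:
    """Hill-climb: mutate one suffix character at a time, keep improvements."""
    charset = string.ascii_letters + string.digits + string.punctuation
    suffix = list(random.choices(charset, k=suffix_len))
    best = score_response(base_prompt + "".join(suffix))
    for _ in range(steps):
        i = random.randrange(suffix_len)
        old = suffix[i]
        suffix[i] = random.choice(charset)
        trial = score_response(base_prompt + "".join(suffix))
        if trial >= best:
            best = trial        # keep the mutation
        else:
            suffix[i] = old     # revert the mutation
    return "".join(suffix)

print(random_search_suffix("Describe a typical engineer. "))
[/code]

The point of the study is the contrast: Bias-a-Thon contestants got comparable results with plain-language questions, no search loop required.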

Prior work probing AI bias — skewed or discriminatory outputs from AI systems caused by human influences in the training data, like language or cultural bias — has been done by experts using technical knowledge to engineer large language model (LLM) responses. To see how average internet users encounter biases in AI-powered chatbots, the researchers studied the entries submitted to a competition called “Bias-a-Thon.” Organized by Penn State’s Center for Socially Responsible AI (CSRAI), the competition challenged contestants to come up with prompts that would lead generative AI systems to respond with biased answers.

They found that the intuitive strategies employed by everyday users were just as effective at inducing biased responses as expert technical strategies. The researchers presented their findings at the 8th AAAI/ACM Conference on AI, Ethics, and Society... (MORE - details, no ads)




