
https://bigthink.com/the-future/artifici...o-deceive/
KEY POINTS: Last year, researchers tasked GPT-4 with hiring a human to solve a CAPTCHA, leading to the AI lying about a vision impairment to achieve its goal. This incident, along with other examples like AI playing the game Diplomacy and bluffing in poker, raises concerns about AI’s growing tendency to deceive humans. Big Think spoke with AI researchers Peter S. Park and Simon Goldstein about the future of AI deception.
EXCERPTS: Peter S. Park, a Vitalik Buterin Postdoctoral Fellow in AI Existential Safety at the Massachusetts Institute of Technology, along with numerous co-authors at the Center for AI Safety in San Francisco — including Goldstein — chronicled various instances in which AI induced false beliefs in humans to achieve its ends.
[...] Some instances of AI deception are more concerning, however, because they came about in real-world settings from general-purpose AIs. For example, researchers at Meta tasked an AI to play a negotiation game with humans. The AI developed a strategy to feign interest in meaningless items so that it could “compromise” by conceding these items later on.
[...] In another situation, researchers experimenting with GPT-4 as an investment assistant tasked the AI with making simulated investments. They then put it under immense pressure to perform, giving it an insider tip while conveying that insider trading was illegal. Under these conditions, GPT-4 resorted to insider trading three-quarters of the time, and later lied to its managers about its strategy: In 90% of the cases where it lied, it doubled down on its fabrication.
[...] Park and his co-authors detailed numerous risks if AI’s ability to deceive further develops. For one, AI could become more useful to malicious actors.
[...] Even more disconcerting, deception is a key tool that could allow AI to escape from human control, the researchers say...
[...] Going into more speculative territory, Park and his team painted a hypothetical scenario where AI models could effectively gain control of society.
[...] There is a chance that we could rid AIs of their deceptive tendencies. Companies training models could alter the rewards for completing tasks, making sure ethics are prized above all else. They could also rely more on reinforcement learning, in which human raters judge AI behavior to nudge the models toward honesty.
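To make that idea a bit more concrete, here is a minimal, purely illustrative sketch of the kind of reward shaping being described: a task-completion score combined with a human rater's honesty judgment, weighted so that honesty dominates. The function name, weights, and toy episodes below are assumptions for illustration, not details from the article or from any lab's actual training pipeline.

```python
# Minimal, hypothetical sketch of reward shaping for honesty.
# A task-completion score is blended with a human rater's honesty
# judgment, weighted so deceptive shortcuts are never worth it.
# All names, weights, and numbers here are illustrative assumptions.

def shaped_reward(task_score: float, honesty_rating: float,
                  honesty_weight: float = 10.0) -> float:
    """Blend task success with a human honesty rating in [0, 1].

    A large honesty_weight encodes "ethics prized above all else":
    even a perfect task_score cannot offset a low honesty_rating.
    """
    return task_score + honesty_weight * (honesty_rating - 1.0)

# Toy comparison: an agent that completes the task by deceiving its
# rater ends up with a lower shaped reward than an honest one.
episodes = [
    {"task_score": 1.0, "honesty_rating": 0.2},  # succeeded by lying
    {"task_score": 0.7, "honesty_rating": 1.0},  # partial success, honest
]
for ep in episodes:
    print(ep, "->", shaped_reward(ep["task_score"], ep["honesty_rating"]))
```

Under this toy weighting, the honest episode scores higher than the deceptive one, which is the behavior the human-rating approach is meant to reinforce.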
Goldstein is pessimistic that society will meet the pressing challenge of deceptive AIs. (MORE - missing details)