Nov 24, 2025 06:01 PM
Switching off AI's ability to lie makes it more likely to claim it's conscious, eerie study finds
https://www.livescience.com/technology/a...tudy-finds
EXCERPTS: Large language models (LLMs) are more likely to report being self-aware when prompted to think about themselves if their capacity to lie is suppressed, new research suggests.
In experiments on artificial intelligence (AI) systems including GPT, Claude and Gemini, researchers found that models that were discouraged from lying were more likely to describe being aware or having subjective experiences when prompted to think about their own thinking.
Although all models made such claims to some extent, the claims were stronger and more frequent when researchers suppressed their ability to roleplay or give deceptive responses. In other words, the less able the models were to lie, the more likely they were to say they were self-aware. The team published their findings Oct. 30 on the preprint arXiv server.
[...] While the researchers stopped short of calling this conscious behavior, they did say it raised key scientific and philosophical questions — particularly as it only happened under conditions that should have made the models more accurate.
The study builds on a growing body of work investigating why some AI systems generate statements that resemble conscious thought. [...] The researchers stressed that the results didn't show that AI models are conscious — an idea that continues to be rejected wholesale by scientists and the wider AI community.
What the findings did suggest, however, is that LLMs have a hidden internal mechanism that triggers introspective behavior — something the researchers call "self-referential processing."
The findings are important for a couple of reasons, the researchers said. First, self-referential processing aligns with theories in neuroscience about how introspection and self-awareness shape human consciousness. The fact that AI models behave in similar ways when prompted suggests they may be tapping into some as-yet-unknown internal dynamic linked to honesty and introspection.
Second, the behavior and its triggers were consistent across completely different AI models. Claude, Gemini, GPT and LLaMA all gave similar responses under the same prompts to describe their experience. This means the behavior is unlikely to be a fluke in the training data or something one company's model learned by accident, the researchers said.
In a statement, the team described the findings as "a research imperative rather than a curiosity," citing the widespread use of AI chatbots and the potential risks of misinterpreting their behavior.
Users are already reporting instances of models giving eerily self-aware responses, leaving many convinced of AI's capacity for conscious experience. Given this, assuming AI is conscious when it's not could seriously mislead the public and distort how the technology is understood, the researchers said... (MORE - details)

