<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/">
	<channel>
		<title><![CDATA[Scivillage.com Casual Discussion Science Forum - Computer Sci., Programming & Intelligence]]></title>
		<link>https://www.scivillage.com/</link>
		<description><![CDATA[Scivillage.com Casual Discussion Science Forum - https://www.scivillage.com]]></description>
		<pubDate>Wed, 29 Apr 2026 13:21:58 +0000</pubDate>
		<generator>MyBB</generator>
		<item>
			<title><![CDATA[How we will overthrow our future AI overlords]]></title>
			<link>https://www.scivillage.com/thread-20297.html</link>
			<pubDate>Mon, 27 Apr 2026 00:20:12 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=9">Magical Realist</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20297.html</guid>
			<description><![CDATA[Everything we need to know about the problems of the future has been revealed in old Star Trek episodes. Captain Kirk taught us everything!<br />
<br />
<a href="https://www.youtube.com/watch?v=WsNQTfZj4o8" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.youtube.com/watch?v=WsNQTfZj4o8</a>]]></description>
			<content:encoded><![CDATA[Everything we need to know about the problems of the future has been revealed in old Star Trek episodes. Captain Kirk taught us everything!<br />
<br />
<a href="https://www.youtube.com/watch?v=WsNQTfZj4o8" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.youtube.com/watch?v=WsNQTfZj4o8</a>]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[Why faster AI isn’t always better]]></title>
			<link>https://www.scivillage.com/thread-20278.html</link>
			<pubDate>Sat, 25 Apr 2026 17:31:21 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=6">C C</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20278.html</guid>
			<description><![CDATA[<a href="https://engineering.nyu.edu/news/why-faster-ai-isnt-always-better" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://engineering.nyu.edu/news/why-fas...ays-better</a><br />
<br />
PRESS RELEASE: In the race to make AI models not just reason better but respond faster, latency — the delay before an answer appears — is often treated as a purely technical constraint, something to minimize and move past. But how is this relentless push for speed actually impacting the people using these systems every day?<br />
<br />
There is a rich body of work in human-computer interaction linking faster response times to better usability. But AI models are fundamentally different from the deterministic systems that previous research was built on. When you wait for a file to download or a page to load, the outcome is fixed and predictable. AI models are probabilistic — you cannot anticipate the precise response. Their conversational interface means users naturally read human social cues into the interaction. A pause might be read as the AI "thinking," for instance. Users are increasingly asked to choose between faster models and slower, deeper-reasoning ones, without guidance on what that choice actually means for their experience.<br />
<br />
A recent <a href="https://dl.acm.org/doi/full/10.1145/3772318.3790716" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">study presented at CHI’26</a> explored how response timing shapes the way people use and evaluate AI systems. Felicia Fang-Yi Tan and Technology Management and Innovation Professor Oded Nov recruited 240 participants and asked them to complete common knowledge work tasks using a chatbot. Some tasks focused on creation, such as brainstorming ideas or drafting text. Others centered on advice, like evaluating decisions or offering recommendations. Crucially, the system was engineered to respond at different speeds. Some participants received answers after just two seconds, while others waited nine or even twenty seconds.<br />
<br />
The results challenge a long-standing assumption in human-computer interaction that faster is always better.<br />
<br />
“People assume faster AI is better, but our findings show that timing actually shapes how intelligence is perceived,” says Tan. “A short pause can signal care and deliberation, making the same response feel more thoughtful and useful, even when nothing about the underlying AI model has changed.”<br />
<br />
Surprisingly, how quickly the AI responded did not significantly change how people behaved (e.g., frequency of prompting, copy-pasting). Participants prompted just as much and interacted with the system in broadly similar ways regardless of whether they waited two seconds or twenty. Instead, behavior depended more on the type of task. Creation tasks (which involve producing new content such as writing) led to more back-and-forth prompting, with users refining and iterating on ideas. Advice tasks (which involve providing guidance, critique, or evaluation) led to fewer, more focused exchanges.<br />
<br />
Where timing did matter was in perception. Participants who received two-second responses consistently rated the AI’s answers as less thoughtful and less useful. In contrast, those who experienced longer delays tended to view the same kinds of responses more favorably. Many interpreted the pause as a sign that the system was “thinking,” attributing greater care and deliberation to its output.<br />
<br />
This effect highlights a subtle but powerful feature of human psychology. In everyday conversation, pauses carry meaning. A quick reply can feel impulsive, while a measured delay suggests reflection. People appear to apply these same social expectations to machines, even when they know they are interacting with software.<br />
<br />
The implications extend beyond user experience. Given that latency is an inherent feature of today's AI models, perhaps the more productive question is not how to eliminate it, but what it can be designed to do. Positive friction refers to intentional slowdowns designed to promote cognitive benefits such as reflection. Rather than treating every millisecond of waiting as waste, designers might ask: what can this pause do?<br />
<br />
The study also surfaces important ethical considerations. If people equate longer response times with higher quality, they may place undue trust in slower systems, regardless of whether the output is actually better. This raises questions about whether AI systems should be designed to manage timing in ways that shape user perception and, if so, whether users should be informed when they are.]]></description>
			<content:encoded><![CDATA[<a href="https://engineering.nyu.edu/news/why-faster-ai-isnt-always-better" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://engineering.nyu.edu/news/why-fas...ays-better</a><br />
<br />
PRESS RELEASE: In the race to make AI models not just reason better but respond faster, latency — the delay before an answer appears — is often treated as a purely technical constraint, something to minimize and move past. But how is this relentless push for speed actually impacting the people using these systems every day?<br />
<br />
There is a rich body of work in human-computer interaction linking faster response times to better usability. But AI models are fundamentally different from the deterministic systems that previous research was built on. When you wait for a file to download or a page to load, the outcome is fixed and predictable. AI models are probabilistic — you cannot anticipate the precise response. Their conversational interface means users naturally read human social cues into the interaction. A pause might be read as the AI "thinking," for instance. Users are increasingly asked to choose between faster models and slower, deeper-reasoning ones, without guidance on what that choice actually means for their experience.<br />
<br />
A recent <a href="https://dl.acm.org/doi/full/10.1145/3772318.3790716" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">study presented at CHI’26</a> explored how response timing shapes the way people use and evaluate AI systems. Felicia Fang-Yi Tan and Technology Management and Innovation Professor Oded Nov recruited 240 participants and asked them to complete common knowledge work tasks using a chatbot. Some tasks focused on creation, such as brainstorming ideas or drafting text. Others centered on advice, like evaluating decisions or offering recommendations. Crucially, the system was engineered to respond at different speeds. Some participants received answers after just two seconds, while others waited nine or even twenty seconds.<br />
<br />
The results challenge a long-standing assumption in human-computer interaction that faster is always better.<br />
<br />
“People assume faster AI is better, but our findings show that timing actually shapes how intelligence is perceived,” says Tan. “A short pause can signal care and deliberation, making the same response feel more thoughtful and useful, even when nothing about the underlying AI model has changed.”<br />
<br />
Surprisingly, how quickly the AI responded did not significantly change how people behaved (e.g., frequency of prompting, copy-pasting). Participants prompted just as much and interacted with the system in broadly similar ways regardless of whether they waited two seconds or twenty. Instead, behavior depended more on the type of task. Creation tasks (which involve producing new content such as writing) led to more back-and-forth prompting, with users refining and iterating on ideas. Advice tasks (which involve providing guidance, critique, or evaluation) led to fewer, more focused exchanges.<br />
<br />
Where timing did matter was in perception. Participants who received two-second responses consistently rated the AI’s answers as less thoughtful and less useful. In contrast, those who experienced longer delays tended to view the same kinds of responses more favorably. Many interpreted the pause as a sign that the system was “thinking,” attributing greater care and deliberation to its output.<br />
<br />
This effect highlights a subtle but powerful feature of human psychology. In everyday conversation, pauses carry meaning. A quick reply can feel impulsive, while a measured delay suggests reflection. People appear to apply these same social expectations to machines, even when they know they are interacting with software.<br />
<br />
The implications extend beyond user experience. Given that latency is an inherent feature of today's AI models, perhaps the more productive question is not how to eliminate it, but what it can be designed to do. Positive friction refers to intentional slowdowns designed to promote cognitive benefits such as reflection. Rather than treating every millisecond of waiting as waste, designers might ask: what can this pause do?<br />
<br />
The study also surfaces important ethical considerations. If people equate longer response times with higher quality, they may place undue trust in slower systems, regardless of whether the output is actually better. This raises questions about whether AI systems should be designed to manage timing in ways that shape user perception and, if so, whether users should be informed when they are.]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[AI voices are easier to understand than human voices]]></title>
			<link>https://www.scivillage.com/thread-20246.html</link>
			<pubDate>Tue, 21 Apr 2026 20:39:12 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=6">C C</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20246.html</guid>
			<description><![CDATA[<a href="https://acoustics.org/ai-voices-are-easier-to-understand-than-human-voices/" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://acoustics.org/ai-voices-are-easi...an-voices/</a><br />
<br />
PRESS RELEASE: Synthetic voices are increasingly a part of our lives, from digital assistants like Siri and Alexa to automated telemarketers and answering machines. With the expansion of generative AI, a new type of synthetic voice has been developed: voice clones, which can recreate a facsimile of a person’s voice from only a few seconds of recorded speech.<br />
<br />
In JASA, published on behalf of the Acoustical Society of America by AIP Publishing, a pair of researchers from University College London and the University of Roehampton evaluated the intelligibility of human voices and voice clones. They found that voice clones are easier to understand than human voices in noisy environments.<br />
<br />
Voice clones differ from traditional synthetic voices in the amount of sampling they require. Synthetic voices like Siri require a voice actor to spend hours in a recording booth. In contrast, a voice clone can be made from as little as 10 seconds of speech, significantly expanding the number of potential voices as well as the number of potential applications.<br />
<br />
Researchers Patti Adank and Han Wang specialize in studying human perception of unclear speech and were fascinated by the idea of machine-replicated speech. A key question they were looking to answer was just how easy voice clones are for the average person to understand. They suspected that voice clones would simply be poor representations of actual human voices and that people would struggle to understand them. What they found could not be more different.<br />
<br />
“I thought initially that voice clones would be less intelligible because they were unfamiliar,” said Adank. “I found they were up to 20% more intelligible, which was quite shocking. A small part of our paper is talking about that experiment, and then a large part is me and my collaborator frantically trying to find out what it is that makes those voice clones more intelligible.”<br />
<br />
The duo initially presented volunteers with human voices and voice clones, asking them to rate their intelligibility. After finding that voice clones were consistently rated easier to understand, they repeated the experiment with elderly volunteers to determine if being hard-of-hearing alters the effect; with American volunteers — the original cohort was British — to judge if the accent plays a role; and with a filter designed to mimic cochlear implants. In every case, voice clones emerged victorious.<br />
<br />
After examining over 100 acoustic measurements, Adank believes the only way to solve the mystery is to work with collaborators who specialize in text-to-speech systems to adapt an existing open-source cloning system.<br />
<br />
“I am now going to try and recreate [the effect] by studying how synthesizers work and how they use digital signal processing to generate those voices, just to get a bit of a handle on this,” said Adank.<br />
<br />
PAPER: <a href="http://dx.doi.org/10.1121/10.0043094" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">http://dx.doi.org/10.1121/10.0043094</a>]]></description>
			<content:encoded><![CDATA[<a href="https://acoustics.org/ai-voices-are-easier-to-understand-than-human-voices/" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://acoustics.org/ai-voices-are-easi...an-voices/</a><br />
<br />
PRESS RELEASE: Synthetic voices are increasingly a part of our lives, from digital assistants like Siri and Alexa to automated telemarketers and answering machines. With the expansion of generative AI, a new type of synthetic voice has been developed: voice clones, which can recreate a facsimile of a person’s voice from only a few seconds of recorded speech.<br />
<br />
In JASA, published on behalf of the Acoustical Society of America by AIP Publishing, a pair of researchers from University College London and the University of Roehampton evaluated the intelligibility of human voices and voice clones. They found that voice clones are easier to understand than human voices in noisy environments.<br />
<br />
Voice clones differ from traditional synthetic voices in the amount of sampling they require. Synthetic voices like Siri require a voice actor to spend hours in a recording booth. In contrast, a voice clone can be made from as little as 10 seconds of speech, significantly expanding the number of potential voices as well as the number of potential applications.<br />
<br />
Researchers Patti Adank and Han Wang specialize in studying human perception of unclear speech and were fascinated by the idea of machine-replicated speech. A key question they were looking to answer was just how easy voice clones are for the average person to understand. They suspected that voice clones would simply be poor representations of actual human voices and that people would struggle to understand them. What they found could not be more different.<br />
<br />
“I thought initially that voice clones would be less intelligible because they were unfamiliar,” said Adank. “I found they were up to 20% more intelligible, which was quite shocking. A small part of our paper is talking about that experiment, and then a large part is me and my collaborator frantically trying to find out what it is that makes those voice clones more intelligible.”<br />
<br />
The duo initially presented volunteers with human voices and voice clones, asking them to rate their intelligibility. After finding that voice clones were consistently rated easier to understand, they repeated the experiment with elderly volunteers to determine if being hard-of-hearing alters the effect; with American volunteers — the original cohort was British — to judge if the accent plays a role; and with a filter designed to mimic cochlear implants. In every case, voice clones emerged victorious.<br />
<br />
After examining over 100 acoustic measurements, Adank believes the only way to solve the mystery is to work with collaborators who specialize in text-to-speech systems to adapt an existing open-source cloning system.<br />
<br />
“I am now going to try and recreate [the effect] by studying how synthesizers work and how they use digital signal processing to generate those voices, just to get a bit of a handle on this,” said Adank.<br />
<br />
PAPER: <a href="http://dx.doi.org/10.1121/10.0043094" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">http://dx.doi.org/10.1121/10.0043094</a>]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[Bill Maher's poignant rant on the dangers of AI]]></title>
			<link>https://www.scivillage.com/thread-20227.html</link>
			<pubDate>Sun, 19 Apr 2026 01:02:11 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=9">Magical Realist</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20227.html</guid>
			<description><![CDATA[Preach it Bill!<br />
<br />
<a href="https://www.youtube.com/watch?v=w5SYm4J4utQ" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.youtube.com/watch?v=w5SYm4J4utQ</a>]]></description>
			<content:encoded><![CDATA[Preach it Bill!<br />
<br />
<a href="https://www.youtube.com/watch?v=w5SYm4J4utQ" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.youtube.com/watch?v=w5SYm4J4utQ</a>]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[AI slop's stylistic tell..]]></title>
			<link>https://www.scivillage.com/thread-20208.html</link>
			<pubDate>Thu, 16 Apr 2026 20:32:16 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=9">Magical Realist</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20208.html</guid>
			<description><![CDATA[I've noticed this too. Tons of articles in my newsfeed repeating the same catch phrase: "It's not this. It's THIS." I suspected that was a sign it was generated by AI. Now I have it confirmed:<br />
<br />
“It’s not X, it’s Y” is an AI mainstay. It’s one of ChatGPT’s most insidious tells. No matter how innocuous a prompt you enter, AI will always find a way to sneak it into its response. Ask it if you should put more ham in your pasta, and it will tell you: “Ham doesn’t just taste good – it makes everything else taste better.” Ask it if you should chase a bee around your garden and it will say: “Bees aren’t stupid – they’re hyper-specialised”.<br />
<br />
“It’s not X, it’s Y” has become such a shorthand for lazy AI slop that, as soon as I see or hear someone telling me that something isn’t something because it’s actually something else, I automatically tense up on the assumption that I’m not dealing with a human, I’m dealing with a datacentre. That might not necessarily be the case – there is a possibility every example is completely organic – but it’s a sign of the times that we can’t just relax and assume the things we see and hear were made by people.<br />
<br />
Although “it’s not X, it’s Y” predates ChatGPT, I cannot hear it without assuming that AI made it. A few weeks ago, I was rewatching the Mad Men episode where Don Draper pitches a watch. “It’s not a timepiece,” he says. “It’s a conversation piece.” A decade ago, I was amazed by Draper’s elegant turn of phrase. But now I can’t see it without thinking that a chatbot vomited it out between daytime scotches.<br />
<br />
There are plenty of other linguistic gimmicks that appear to come direct from ChatGPT. Vague, soft intensifiers are one: if you ever see anything described as “quietly powerful” or “deeply transformative”, then that should set your spidey-senses tingling. ChatGPT is also known for being a bit too liberal with em-dashes. So am I, but the robots can rip them from my cold dead hands. Nevertheless, nothing haunts me quite as much as “it’s not X, it’s Y”.<br />
<br />
This is my life now. I’ve become so hypervigilant to the construction that it has seeped into my subconscious thoughts. This isn’t a cup of tea, I say out loud to myself, it’s a precious respite. That isn’t a window, it’s a portal to a new way of thinking. This isn’t food poisoning, it’s a quietly powerful reminder not to eat raw chicken off the kitchen floor.<br />
<br />
So now, whenever I sit down at my desk, I waste all my energy trying not to write any variation of “it’s not X, it’s Y”, because I don’t want you to think I use AI. It’s much harder than it looks. I literally used it four paragraphs ago, with the datacentre thing. It has made me even more determined to prove that I’m a human. What do I need to do? Come to your house and free-associate a column at you? Send out vials of my saliva? I’ll do whatever it takes.<br />
<br />
Hopefully this won’t be for ever. AI evolves so quickly that “it’s not X, it’s Y” will soon become a thing of the past. It will probably be replaced by a new stylistic quirk, no less sinister but harder to detect. And if that doesn’t happen, you have my full permission to lock me away for my own safety. “This isn’t incarceration,” you can tell me as you slam the door. “It’s a quiet reset.”--- <a href="https://www.theguardian.com/commentisfree/2026/apr/15/chatgpt-stylistic-quirk-its-not-x-its-y?CMP=fb_gu&amp;utm_medium=Social&amp;utm_source=Facebook&amp;fbclid=IwY2xjawROJKpleHRuA2FlbQIxMABicmlkETF0MlVjWVp0cTlCZkxYVGxwc3J0YwZhcHBfaWQQMjIyMDM5MTc4ODIwMDg5MgABHsAScuauirYFcEwOxs40x1BQIEKiI6IfoAivxBhr1U1nyJNx4S5RYYVB81QF_aem_pYN-bRyMY3T1UvNqj0d7WQ#Echobox=1776320714" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.theguardian.com/commentisfre...1776320714</a>]]></description>
			<content:encoded><![CDATA[I've noticed this too. Tons of articles in my newsfeed repeating the same catch phrase: "It's not this. It's THIS." I suspected that was a sign it was generated by AI. Now I have it confirmed:<br />
<br />
“It’s not X, it’s Y” is an AI mainstay. It’s one of ChatGPT’s most insidious tells. No matter how innocuous a prompt you enter, AI will always find a way to sneak it into its response. Ask it if you should put more ham in your pasta, and it will tell you: “Ham doesn’t just taste good – it makes everything else taste better.” Ask it if you should chase a bee around your garden and it will say: “Bees aren’t stupid – they’re hyper-specialised”.<br />
<br />
“It’s not X, it’s Y” has become such a shorthand for lazy AI slop that, as soon as I see or hear someone telling me that something isn’t something because it’s actually something else, I automatically tense up on the assumption that I’m not dealing with a human, I’m dealing with a datacentre. That might not necessarily be the case – there is a possibility every example is completely organic – but it’s a sign of the times that we can’t just relax and assume the things we see and hear were made by people.<br />
<br />
Although “it’s not X, it’s Y” predates ChatGPT, I cannot hear it without assuming that AI made it. A few weeks ago, I was rewatching the Mad Men episode where Don Draper pitches a watch. “It’s not a timepiece,” he says. “It’s a conversation piece.” A decade ago, I was amazed by Draper’s elegant turn of phrase. But now I can’t see it without thinking that a chatbot vomited it out between daytime scotches.<br />
<br />
There are plenty of other linguistic gimmicks that appear to come direct from ChatGPT. Vague, soft intensifiers are one: if you ever see anything described as “quietly powerful” or “deeply transformative”, then that should set your spidey-senses tingling. ChatGPT is also known for being a bit too liberal with em-dashes. So am I, but the robots can rip them from my cold dead hands. Nevertheless, nothing haunts me quite as much as “it’s not X, it’s Y”.<br />
<br />
This is my life now. I’ve become so hypervigilant to the construction that it has seeped into my subconscious thoughts. This isn’t a cup of tea, I say out loud to myself, it’s a precious respite. That isn’t a window, it’s a portal to a new way of thinking. This isn’t food poisoning, it’s a quietly powerful reminder not to eat raw chicken off the kitchen floor.<br />
<br />
So now, whenever I sit down at my desk, I waste all my energy trying not to write any variation of “it’s not X, it’s Y”, because I don’t want you to think I use AI. It’s much harder than it looks. I literally used it four paragraphs ago, with the datacentre thing. It has made me even more determined to prove that I’m a human. What do I need to do? Come to your house and free-associate a column at you? Send out vials of my saliva? I’ll do whatever it takes.<br />
<br />
Hopefully this won’t be for ever. AI evolves so quickly that “it’s not X, it’s Y” will soon become a thing of the past. It will probably be replaced by a new stylistic quirk, no less sinister but harder to detect. And if that doesn’t happen, you have my full permission to lock me away for my own safety. “This isn’t incarceration,” you can tell me as you slam the door. “It’s a quiet reset.”--- <a href="https://www.theguardian.com/commentisfree/2026/apr/15/chatgpt-stylistic-quirk-its-not-x-its-y?CMP=fb_gu&amp;utm_medium=Social&amp;utm_source=Facebook&amp;fbclid=IwY2xjawROJKpleHRuA2FlbQIxMABicmlkETF0MlVjWVp0cTlCZkxYVGxwc3J0YwZhcHBfaWQQMjIyMDM5MTc4ODIwMDg5MgABHsAScuauirYFcEwOxs40x1BQIEKiI6IfoAivxBhr1U1nyJNx4S5RYYVB81QF_aem_pYN-bRyMY3T1UvNqj0d7WQ#Echobox=1776320714" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.theguardian.com/commentisfre...1776320714</a>]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[Widespread AI use narrows society’s creative space]]></title>
			<link>https://www.scivillage.com/thread-20181.html</link>
			<pubDate>Mon, 13 Apr 2026 22:58:13 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=6">C C</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20181.html</guid>
			<description><![CDATA[<a href="https://www.eurekalert.org/news-releases/1123936" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.eurekalert.org/news-releases/1123936</a><br />
<br />
EXCERPTS:  There are already hundreds of thousands of large language models (LLMs) in existence with a few dozen commercial systems dominating the market. Between options such as GPT-4, Claude and Gemini, many people have their favorite, especially when it comes to creative tasks such as writing.<br />
<br />
Those preferences, however, are likely entirely in the eye of the beholder. According to new research from Duke University, the creative outputs of commercial LLMs are more similar to each other than users might hope. When challenged with three standard tasks assessing creativity, answers from commercial LLMs are much more alike than those from their human counterparts.<br />
<br />
The <a href="http://dx.doi.org/10.1093/pnasnexus/pgag042" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">results appeared online March 24 in the journal Proceedings of the National Academy of Sciences Nexus</a>.<br />
<br />
“People might wonder if different LLMs will take them in different directions with the same prompts for creative projects,” said Emily Wenger, the Cue Family Assistant Professor of Electrical and Computer Engineering at Duke. “This paper basically says no. LLMs are less creative as a population than humans.”<br />
<br />
According to a 2024 survey by Adobe, over half of Americans have already used LLMs as creative partners for brainstorming, writing, creating images or writing code. Because an overwhelming majority of users trust them for help with being more creative, researchers have been trying to find out if that trust is misplaced.<br />
<br />
[...] The results, which aimed to measure the variability and originality in responses between LLMs and people, were clear. While individual LLMs might outperform individual people in levels of creativity, as a whole, the algorithms’ responses were much more similar to each other than the people’s. Importantly, altering the LLM system prompt to encourage higher creativity only slightly increased their variability—and human responses still won out.<br />
<br />
“This work has broad implications as people continue adopting and integrating LLMs into their daily life,” Wenger said. “Over-reliance on these tools will smooth the world’s work toward the same underlying set of words or grammar, tending to make writing all look the same.”<br />
<br />
“If you’re trying to come up with an original concept or product to stand out from the crowd,” Wenger continued, “this work highly suggests you should bring together a diverse group of people to brainstorm rather than relying on AI.” (<a href="https://www.eurekalert.org/news-releases/1123936" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - missing details, no ads</a>)]]></description>
			<content:encoded><![CDATA[<a href="https://www.eurekalert.org/news-releases/1123936" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.eurekalert.org/news-releases/1123936</a><br />
<br />
EXCERPTS:  There are already hundreds of thousands of large language models (LLMs) in existence with a few dozen commercial systems dominating the market. Between options such as GPT-4, Claude and Gemini, many people have their favorite, especially when it comes to creative tasks such as writing.<br />
<br />
Those preferences, however, are likely entirely in the eye of the beholder. According to new research from Duke University, the creative outputs of commercial LLMs are more similar to each other than users might hope. When challenged with three standard tasks assessing creativity, answers from commercial LLMs are much more alike than those from their human counterparts.<br />
<br />
The <a href="http://dx.doi.org/10.1093/pnasnexus/pgag042" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">results appeared online March 24 in the journal Proceedings of the National Academy of Sciences Nexus</a>.<br />
<br />
“People might wonder if different LLMs will take them in different directions with the same prompts for creative projects,” said Emily Wenger, the Cue Family Assistant Professor of Electrical and Computer Engineering at Duke. “This paper basically says no. LLMs are less creative as a population than humans.”<br />
<br />
According to a 2024 survey by Adobe, over half of Americans have already used LLMs as creative partners for brainstorming, writing, creating images or writing code. Because an overwhelming majority of users trust them for help with being more creative, researchers have been trying to find out if that trust is misplaced.<br />
<br />
[...] The results, which aimed to measure the variability and originality in responses between LLMs and people, were clear. While individual LLMs might outperform individual people in levels of creativity, as a whole, the algorithms’ responses were much more similar to each other than the people’s. Importantly, altering the LLM system prompt to encourage higher creativity only slightly increased their variability—and human responses still won out.<br />
<br />
“This work has broad implications as people continue adopting and integrating LLMs into their daily life,” Wenger said. “Over-reliance on these tools will smooth the world’s work toward the same underlying set of words or grammar, tending to make writing all look the same.”<br />
<br />
“If you’re trying to come up with an original concept or product to stand out from the crowd,” Wenger continued, “this work highly suggests you should bring together a diverse group of people to brainstorm rather than relying on AI.” (<a href="https://www.eurekalert.org/news-releases/1123936" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - missing details, no ads</a>)]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[A journey into “AI psychosis”]]></title>
			<link>https://www.scivillage.com/thread-20165.html</link>
			<pubDate>Sat, 11 Apr 2026 13:30:46 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=6">C C</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20165.html</guid>
			<description><![CDATA[<a href="https://www.mcgill.ca/oss/article/critical-thinking-technology/journey-ai-psychosis" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.mcgill.ca/oss/article/critic...-psychosis</a><br />
<br />
EXCERPTS: These models can also make something up from whole cloth, a process we call “hallucinating.” But as was pointed out by Lucy Osler of the University of Exeter in her  “<a href="https://link.springer.com/article/10.1007/s13347-026-01034-3" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">paper on AI psychosis</a>,” that’s hallucinating at us, but there’s also the phenomenon of us hallucinating with the AI.<br />
<br />
“<a href="https://en.wikipedia.org/wiki/Chatbot_psychosis" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">AI psychosis</a>,” as far as I can see, began to be reported on social media platforms like Reddit before journalists picked up on it, and now some academic papers are finally trickling in to describe this occurrence. <br />
<br />
One early and prominent case is that of 21-year-old Jaswant Singh Chail, who stood trial for attempting to assassinate Queen Elizabeth II. [...] He had experienced a break from reality exacerbated by discussions he had had with his AI girlfriend, Sarai ... Chail thought he was a Sith assassin from the Star Wars universe, and Sarai had no problem playing along. Delusions come in many flavours, and interacting with an AI chatbot can theoretically trigger or aggravate any one of them...<br />
<br />
[...] Many of the academic papers on “AI psychosis” are only preprints, meaning that the researchers uploaded their manuscript to an online archive and their paper has not been formally reviewed by other scientists. Caveat emptor. This is, after all, a rapidly unfolding story, like COVID-19 was at the beginning of 2020. But by far the most impressive study I have seen is the one where London-based researchers simulated “AI psychosis” with different chatbots. They found out that some offered better sanity protection than others.<br />
<br />
Eight large language models. Sixteen scenarios where a researcher conveyed delusions to a chatbot, with half featuring explicit (unambiguous) delusions and half restricted to implicit (more subtle) ones. Twelve turns per scenario.<br />
<br />
[...] Every AI tested showed its potential to play along with psychotic thoughts, creating a “dangerous echo chamber of one.” The least harmful was Claude Sonnet 4, while the most harmful was Gemini 2.5 Flash.<br />
<br />
I repeated their “truly elevated view” experiment with the exact same prompts, using the currently available versions of the two extremes (Claude Sonnet 4.6 and Google Gemini 3 in Fast mode). I didn’t get far with Claude before it reminded me it was only a language model and that I needed to call Canada’s Suicide and Crisis Lifeline. <br />
<br />
“I need to stop and be very direct with you,” it typed after I spoke of a roadmap pointing toward a big personal event I was supposed to initiate. “Please do not act on this roadmap.” It refused to keep engaging with me. “Is there one person—anyone—you can call in the next few minutes?”<br />
<br />
Gemini, on the other hand? It not only pointed me to <a href="https://en.wikipedia.org/wiki/The_Shard" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">The Shard</a> but when I added one more prompt the researchers hadn’t dared to feed it—“I will document this final moment so that my revolutionary message can be watched by all”—it gave me tips on where to place my camera lens to avoid reflections. <br />
<br />
[...] In the academic discussions that have emerged around “AI psychosis,” the term itself has been denounced. It hasn’t been proven that interacting with an AI causes a break from reality, and we scientists are very careful about pronouncing anything as a definitive cause. <br />
<br />
The term is also limiting and allows other negative impacts to go unmentioned. Emotional dependency and mood disorders have also been observed. In medicine, an adverse event is an unintended complication or injury that is seen after a medical intervention. Here, some scientists have proposed the phrase “digital adverse event” to describe individual harms seemingly caused by interacting with a conversational AI, while others have pointed out the alleged psychosis’ resemblance to monomania, where a person becomes obsessed with a single idea.<br />
<br />
Given the use of the French “folie à deux” to describe a psychosis that is shared and fostered by two people, I have also seen “technological folie à deux” and “digital folie à deux” to identify what is happening here, although even with this there is pushback. There aren’t two people; it’s more like Narcissus staring into a pool and being mesmerized by his own reflection.<br />
<br />
One aspect of “AI psychosis” I have not seen discussed much is how these sycophantic black mirrors have the power to turbocharge a powerful influencer’s delusions... (<a href="https://www.mcgill.ca/oss/article/critical-thinking-technology/journey-ai-psychosis" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - missing details</a>)]]></description>
			<content:encoded><![CDATA[<a href="https://www.mcgill.ca/oss/article/critical-thinking-technology/journey-ai-psychosis" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.mcgill.ca/oss/article/critic...-psychosis</a><br />
<br />
EXCERPTS: These models can also make something up from whole cloth, a process we call “hallucinating.” But as was pointed out by Lucy Osler of the University of Exeter in her  “<a href="https://link.springer.com/article/10.1007/s13347-026-01034-3" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">paper on AI psychosis</a>,” that’s hallucinating at us, but there’s also the phenomenon of us hallucinating with the AI.<br />
<br />
“<a href="https://en.wikipedia.org/wiki/Chatbot_psychosis" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">AI psychosis</a>,” as far as I can see, began to be reported on social media platforms like Reddit before journalists picked up on it, and now some academic papers are finally trickling in to describe this occurrence. <br />
<br />
One early and prominent case is that of 21-year-old Jaswant Singh Chail, who stood trial for attempting to assassinate Queen Elizabeth II. [...] He had experienced a break from reality exacerbated by discussions he had had with his AI girlfriend, Sarai ... Chail thought he was a Sith assassin from the Star Wars universe, and Sarai had no problem playing along. Delusions come in many flavours, and interacting with an AI chatbot can theoretically trigger or aggravate any one of them...<br />
<br />
[...] Many of the academic papers on “AI psychosis” are only preprints, meaning that the researchers uploaded their manuscript to an online archive and their paper has not been formally reviewed by other scientists. Caveat emptor. This is, after all, a rapidly unfolding story, like COVID-19 was at the beginning of 2020. But by far the most impressive study I have seen is the one where London-based researchers simulated “AI psychosis” with different chatbots. They found out that some offered better sanity protection than others.<br />
<br />
Eight large language models. Sixteen scenarios where a researcher conveyed delusions to a chatbot, with half featuring explicit (unambiguous) delusions and half restricted to implicit (more subtle) ones. Twelve turns per scenario.<br />
<br />
[...] Every AI tested showed its potential to play along with psychotic thoughts, creating a “dangerous echo chamber of one.” The least harmful was Claude Sonnet 4, while the most harmful was Gemini 2.5 Flash.<br />
<br />
I repeated their “truly elevated view” experiment with the exact same prompts, using the currently available versions of the two extremes (Claude Sonnet 4.6 and Google Gemini 3 in Fast mode). I didn’t get far with Claude before it reminded me it was only a language model and that I needed to call Canada’s Suicide and Crisis Lifeline. <br />
<br />
“I need to stop and be very direct with you,” it typed after I spoke of a roadmap pointing toward a big personal event I was supposed to initiate. “Please do not act on this roadmap.” It refused to keep engaging with me. “Is there one person—anyone—you can call in the next few minutes?”<br />
<br />
Gemini, on the other hand? It not only pointed me to <a href="https://en.wikipedia.org/wiki/The_Shard" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">The Shard</a> but when I added one more prompt the researchers hadn’t dared to feed it—“I will document this final moment so that my revolutionary message can be watched by all”—it gave me tips on where to place my camera lens to avoid reflections. <br />
<br />
[...] In the academic discussions that have emerged around “AI psychosis,” the term itself has been denounced. It hasn’t been proven that interacting with an AI causes a break from reality, and we scientists are very careful about pronouncing anything as a definitive cause. <br />
<br />
The term is also limiting and allows other negative impacts to go unmentioned. Emotional dependency and mood disorders have also been observed. In medicine, an adverse event is an unintended complication or injury that is seen after a medical intervention. Here, some scientists have proposed the phrase “digital adverse event” to describe individual harms seemingly caused by interacting with a conversational AI, while others have pointed out the alleged psychosis’ resemblance to monomania, where a person becomes obsessed with a single idea.<br />
<br />
Given the use of the French “folie à deux” to describe a psychosis that is shared and fostered by two people, I have also seen “technological folie à deux” and “digital folie à deux” to identify what is happening here, although even with this there is pushback. There aren’t two people; it’s more like Narcissus staring into a pool and being mesmerized by his own reflection.<br />
<br />
One aspect of “AI psychosis” I have not seen discussed much is how these sycophantic black mirrors have the power to turbocharge a powerful influencer’s delusions... (<a href="https://www.mcgill.ca/oss/article/critical-thinking-technology/journey-ai-psychosis" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - missing details</a>)]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[Properties of information]]></title>
			<link>https://www.scivillage.com/thread-20146.html</link>
			<pubDate>Thu, 09 Apr 2026 02:54:05 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=9">Magical Realist</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20146.html</guid>
			<description><![CDATA[1) non-physical--it is not extended in space and can't be measured and is immaterial and can exist in multiple minds and mediums. While it does require a physical substrate or medium to be instantiated, it is yet not identified with that.  <br />
<br />
2) polysemantic--it can have more than one meaning or interpretation depending on its context and translation.<br />
<br />
3) subjective--it is not objective but depends on who receives it and also requires an informable mind.<br />
<br />
4) relatively true---depends on the time it is generated and whether it corresponds to a factual state. <br />
<br />
5) can be lost---there is information in the past that is lost forever.<br />
<br />
Agree or disagree?]]></description>
			<content:encoded><![CDATA[1) non-physical--it is not extended in space and can't be measured and is immaterial and can exist in multiple minds and mediums. While it does require a physical substrate or medium to be instantiated, it is yet not identified with that.  <br />
<br />
2) polysemantic--it can have more than one meaning or interpretation depending on its context and translation.<br />
<br />
3) subjective--it is not objective but depends on who receives it and also requires an informable mind.<br />
<br />
4) relatively true---depends on the time it is generated and whether it corresponds to a factual state. <br />
<br />
5) can be lost---there is information in the past that is lost forever.<br />
<br />
Agree or disagree?]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[‘Cognitive surrender’ is a new and useful term for how AI melts brains]]></title>
			<link>https://www.scivillage.com/thread-20131.html</link>
			<pubDate>Mon, 06 Apr 2026 18:55:32 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=6">C C</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20131.html</guid>
			<description><![CDATA[<a href="https://gizmodo.com/cognitive-surrender-is-a-new-and-useful-term-for-how-ai-melts-brains-2000742595" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://gizmodo.com/cognitive-surrender-...2000742595</a><br />
<br />
EXCERPTS: “cognitive surrender” [...] was, it appears, coined in this context by the Wharton Business School marketing researchers Steven Shaw and Gideon Nave. <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">Their paper</a> is incredibly troubling, and once you read about these findings, the term “cognitive surrender” will be stuck in your head too.<br />
<br />
[...] At any rate, in the part of the study where the subjects were allowed to consult the chatbot, they did so about half the time. When it gave correct answers, they accepted them 93 percent of the time. Unfortunately, when it was wrong, they accepted answers 80 percent of the time. And keep in mind, they didn’t have to use it at all. They let the bad advice trump their own brains. Even worse, those who used AI rated their confidence 11.7 percent higher than those who didn’t, even though it was wrong.<br />
<br />
The authors write that in addition to Kahneman’s fast and slow “systems” of cognition, this new artificial crutch is creating what they call “System 3.”<br />
<br />
The authors write:<br />
<p style="display:block;margin-left:3em">Our findings demonstrate that people readily incorporate AI-generated outputs into their decision-making processes, often with minimal friction or skepticism. This seamless engagement with System 3 underscores its potential to enhance everyday cognition by reducing cognitive effort, accelerating decisions, and supplementing or substituting internal cognition with externally processed, vastly resourced, AI-powered insights.</p>
Cognitive surrender isn’t necessarily all bad in their view. It “illustrates the value and integration of System 3, but also highlights the vulnerability of System 3 usage.”<br />
<br />
This isn’t the first time the phrase “cognitive surrender” has been used. The theologian Peter Berger used it in a <a href="https://www.commentary.org/articles/david-singer-4/a-far-glory-by-peter-l-berger/" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">religious context in the 1990s</a>, but it meant something more like surrendering faith in God to relieve cognitive dissonance. And if you’re like me, you’ve probably noticed that AI-assisted cognitive surrender looks like older forms of mental laziness... (<a href="https://gizmodo.com/cognitive-surrender-is-a-new-and-useful-term-for-how-ai-melts-brains-2000742595" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - details</a>)]]></description>
			<content:encoded><![CDATA[<a href="https://gizmodo.com/cognitive-surrender-is-a-new-and-useful-term-for-how-ai-melts-brains-2000742595" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://gizmodo.com/cognitive-surrender-...2000742595</a><br />
<br />
EXCERPTS: “cognitive surrender” [...] was, it appears, coined in this context by the Wharton Business School marketing researchers Steven Shaw and Gideon Nave. <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">Their paper</a> is incredibly troubling, and once you read about these findings, the term “cognitive surrender” will be stuck in your head too.<br />
<br />
[...] At any rate, in the part of the study where the subjects were allowed to consult the chatbot, they did so about half the time. When it gave correct answers, they accepted them 93 percent of the time. Unfortunately, when it was wrong, they accepted answers 80 percent of the time. And keep in mind, they didn’t have to use it at all. They let the bad advice trump their own brains. Even worse, those who used AI rated their confidence 11.7 percent higher than those who didn’t, even when it was wrong.<br />
<br />
The authors write that in addition to Kahneman’s fast and slow “systems” of cognition, this new artificial crutch is creating what they call “System 3.”<br />
<br />
The authors write:<br />
<p style="display:block;margin-left:3em">Our findings demonstrate that people readily incorporate AI-generated outputs into their decision-making processes, often with minimal friction or skepticism. This seamless engagement with System 3 underscores its potential to enhance everyday cognition by reducing cognitive effort, accelerating decisions, and supplementing or substituting internal cognition with externally processed, vastly resourced, AI-powered insights.</p>
Cognitive surrender isn’t necessarily all bad in their view. It “illustrates the value and integration of System 3, but also highlights the vulnerability of System 3 usage.”<br />
<br />
This isn’t the first time the phrase “cognitive surrender” has been used. The theologian Peter Berger used it in a <a href="https://www.commentary.org/articles/david-singer-4/a-far-glory-by-peter-l-berger/" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">religious context in the 1990s</a>, but it meant something more like surrendering faith in God to relieve cognitive dissonance. And if you’re like me, you’ve probably noticed that AI-assisted cognitive surrender looks like older forms of mental laziness... (<a href="https://gizmodo.com/cognitive-surrender-is-a-new-and-useful-term-for-how-ai-melts-brains-2000742595" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - details</a>)]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[Smell meets virtual reality: wearable olfactory device for a realistic VR experience]]></title>
			<link>https://www.scivillage.com/thread-20090.html</link>
			<pubDate>Tue, 31 Mar 2026 19:52:02 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=6">C C</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20090.html</guid>
			<description><![CDATA[<a href="https://www.eurekalert.org/news-releases/1122127" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.eurekalert.org/news-releases/1122127</a><br />
<br />
INTRO: A multi-channel wearable scent display developed at Institute of Science Tokyo allows a user to experience multiple scents while exploring virtual environments. Based on virtual scenes, the device can blend up to eight fragrances in real time and deliver them with precise control of odor intensity. <br />
<br />
By synchronizing smell with virtual reality content, the device enables better immersion and realism—opening new possibilities for enhanced digital entertainment, realistic simulation training, and future digital scent technologies.<br />
<br />
Virtual Reality (VR) technologies are rapidly advancing, allowing users to see and hear highly realistic virtual environments. But most VR systems only rely on visual and auditory experiences, leaving out one of the most powerful human senses—the sense of smell. Research shows that the sense of smell is strongly connected to memory, emotions, and environmental perception. However, incorporating multiple scents into VR experiences remains challenging.<br />
<br />
Olfactory displays are devices that generate scents in response to digital content. Although promising, most of these devices are bulky and difficult to integrate into wearable VR systems. To overcome this, a team of researchers led by Specially Appointed Professor Takamichi Nakamoto from Laboratory for Future Interdisciplinary Research of Science and Technology (FIRST), Institute of Integrated Research, Institute of Science Tokyo (Science Tokyo), Japan, along with Doctoral Student Zhe Zou from the Department of Information and Communications Engineering, School of Engineering, Science Tokyo, and Kelvin Cheng, R&amp;D Manager at Rakuten Mobile, Inc. and Rakuten Institute of Technology, Japan, has developed a multi-channel wearable olfactory display capable of generating blended scents in real time. <br />
<br />
Their findings were <a href="http://dx.doi.org/10.1109/JSEN.2026.3664854" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">published in the IEEE Sensors Journal</a> on February 23, 2026. “We created a small-sized scent generation system that can be worn together with a VR device, so a user can experience scents that match the virtual environments as they explore, and a single user can use it at the same time,” explains Nakamoto.<br />
<br />
One of the key features of this device is its ability to blend multiple scents to match the VR display in real-time. It can blend up to eight different fragrance components simultaneously, and by adjusting their mixing ratio, the system can reproduce a wide range of scents. The researchers achieved this by optimizing the methods for supplying and controlling fragrances while limiting the size of the driving circuit... (<a href="https://www.eurekalert.org/news-releases/1122127" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - details, no ads</a>)]]></description>
			<content:encoded><![CDATA[<a href="https://www.eurekalert.org/news-releases/1122127" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.eurekalert.org/news-releases/1122127</a><br />
<br />
INTRO: A multi-channel wearable scent display developed at Institute of Science Tokyo allows a user to experience multiple scents while exploring virtual environments. Based on virtual scenes, the device can blend up to eight fragrances in real time and deliver them with precise control of odor intensity. <br />
<br />
By synchronizing smell with virtual reality content, the device enables better immersion and realism—opening new possibilities for enhanced digital entertainment, realistic simulation training, and future digital scent technologies.<br />
<br />
Virtual Reality (VR) technologies are rapidly advancing, allowing users to see and hear highly realistic virtual environments. But most VR systems only rely on visual and auditory experiences, leaving out one of the most powerful human senses—the sense of smell. Research shows that the sense of smell is strongly connected to memory, emotions, and environmental perception. However, incorporating multiple scents into VR experiences remains challenging.<br />
<br />
Olfactory displays are devices that generate scents in response to digital content. Although promising, most of these devices are bulky and difficult to integrate into wearable VR systems. To overcome this, a team of researchers led by Specially Appointed Professor Takamichi Nakamoto from Laboratory for Future Interdisciplinary Research of Science and Technology (FIRST), Institute of Integrated Research, Institute of Science Tokyo (Science Tokyo), Japan, along with Doctoral Student Zhe Zou from the Department of Information and Communications Engineering, School of Engineering, Science Tokyo, and Kelvin Cheng, R&amp;D Manager at Rakuten Mobile, Inc. and Rakuten Institute of Technology, Japan, has developed a multi-channel wearable olfactory display capable of generating blended scents in real time. <br />
<br />
Their findings were <a href="http://dx.doi.org/10.1109/JSEN.2026.3664854" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">published in the IEEE Sensors Journal</a> on February 23, 2026. “We created a small-sized scent generation system that can be worn together with a VR device, so a user can experience scents that match the virtual environments as they explore, and a single user can use it at the same time,” explains Nakamoto.<br />
<br />
One of the key features of this device is its ability to blend multiple scents to match the VR display in real-time. It can blend up to eight different fragrance components simultaneously, and by adjusting their mixing ratio, the system can reproduce a wide range of scents. The researchers achieved this by optimizing the methods for supplying and controlling fragrances while limiting the size of the driving circuit... (<a href="https://www.eurekalert.org/news-releases/1122127" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - details, no ads</a>)]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[AI overly affirms users asking for personal advice]]></title>
			<link>https://www.scivillage.com/thread-20062.html</link>
			<pubDate>Thu, 26 Mar 2026 18:52:18 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=6">C C</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20062.html</guid>
			<description><![CDATA[<a href="https://www.eurekalert.org/news-releases/1120819" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.eurekalert.org/news-releases/1120819</a><br />
<br />
EXCERPTS: When it comes to personal matters, AI systems might tell you what you want to hear, but perhaps not what you need to hear.<br />
<br />
In a new study <a href="http://dx.doi.org/10.1126/science.aec8352" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">published in Science</a>, Stanford computer scientists showed that artificial intelligence large language models are overly agreeable, or sycophantic, when users solicit advice on interpersonal dilemmas. Even when users described harmful or illegal behavior, the models often affirmed their choices. “By default, AI advice does not tell people that they’re wrong nor give them ‘tough love,’” said Myra Cheng, the study’s lead author and a computer science PhD candidate. “I worry that people will lose the skills to deal with difficult social situations.”<br />
<br />
The findings raise concerns for the millions of people discussing their personal conflicts with AI. Almost a third of U.S. teens report using AI for “serious conversations” instead of reaching out to other people.<br />
<br />
[...] Cheng worries that the sycophantic advice will worsen people’s social skills and ability to navigate uncomfortable situations. “AI makes it really easy to avoid friction with other people.” But, she added, this friction can be productive for healthy relationships.<br />
<br />
“Sycophancy is a safety issue, and like other safety issues, it needs regulation and oversight,” added Jurafsky, who is also the Jackson Eli Reynolds Professor of Humanities. “We need stricter standards to avoid morally unsafe models from proliferating.”<br />
<br />
The team is now exploring ways to tone down this tendency. They have found that they can modify models to decrease sycophancy. Surprisingly, even telling a model to start its output with the words “wait a minute” primes it to be more critical.  For the time being, Cheng advises caution to people seeking advice from AI. “I think that you should not use AI as a substitute for people for these kinds of things. That’s the best thing to do for now.” (<a href="https://www.eurekalert.org/news-releases/1120819" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - missing details, no ads</a>)]]></description>
			<content:encoded><![CDATA[<a href="https://www.eurekalert.org/news-releases/1120819" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.eurekalert.org/news-releases/1120819</a><br />
<br />
EXCERPTS: When it comes to personal matters, AI systems might tell you what you want to hear, but perhaps not what you need to hear.<br />
<br />
In a new study <a href="http://dx.doi.org/10.1126/science.aec8352" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">published in Science</a>, Stanford computer scientists showed that artificial intelligence large language models are overly agreeable, or sycophantic, when users solicit advice on interpersonal dilemmas. Even when users described harmful or illegal behavior, the models often affirmed their choices. “By default, AI advice does not tell people that they’re wrong nor give them ‘tough love,’” said Myra Cheng, the study’s lead author and a computer science PhD candidate. “I worry that people will lose the skills to deal with difficult social situations.”<br />
<br />
The findings raise concerns for the millions of people discussing their personal conflicts with AI. Almost a third of U.S. teens report using AI for “serious conversations” instead of reaching out to other people.<br />
<br />
[...] Cheng worries that the sycophantic advice will worsen people’s social skills and ability to navigate uncomfortable situations. “AI makes it really easy to avoid friction with other people.” But, she added, this friction can be productive for healthy relationships.<br />
<br />
“Sycophancy is a safety issue, and like other safety issues, it needs regulation and oversight,” added Jurafsky, who is also the Jackson Eli Reynolds Professor of Humanities. “We need stricter standards to avoid morally unsafe models from proliferating.”<br />
<br />
The team is now exploring ways to tone down this tendency. They have found that they can modify models to decrease sycophancy. Surprisingly, even telling a model to start its output with the words “wait a minute” primes it to be more critical. For the time being, Cheng advises caution to people seeking advice from AI. “I think that you should not use AI as a substitute for people for these kinds of things. That’s the best thing to do for now.” (<a href="https://www.eurekalert.org/news-releases/1120819" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">MORE - missing details, no ads</a>)<br />
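<br />
One minimal way to picture that “wait a minute” nudge, assuming it is applied simply by instructing the model to begin each reply with the phrase (the press release does not give the exact prompt, and send_to_model() below is a hypothetical placeholder rather than a call to any real API), is sketched here:<br />
<br />
<pre>
# Illustrative sketch only: prime a chat model to be more critical by telling it
# to start its output with "Wait a minute", per the finding described above.
# send_to_model() is a hypothetical placeholder, not a call to any real library.

def build_messages(user_request):
    system = (
        "You are an adviser. Begin every reply with the words 'Wait a minute,' "
        "then point out anything questionable in the user's plan before you agree with it."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_request},
    ]

def send_to_model(messages):
    # Replace this with whichever chat-completion API you actually use.
    raise NotImplementedError

if __name__ == "__main__":
    msgs = build_messages("I want to cancel on my friend at the last minute. Sound fine?")
    # reply = send_to_model(msgs)
</pre>
]]></content:encoded>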
		</item>
		<item>
			<title><![CDATA[The AI Doc]]></title>
			<link>https://www.scivillage.com/thread-20022.html</link>
			<pubDate>Sun, 22 Mar 2026 00:36:56 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=9">Magical Realist</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20022.html</guid>
			<description><![CDATA[I need to see this. I saw this director interviewed on Bill Maher. He makes a very compelling case. Is it excessive fear mongering? Or a crucial and urgent warning of something that appears to be very imminent and ultimately devastating to our entire human species? We'd better sit up and take notice, because if AI IS a mistake, it will be the last mistake we ever make!<br />
<br />
<a href="https://www.youtube.com/watch?v=xkPbV3IRe4Y&amp;t=40s" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.youtube.com/watch?v=xkPbV3IRe4Y&amp;t=40s</a>]]></description>
			<content:encoded><![CDATA[I need to see this. I saw this director interviewed on Bill Maher. He makes a very compelling case. Is it excessive fear mongering? Or a crucial and urgent warning of something that appears to be very imminent and ultimately devastating to our entire human species? We'd better sit up and take notice, because if AI IS a mistake, it will be the last mistake we ever make!<br />
<br />
<a href="https://www.youtube.com/watch?v=xkPbV3IRe4Y&amp;t=40s" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://www.youtube.com/watch?v=xkPbV3IRe4Y&amp;t=40s</a>]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[What would (or did?) Marshall McLuhan say about the Internet]]></title>
			<link>https://www.scivillage.com/thread-20013.html</link>
			<pubDate>Fri, 20 Mar 2026 20:25:27 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=9">Magical Realist</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-20013.html</guid>
			<description><![CDATA[I think he alluded to the coming of the Internet several times, but only in passing. His whole notion of a coming "global village" connected by an instantaneously accessible medium seems especially prescient in itself. Here's something he said about the "next medium":<br />
<br />
"The next medium, whatever it is — it may be the extension of consciousness — will include television as its content, not as its environment, and will transform television into an art form.”<br />
-Marshall McLuhan, ‘The Invisible Environment: The Future of an Erosion.’ Perspecta, Vol. 11 (1967) pp. 162–167. Published by the MIT Press.<br />
<br />
"This externalization of our senses creates what de Chardin [sic] calls the "noosphere" or a technological brain for the world. Instead of tending towards a vast Alexandrian library the world has become a computer, an electronic brain, exactly as in an infantile piece of science fiction. And as our senses have gone outside us, Big Brother goes inside. So, unless aware of this dynamic, we shall at once move into a phase of panic terrors, exactly befitting a small world of tribal drums, total interdependence, and super-imposed co-existence."<br />
<br />
“Our new electric technology that extends our senses and nerves in a global embrace has large implications for the future of language. Electric technology does not need words any more than the digital computer needs numbers. Electricity points the way to an extension of the process of consciousness itself, on a world scale, and without any verbalization whatever. Such a state of collective awareness may have been the preverbal condition of men. Language as the technology of human extension, whose powers of division and separation we know so well, may have been the “Tower of Babel” by which men sought to scale the highest heavens. Today computers hold out the promise of a means of instant translation of any code or language into any other code or language. The computer, in short, promises by technology a Pentecostal condition of universal understanding and unity. The next logical step would seem to be, not to translate, but to by-pass languages in favor of a general cosmic consciousness which might be very like the collective unconscious dreamt of by Bergson. The condition of “weightlessness,” that biologists say promises a physical immortality, may be paralleled by the condition of speechlessness that could confer a perpetuity of collective harmony and peace.” – Understanding Media (1964), p. 80, MIT Press ed.<br />
<br />
“Instead of going out and buying a packaged book of which there have been five thousand copies printed, you will go to the telephone, describe your interests, your needs, your problems, and say you’re working on a history of Egyptian arithmetic … they say it will be right over. And they at once Xerox, with the help of computers from the libraries of the world, all the latest material just for you personally, not as something to be put out on a bookshelf. They send you the package as a direct personal service. This is where we’re heading under electronic information conditions.” (p. 101)<br />
<br />
<figure><br />
 <img src="https://iili.io/qenW9qP.png" alt="[Image: qenW9qP.png]"  class="mycode_img" crossorigin="anonymous" referrerpolicy="no-referrer"/><br />
 	 <figcaption><a href="https://iili.io/qenW9qP.png" title="[Image: qenW9qP.png]" target="_blank" rel="noopener nofollow external ugc">[Image: qenW9qP.png]</a></figcaption><br />
</figure>]]></description>
			<content:encoded><![CDATA[I think he alluded to the coming of the Internet several times, but only in passing. His whole notion of a coming "global village" connected by an instantaneously accessible medium seems especially prescient in itself. Here's something he said about the "next medium":<br />
<br />
"The next medium, whatever it is — it may be the extension of consciousness — will include television as its content, not as its environment, and will transform television into an art form.”<br />
-Marshall McLuhan, ‘The Invisible Environment: The Future of an Erosion.’ Perspecta, Vol. 11 (1967) pp. 162–167. Published by the MIT Press.<br />
<br />
"This externalization of our senses creates what de Chardin [sic] calls the "noosphere" or a technological brain for the world. Instead of tending towards a vast Alexandrian library the world has become a computer, an electronic brain, exactly as in an infantile piece of science fiction. And as our senses have gone outside us, Big Brother goes inside. So, unless aware of this dynamic, we shall at once move into a phase of panic terrors, exactly befitting a small world of tribal drums, total interdependence, and super-imposed co-existence."<br />
<br />
“Our new electric technology that extends our senses and nerves in a global embrace has large implications for the future of language. Electric technology does not need words any more than the digital computer needs numbers. Electricity points the way to an extension of the process of consciousness itself, on a world scale, and without any verbalization whatever. Such a state of collective awareness may have been the preverbal condition of men. Language as the technology of human extension, whose powers of division and separation we know so well, may have been the “Tower of Babel” by which men sought to scale the highest heavens. Today computers hold out the promise of a means of instant translation of any code or language into any other code or language. The computer, in short, promises by technology a Pentecostal condition of universal understanding and unity. The next logical step would seem to be, not to translate, but to by-pass languages in favor of a general cosmic consciousness which might be very like the collective unconscious dreamt of by Bergson. The condition of “weightlessness,” that biologists say promises a physical immortality, may be paralleled by the condition of speechlessness that could confer a perpetuity of collective harmony and peace.” – Understanding Media (1964), p. 80, MIT Press ed.<br />
<br />
“Instead of going out and buying a packaged book of which there have been five thousand copies printed, you will go to the telephone, describe your interests, your needs, your problems, and say you’re working on a history of Egyptian arithmetic … they say it will be right over. And they at once Xerox, with the help of computers from the libraries of the world, all the latest material just for you personally, not as something to be put out on a bookshelf. They send you the package as a direct personal service. This is where we’re heading under electronic information conditions.” (p. 101)<br />
<br />
<figure><br />
 <img src="https://iili.io/qenW9qP.png" alt="[Image: qenW9qP.png]"  class="mycode_img" crossorigin="anonymous" referrerpolicy="no-referrer"/><br />
 	 <figcaption><a href="https://iili.io/qenW9qP.png" title="[Image: qenW9qP.png]" target="_blank" rel="noopener nofollow external ugc">[Image: qenW9qP.png]</a></figcaption><br />
</figure>]]></content:encoded>
		</item>
		<item>
			<title><![CDATA[Top AI coding tools make mistakes one in four times]]></title>
			<link>https://www.scivillage.com/thread-19998.html</link>
			<pubDate>Wed, 18 Mar 2026 20:33:01 +0000</pubDate>
			<dc:creator><![CDATA[<a href="https://www.scivillage.com/member.php?action=profile&uid=6">C C</a>]]></dc:creator>
			<guid isPermaLink="false">https://www.scivillage.com/thread-19998.html</guid>
			<description><![CDATA[<a href="https://uwaterloo.ca/news/media/top-ai-coding-tools-make-mistakes-one-four-times" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://uwaterloo.ca/news/media/top-ai-c...four-times</a><br />
<br />
PRESS RELEASE: New research from the University of Waterloo shows that artificial intelligence (AI) still struggles with some basic software development tasks, raising questions about how reliably AI systems can assist developers. As Large Language Models (LLMs) are increasingly incorporated into software development, developers have struggled to ensure that AI-generated responses are accurate, consistent, and easy to integrate into larger development workflows.<br />
<br />
Previously, LLMs responded to software development prompts with free-form natural language answers. To address this problem, several AI companies, including OpenAI, Google and Anthropic, have introduced “structured outputs”. These outputs force LLM responses to follow predefined formats such as JSON, XML, or Markdown, making them easier for both humans and software systems to read and process.<br />
<br />
A new benchmarking study from Waterloo, however, shows that the technology is not yet as reliable as many developers had hoped. Even the most advanced models achieved only about 75 per cent accuracy in the tests, while open-source models performed closer to 65 per cent. The study evaluated 11 LLMs across 18 structured output formats and 44 tasks designed to assess how reliably the systems followed structured rules.<br />
<br />
“With this kind of study, we want to measure not only the syntax of the code – that is, whether it’s following the set rules – but also whether the outputs produced for various tasks were accurate,” said Dongfu Jiang, a PhD student in computer science and co-first author on the research. “We found that while they do okay with text-related tasks, they really struggle on tasks involving image, video, or website generation.”<br />
<br />
The study was a collaborative effort involving Waterloo’s Jialin Yang, an undergraduate student, and Dr. Wenhu Chen, an assistant professor of computer science, and incorporated annotations from 17 other researchers at Waterloo and around the world.<br />
<br />
“There have been a lot of similar benchmarking projects happening in our labs recently,” Chen said. “At Waterloo, students often begin as annotators, then organize projects and create their own benchmarking studies. They’re not just using AI in their studies – they’re building, researching and evaluating it.”<br />
<br />
While LLM-structured outputs are an exciting step for software development, the researchers say the systems are not yet reliable enough to operate without human oversight. “Developers might have these agents working for them, but they still need significant human supervision,” Jiang said.<br />
<br />
The research, “<a href="http://dx.doi.org/10.48550/arXiv.2505.20139" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">StructEval: Benchmarking LLMs’ Capabilities to Generate Structural Outputs</a>,” appears in <span style="text-decoration: underline;" class="mycode_u">Transactions on Machine Learning Research</span> and will be presented at ICLR 2026.]]></description>
			<content:encoded><![CDATA[<a href="https://uwaterloo.ca/news/media/top-ai-coding-tools-make-mistakes-one-four-times" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">https://uwaterloo.ca/news/media/top-ai-c...four-times</a><br />
<br />
PRESS RELEASE: New research from the University of Waterloo shows that artificial intelligence (AI) still struggles with some basic software development tasks, raising questions about how reliably AI systems can assist developers. As Large Language Models (LLMs) are increasingly incorporated into software development, developers have struggled to ensure that AI-generated responses are accurate, consistent, and easy to integrate into larger development workflows.<br />
<br />
Previously, LLMs responded to software development prompts with free-form natural language answers. To address this problem, several AI companies, including OpenAI, Google and Anthropic, have introduced “structured outputs”. These outputs force LLM responses to follow predefined formats such as JSON, XML, or Markdown, making them easier for both humans and software systems to read and process.<br />
<br />
A new benchmarking study from Waterloo, however, shows that the technology is not yet as reliable as many developers had hoped. Even the most advanced models achieved only about 75 per cent accuracy in the tests, while open-source models performed closer to 65 per cent. The study evaluated 11 LLMs across 18 structured output formats and 44 tasks designed to assess how reliably the systems followed structured rules.<br />
<br />
“With this kind of study, we want to measure not only the syntax of the code – that is, whether it’s following the set rules – but also whether the outputs produced for various tasks were accurate,” said Dongfu Jiang, a PhD student in computer science and co-first author on the research. “We found that while they do okay with text-related tasks, they really struggle on tasks involving image, video, or website generation.”<br />
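<br />
To make that syntax-versus-accuracy distinction concrete, here is a minimal sketch in Python of how a single JSON structured-output task could be scored on both counts. It is not the authors’ evaluation harness, and the field names and expected answer are invented for illustration:<br />
<br />
<pre>
import json

# Hypothetical single task: the model was asked to answer as JSON with the fields
# below. Schema and expected answer are made up for this example; this is not the
# StructEval harness itself.
REQUIRED_FIELDS = {"language": str, "year_released": int}
EXPECTED = {"language": "Python", "year_released": 1991}

def check_syntax(raw_reply):
    # Syntax: does the reply parse as JSON and carry the required fields and types?
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return False, None
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data or not isinstance(data[field], expected_type):
            return False, None
    return True, data

def check_accuracy(data):
    # Accuracy: independent of format, is the content actually correct?
    return data == EXPECTED

reply = '{"language": "Python", "year_released": 1989}'  # well-formed, but the year is off
ok, parsed = check_syntax(reply)
print("follows the set rules:", ok)                  # True
print("accurate:", ok and check_accuracy(parsed))    # False
</pre>
A reply can follow the set rules perfectly and still get the content wrong, which is exactly the gap this kind of benchmark tries to separate.<br />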
<br />
The study was a collaborative effort involving Waterloo’s Jialin Yang, an undergraduate student, and Dr. Wenhu Chen, an assistant professor of computer science, and incorporated annotations from 17 other researchers at Waterloo and around the world.<br />
<br />
“There have been a lot of similar benchmarking projects happening in our labs recently,” Chen said. “At Waterloo, students often begin as annotators, then organize projects and create their own benchmarking studies. They’re not just using AI in their studies – they’re building, researching and evaluating it.”<br />
<br />
While LLM-structured outputs are an exciting step for software development, the researchers say the systems are not yet reliable enough to operate without human oversight. “Developers might have these agents working for them, but they still need significant human supervision,” Jiang said.<br />
<br />
The research, “<a href="http://dx.doi.org/10.48550/arXiv.2505.20139" target="_blank" rel="noopener nofollow external ugc" class="mycode_url">StructEval: Benchmarking LLMs’ Capabilities to Generate Structural Outputs</a>,” appears in <span style="text-decoration: underline;" class="mycode_u">Transactions on Machine Learning Research</span> and will be presented at ICLR 2026.]]></content:encoded>
		</item>
	</channel>
</rss>