How bodies get smarts + New link to an old model could crack mystery of deep learning

#1
C C
How bodies get smarts: Simulating the evolution of embodied intelligence
https://scienceblog.com/525899/how-bodie...elligence/

INTRO: Animals have embodied smarts: They perform tasks that their bodies are well designed for. That’s because the intelligence of every animal species evolved in tandem with its physical form as it interacted with its environment. Thus, spiders weave webs with their spindly legs, beavers slap their broad tails to sound an alarm, cheetahs run fast to catch zebras, and humans have opposable thumbs for grasping tools.

Artificial intelligence is quite smart as well. But unlike animal smarts, AI is often disembodied. Natural language processing and other types of machine learning, for example, are typically done on silicon chips inside computers, with no physical manifestation in the world. And while computer vision requires cameras or sensors, it usually does so independently of any physical form.

A team of researchers at Stanford wondered: Does embodiment matter for the evolution of intelligence? And if so, how might computer scientists make use of embodiment to create smarter AIs?

To answer these questions, they created a computer-simulated playground where arthropod-like agents dubbed “unimals” (short for universal animals and pronounced “yoo-nimals”) learn and are subjected to mutations and natural selection. The researchers then studied how having virtual bodies affected the evolution of the unimals’ intelligence.

“We’re often so focused on intelligence being a function of the human brain and of neurons specifically,” says Fei-Fei Li, a member of the research team and co-director of the Stanford Institute for Human-Centered AI (HAI). “Viewing intelligence as something that is physically embodied is a different paradigm.”

Their findings, detailed in the journal Nature Communications, suggest embodiment is key to the evolution of intelligence: The virtual creatures’ body shapes affected their ability to learn new tasks, and morphologies that learned and evolved in more challenging environments, or while performing more complex tasks, learned faster and better than those from simpler surroundings. In the study, unimals with the most successful morphologies also picked up tasks faster than previous generations — even though they began their existence with the same level of baseline intelligence as those that came before... (MORE)


A new link to an old model could crack the mystery of deep learning
https://www.quantamagazine.org/a-new-lin...-20211011/

INTRO: In the machine learning world, the sizes of artificial neural networks — and their outsize successes — are creating conceptual conundrums. When a network named AlexNet won an annual image recognition competition in 2012, it had about 60 million parameters. These parameters, fine-tuned during training, allowed AlexNet to recognize images that it had never seen before. Two years later, a network named VGG wowed the competition with more than 130 million such parameters. Some artificial neural networks, or ANNs, now have billions of parameters.

These massive networks — astoundingly successful at tasks such as classifying images, recognizing speech and translating text from one language to another — have begun to dominate machine learning and artificial intelligence. Yet they remain enigmatic: the reason behind their amazing power is still elusive.

But a number of researchers are showing that idealized versions of these powerful networks are mathematically equivalent to older, simpler machine learning models called kernel machines. If this equivalence can be extended beyond idealized neural networks, it may explain how practical ANNs achieve their astonishing results.

Part of the mystique of artificial neural networks is that they seem to subvert traditional machine learning theory, which leans heavily on ideas from statistics and probability theory. In the usual way of thinking, machine learning models — including neural networks, trained to learn about patterns in sample data in order to make predictions about new data — work best when they have just the right number of parameters.

If the parameters are too few, the learned model can be too simple and fail to capture all the nuances of the data it’s trained on. Too many and the model becomes overly complex, learning the patterns in the training data with such fine granularity that it cannot generalize when asked to classify new data, a phenomenon called overfitting. “It’s a balance between somehow fitting your data too well and not fitting it well at all. You want to be in the middle,” said Mikhail Belkin, a machine learning researcher at the University of California, San Diego.
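The underfitting/overfitting trade-off Belkin describes is easy to see in a toy setting. A minimal sketch (not from the article — the data, degrees, and noise level are all illustrative choices) fits polynomials of increasing degree to noisy samples of a sine wave: too few parameters and the fit misses the curve, too many and it chases the noise and fails on held-out points.

```python
import numpy as np

rng = np.random.default_rng(0)

# 15 noisy training samples of a smooth target function.
x_train = np.linspace(0, 1, 15)
y_train = np.sin(2 * np.pi * x_train) + 0.3 * rng.standard_normal(15)

# A dense noise-free grid serves as "new data" for testing.
x_test = np.linspace(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test)

def fit_and_eval(degree):
    """Least-squares polynomial fit; return (train MSE, test MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

for d in (1, 4, 12):
    tr, te = fit_and_eval(d)
    print(f"degree {d:2d}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

Training error falls steadily as the degree grows, but test error is lowest in the middle — the "balance" Belkin refers to. The puzzle in the article is that deep networks sit far to the right of this picture yet still generalize.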

By all accounts, deep neural networks like VGG have way too many parameters and should overfit. But they don’t. Instead, such networks generalize astoundingly well to new data — and until recently, no one knew why. It wasn’t for lack of trying. For example, Naftali Tishby, a computer scientist and neuroscientist at the Hebrew University of Jerusalem who died in August, argued that deep neural networks first fit the training data and then discard irrelevant information (by going through an information bottleneck), which helps them generalize. But others have argued that this doesn’t happen in all types of deep neural networks, and the idea remains controversial.

Now, the mathematical equivalence of kernel machines and idealized neural networks is providing clues to why or how these over-parameterized networks arrive at (or converge to) their solutions. Kernel machines are algorithms that find patterns in data by projecting the data into extremely high dimensions. By studying the mathematically tractable kernel equivalents of idealized neural networks, researchers are learning why deep nets, despite their shocking complexity, converge during training to solutions that generalize well to unseen data.
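To make "projecting the data into extremely high dimensions" concrete: a kernel machine never builds that projection explicitly; it only evaluates a kernel function measuring similarity between pairs of points, which implicitly compares them in a very high- (even infinite-) dimensional space. A minimal NumPy sketch of kernel ridge regression with an RBF kernel follows — illustrative only, since the equivalence in the research involves specific kernels derived from idealized networks (such as the neural tangent kernel), not this generic example; the data, `gamma`, and `lam` values are arbitrary choices.

```python
import numpy as np

def rbf_kernel(A, B, gamma=10.0):
    """RBF (Gaussian) kernel: similarity of points, implicitly an
    inner product in an infinite-dimensional feature space."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(40, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(40)

# Kernel ridge regression: solve (K + lam*I) alpha = y,
# then predict with weighted kernel similarities to training points.
lam = 1e-3
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)

X_new = np.array([[0.25], [0.75]])
pred = rbf_kernel(X_new, X) @ alpha
print(pred)  # targets: sin(2*pi*0.25) = 1, sin(2*pi*0.75) = -1
```

The entire model is a linear solve plus kernel evaluations — mathematically tractable in a way a deep network is not, which is exactly why the kernel equivalents of idealized networks are useful for analysis.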

“A neural network is a little bit like a Rube Goldberg machine. You don’t know which part of it is really important,” said Belkin. “I think reducing [them] to kernel methods — because kernel methods don’t have all this complexity — somehow allows us to isolate the engine of what’s going on.” (MORE)




