Are we witnessing the dawn of post-theory science?

C C Offline

EXCERPTS: . . . machine learning tools predict your preferences [...] You can’t lift a curtain and peer into the mechanism. They offer up no explanation, no set of rules for converting this into that – no theory, in a word. They just work and do so well.

[...] In 2008, Chris Anderson, the then editor-in-chief of Wired magazine, predicted the demise of theory. So much data had accumulated, he argued, and computers were already so much better than us at finding relationships within it, that our theories were being exposed for what they were – oversimplifications of reality.

Soon, the old scientific method – hypothesise, predict, test – would be relegated to the dustbin of history. We’d stop looking for the causes of things and be satisfied with correlations.

With the benefit of hindsight, we can say that what Anderson saw is true (he wasn’t alone). The complexity that this wealth of data has revealed to us cannot be captured by theory as traditionally understood. “We have leapfrogged over our ability to even write the theories that are going to be useful for description,” says computational neuroscientist Peter Dayan, director of the Max Planck Institute for Biological Cybernetics in Tübingen, Germany. “We don’t even know what they would look like.”

But Anderson’s prediction of the end of theory looks to have been premature – or maybe his thesis was itself an oversimplification. There are several reasons why theory refuses to die, despite the successes of such theory-free prediction engines as Facebook and AlphaFold. All are illuminating, because they force us to ask: what’s the best way to acquire knowledge and where does science go from here?

The first reason is that we’ve realised that artificial intelligences (AIs), particularly a form of machine learning called neural networks, which learn from data without having to be fed explicit instructions, are themselves fallible. Think of the prejudice that has been documented in Google’s search engines and Amazon’s hiring tools.

The second is that humans turn out to be deeply uncomfortable with theory-free science. We don’t like dealing with a black box – we want to know why.

And third, there may still be plenty of theory of the traditional kind – that is, graspable by humans – that usefully explains much but has yet to be uncovered.

So theory isn’t dead, yet, but it is changing – perhaps beyond recognition. “The theories that make sense when you have huge amounts of data look quite different from those that make sense when you have small amounts,” says Tom Griffiths, a psychologist at Princeton University.

Griffiths has been using neural nets to help him improve on existing theories in his domain, which is human decision-making. [...] They found that prospect theory did pretty well, but the neural net showed its worth in highlighting where the theory broke down, that is, where its predictions failed.

These counter-examples were highly informative, Griffiths says, because they revealed more of the complexity that exists in real life. [...] “We’re basically using the machine learning system to identify those cases where we’re seeing something that’s inconsistent with our theory,” Griffiths says. The bigger the dataset, the more inconsistencies the AI learns.

The end result is not a theory in the traditional sense of a precise claim about how people make decisions, but a set of claims that is subject to certain constraints. A way to picture it might be as a branching tree of “if… then”-type rules, which is difficult to describe mathematically, let alone in words.

What the Princeton psychologists are discovering is still just about explainable, by extension from existing theories. But as they reveal more and more complexity, it will become less so – the logical culmination of that process being the theory-free predictive engines embodied by Facebook or AlphaFold... (MORE - missing details)
C C Offline
Physics and the machine-learning “black box”

RELEASE: Machine-learning algorithms are often referred to as a “black box.” Once data are put into an algorithm, it’s not always known exactly how the algorithm arrives at its prediction. This can be particularly frustrating when things go wrong. A new mechanical engineering (MechE) course at MIT teaches students how to tackle the “black box” problem, through a combination of data science and physics-based engineering.

In class 2.C161 (Physical Systems Modeling and Design Using Machine Learning), Professor George Barbastathis demonstrates how mechanical engineers can use their unique knowledge of physical systems to keep algorithms in check and develop more accurate predictions.

“I wanted to take 2.C161 because machine-learning models are usually a “black box,” but this class taught us how to construct a system model that is informed by physics so we can peek inside,” explains Crystal Owens, a mechanical engineering graduate student who took the course in spring 2021.

As chair of the Committee on the Strategic Integration of Data Science into Mechanical Engineering, Barbastathis has had many conversations with mechanical engineering students, researchers, and faculty to better understand the challenges and successes they’ve had using machine learning in their work.

“One comment we heard frequently was that these colleagues can see the value of data science methods for problems they are facing in their mechanical engineering-centric research; yet they are lacking the tools to make the most out of it,” says Barbastathis. “Mechanical, civil, electrical, and other types of engineers want a fundamental understanding of data principles without having to convert themselves to being full-time data scientists or AI researchers.”

Additionally, as mechanical engineering students move on from MIT to their careers, many will need to manage data scientists on their teams someday. Barbastathis hopes to set these students up for success with class 2.C161.

Bridging MechE and the MIT Schwartzman College of Computing. Class 2.C161 is part of the MIT Schwartzman College of Computing “Computing Core.” The goal of these classes is to connect data science and physics-based engineering disciplines, like mechanical engineering. Students take the course alongside 6.C402 (Modeling with Machine Learning: from Algorithms to Applications), taught by professors of electrical engineering and computer science Regina Barzilay and Tommi Jaakkola.

The two classes are taught concurrently during the semester, exposing students to both fundamentals in machine learning and domain-specific applications in mechanical engineering.

In 2.C161, Barbastathis highlights how complementary physics-based engineering and data science are. Physical laws present a number of ambiguities and unknowns, ranging from temperature and humidity to electromagnetic forces. Data science can be used to predict these physical phenomena. Meanwhile, having an understanding of physical systems helps ensure the resulting output of an algorithm is accurate and explainable.

“What’s needed is a deeper combined understanding of the associated physical phenomena and the principles of data science, machine learning in particular, to close the gap,” adds Barbastathis. “By combining data with physical principles, the new revolution in physics-based engineering is relatively immune to the “black box” problem facing other types of machine learning.”

Equipped with a working knowledge of machine-learning topics covered in class 6.C402 and a deeper understanding of how to pair data science with physics, students are charged with developing a final project that solves for an actual physical system.

Developing solutions for real-world physical systems. For their final project, students in 2.C161 are asked to identify a real-world problem that requires data science to address the ambiguity inherent in physical systems. After obtaining all relevant data, students are asked to select a machine-learning method, implement their chosen solution, and present and critique the results.

Topics this past semester ranged from weather forecasting to the flow of gas in combustion engines, with two student teams drawing inspiration from the ongoing Covid-19 pandemic.

Owens and her teammates, fellow graduate students Arun Krishnadas and Joshua David John Rathinaraj, set out to develop a model for the Covid-19 vaccine rollout.

“We developed a method of combining a neural network with a susceptible-infected-recovered (SIR) epidemiological model to create a physics-informed prediction system for the spread of Covid-19 after vaccinations started,” explains Owens.

The team accounted for various unknowns including population mobility, weather, and political climate. This combined approach resulted in a prediction of Covid-19’s spread during the vaccine rollout that was more reliable than using either the SIR model or a neural network alone.

Another team, including graduate student Yiwen Hu, developed a model to predict mutation rates in Covid-19, a topic that became all too pertinent as the delta variant began its global spread.

“We used machine learning to predict the time-series-based mutation rate of Covid-19, and then incorporated that as an independent parameter into the prediction of pandemic dynamics to see if it could help us better predict the trend of the Covid-19 pandemic,” says Hu.

Hu, who had previously conducted research into how vibrations on coronavirus protein spikes affect infection rates, hopes to apply the physics-based machine-learning approaches he learned in 2.C161 to his research on de novo protein design.

Whatever the physical system students addressed in their final projects, Barbastathis was careful to stress one unifying goal: the need to assess ethical implications in data science. While more traditional computing methods like face or voice recognition have proven to be rife with ethical issues, there is an opportunity to combine physical systems with machine learning in a fair, ethical way.

“We must ensure that collection and use of data are carried out equitably and inclusively, respecting the diversity in our society and avoiding well-known problems that computer scientists in the past have run into,” says Barbastathis.

Barbastathis hopes that by encouraging mechanical engineering students to be both ethics-literate and well-versed in data science, they can move on to develop reliable, ethically sound solutions and predictions for physical-based engineering challenges.

Possibly Related Threads…
Thread Author Replies Views Last Post
  How did the chaos of chemicals become ordered biology at the dawn of life on Earth? C C 0 74 Mar 3, 2021 12:12 AM
Last Post: C C
  Post-medieval 'vampires' were likely local C C 0 614 Dec 3, 2014 06:09 AM
Last Post: C C

Users browsing this thread: 1 Guest(s)