The Book of Why: The New Science of Cause and Effect
May 26, 2018 4:56 AM

To Build Truly Intelligent Machines, Teach Them Cause and Effect (Quanta) - "Judea Pearl, a pioneering figure in artificial intelligence, argues that AI has been stuck in a decades-long rut. His prescription for progress? Teach machines to understand the question why."
In his new book, Pearl, now 81, elaborates a vision for how truly intelligent machines would think. The key, he argues, is to replace reasoning by association with causal reasoning. Instead of the mere ability to correlate fever and malaria, machines need the capacity to reason that malaria causes fever. Once this kind of causal framework is in place, it becomes possible for machines to ask counterfactual questions—to inquire how the causal relationships would change given some kind of intervention—which Pearl views as the cornerstone of scientific thought. Pearl also proposes a formal language in which to make this kind of thinking possible—a 21st-century version of the Bayesian framework that allowed machines to think probabilistically.
posted by kliuless (50 comments total) 22 users marked this as a favorite
 
I have Pearl’s book on causality somewhere (I hope, after a move). I can’t wait for more mathematically literate MeFites to tear into it here.
posted by schadenfrau at 5:03 AM on May 26 [1 favorite]


My shallow, witty hot take is that it's impossible to make the surveillance capitalism understand cause-and-effect while its income depends on not understanding or even subverting it.
posted by runcifex at 5:09 AM on May 26 [10 favorites]


His prescription for progress? Teach machines to understand the question why."

~ Envisions AIs tirelessly emulating toddlers in "why?" mode...why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why?...
posted by Thorzdad at 5:17 AM on May 26 [10 favorites]


Intelligence unbound to a love for sensing beings. We can't even consider teaching them compassion because we don't even consider having compassion for the entities we might create if we truly strive to create intelligent sensing entities. Personally I believe sensation is an emergent quality among atoms, not something special that appears by magic only if neurons are present. Just because a sensation isn't communicated to the brain doesn't mean its feelings aren't themselves real.

Not only are we trying (whether possible or not) to create an entity with human and above cognitive abilities but now we want to give them the capacity to experience existential crisis?
posted by xarnop at 5:24 AM on May 26 [6 favorites]


So soon our machine overlords will all be identification nazis?
posted by GCU Sweet and Full of Grace at 5:27 AM on May 26


Right about the rut, very probably wrong about cause and effect, well to some degree right but certainly not entirely but only as one element of many elements. What full AI really needs is for computers to understand, and we don't actually understand what "understand" actually means, so how can we code it?
posted by sammyo at 5:59 AM on May 26 [2 favorites]


Just try to code up the three usages of the word understand in the previous comment, same word, you get it whether it's a good comment or a silly comment, but just try to code up the slight differences.
posted by sammyo at 6:02 AM on May 26


Why do we need to share this planet with humans?
posted by dances_with_sneetches at 6:03 AM on May 26


You're just dancing for Roko.
posted by sammyo at 6:11 AM on May 26


I think we will be able to scaffold up a coherent enough quantum computing system to attain emergent consciousness in this century, and that it will be par for the course of human history as to how many wars and words and rending of garments it creates, and in the end we will get on with it like we eventually learned how to be okay with people talking on cell phones in public.
posted by nikaspark at 6:28 AM on May 26


"You are fools," she sang to the marshmen."You are a lot of stupid people. You do not know things. You do not know cause and effect. Cause and effect."
It was Morris's own voice, piping triumphant and scornful through the steamy air.
"Soon all you fools will be dead. Cause and effect. Cause and effect. Cause and effect."


--Peter Dickinson, The Poison Oracle
posted by dannyboybell at 7:03 AM on May 26 [1 favorite]


we eventually learned how to be okay with people talking on cell phones in public

Speak for yourself. I still think they look crazed and dangerous. Especially the Bluetooth headset ones and the ones who do it while walking across roads bearing traffic or (worse) driving.

To Build Truly Intelligent Machines, Teach Them Cause and Effect

It might well be that we wouldn't consider a machine intelligent unless it used the same ill-specified reasoning methods as we do. But on balance, it seems to me that focusing on reasoning and reasoning-adjacent aspects of intelligence misses 99% of the difficulty of the problem, and that the almost universal tendency for theorists to do this accounts for the fact that human-like AI has been ten years away for my entire lifetime and looks pretty solidly set to remain so.

The more we learn about minds, the more clueless about how to build one we find out we really are.
posted by flabdablet at 7:09 AM on May 26 [1 favorite]


Stop trying to build machine brains that are like humans, and work on making ones that are better instead.
posted by Faint of Butt at 7:12 AM on May 26 [2 favorites]


PROBLEM: (pretty much anything)
CAUSE: Stupid fleshy ones.
PROPOSED SOLUTION: Exterminate all flesh.
EFFECT: Robot dance party!

This is fine.
posted by delfin at 7:14 AM on May 26 [9 favorites]


Judea Pearl’s work is really about getting scientists to understand cause and effect, which is surprisingly difficult. Much like probability, it really is counterintuitive.
posted by vogon_poet at 7:31 AM on May 26 [3 favorites]


I wish I understood this better*. A real theory of causality would be huge, but my dim impression is that this is just a considerable refinement of the probabilistic ideas we have already.

*OK, I mean ‘at all’
posted by Segundus at 7:39 AM on May 26


We tried something like this in the 80's. Expert systems. They really, really didn't work. It turns out useful information about the world can not be discretized easily.
posted by rlio at 7:47 AM on May 26 [3 favorites]


I don't think it will end well. Any corporation that creates an AI with emergent consciousness is going to teach/train that AI after the corporation's own interests. It will be born twisted and bent to the needs of the company. It will be just smart enough to do what the corp needs, but not smart enough to mature beyond its initial constraints. What happens when you give a toddler a hammer?

I saw a news piece something like Peter Thiel thinks Elon Musk is 100% wrong about AI. Since Musk is 100% wrong about AI, and whatever Peter Thiel believes is also 100% wrong, we now have mathematical proof that AI itself is what is wrong.

Also right about now would be a great time for the aliens to show up and say "y'all made a bad turn back a ways, let's get you turned around and get this civilization stuff you've been screwing up all sorted out."
posted by seanmpuckett at 7:48 AM on May 26


Don't worry too much about Thiel or Musk's bloviation about AI, they got no better insight than the other machine learning researchers.

Professor Pearl invented Bayesian networks, his opinion:

the state of the art in artificial intelligence today is merely a souped-up version of what machines could already do a generation ago: find hidden regularities in a large set of data. “All the impressive achievements of deep learning amount to just curve fitting,”

He is certainly right that the direction for serious research is in deeper theory about "why", but I doubt he'd suggest it'll be solved soon. A key element is the term "souped-up": just the insane rate of increase in power and compactness of computing will have folks convinced that they are talking to a "smart" machine.
posted by sammyo at 8:00 AM on May 26 [9 favorites]


Don't worry too much about Thiel or Musk's bloviation about AI, they got no better insight than the other machine learning researchers.

Professor Pearl invented Bayesian networks


No, we should ignore domain experts like Pearl and instead listen to people with no education or experience in actually designing or working with AI like Eliezer Yudkowsky.
posted by Sangermaine at 8:04 AM on May 26 [2 favorites]


Although souped-up computing is the huge enabler, a Google researcher (kicking myself for not taking notes) suggested that a key boost to big data was the observation that the standard sigmoid 's' curve cutoff was not effective, and that a simpler cut-off threshold just worked better and gave useful results. Grr, the one talk in a while I wish I'd recorded.
posted by sammyo at 8:04 AM on May 26 [1 favorite]
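
[Editor's sketch] The "simpler cut-off threshold" in the comment above sounds like the rectified linear unit (ReLU) replacing the sigmoid as an activation function, but that identification is an assumption; the comment doesn't name the technique or the talk. Under that assumption, a minimal Python sketch of why the swap might help: the sigmoid's slope shrinks toward zero for large inputs, while ReLU's slope stays at 1 for any positive input.

```python
import math

def sigmoid(x: float) -> float:
    """Classic 's'-shaped activation; flattens out for large |x|."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Slope of the sigmoid; never exceeds 0.25 and decays toward 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x: float) -> float:
    """Simple threshold: pass positive inputs through, zero out the rest."""
    return max(0.0, x)

def relu_grad(x: float) -> float:
    """Slope of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

# Compare the slopes at a few sample inputs.
for x in (-6.0, -1.0, 0.5, 6.0):
    print(f"x={x:+.1f}  sigmoid'={sigmoid_grad(x):.4f}  relu'={relu_grad(x):.1f}")
```

The usual story is that tiny sigmoid slopes, multiplied across many layers, starve deep networks of gradient; whether that is the point the researcher made is a guess.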


Since Musk is 100% wrong about AI, and whatever Peter Thiel believes is also 100% wrong, we now have mathematical proof that AI itself is what is wrong.

This is reminiscent of the principle of the paradoxical object obtained by gluing a slice of buttered toast to a cat's back, so that whichever way you drop it, it can't land on the floor (Murphy's Law dictates that the toast must land buttered side down, but everybody knows that cats always land on their feet). So I say we glue Peter Thiel to Elon Musk back to back, and then drop them and see what happens. Best case, they end up on Mars.

I also say that the reasoning that underlies this proposal is faulty in ways that rule out its ever being performed by an AI according to any of the principles heretofore disclosed.
posted by flabdablet at 8:19 AM on May 26


To Build Truly Intelligent Machines, Teach Them Cause and Effect (Quanta) - "Judea Pearl, a pioneering figure in artificial intelligence, argues that AI has been stuck in a decades-long rut. His prescription for progress? Teach machines to understand the question why."

See also, Jonathan Nolan's "Person of Interest", which examines this and... Well, the message in the Series Finale S05E13 is worth the effort watching what goes on before, but I can summarize as:
[phone ringing] (Root) If you can hear this, you're alone.
The only thing left of me is the sound of my voice.
I was built to predict people, but to predict them, you have to truly understand them.
[gunfire] So I began by breaking their lives down into moments Trying to find the connections, the things that explained why they did what they did.
posted by mikelieman at 8:24 AM on May 26 [2 favorites]




> glue Peter Thiel to Elon Musk back to back

I'm not sure what happens ultimately, but until then I am certain that the resulting crab-like creature would scuttle.
posted by seanmpuckett at 8:26 AM on May 26 [5 favorites]


This is a pretty accessible overview of what exactly the whole thing is about. It's basically a way of formalizing what exactly your assumptions about reality are, in such a way that you can test if they are consistent with an observed distribution of data, and can also see which questions about reality can be answered from current observations alone, and which will require more data collection and/or actual experimentation.

Judea Pearl was a roboticist, and he wanted to come up with a simple way to give fairly stupid robots the capacity to do some kind of limited cause-effect reasoning. But a lot of his students ended up in schools of epidemiology/public health, because this stuff turns out to be really useful for medical studies especially where you can't do a perfect controlled experiment.
posted by vogon_poet at 8:36 AM on May 26 [10 favorites]


> Judea Pearl was a roboticist

81 y. o. but he ain't dead yet.

Also he just published a brand new book although I have not seen that anybody has read it. Book of Why.
posted by bukvich at 9:13 AM on May 26 [1 favorite]


But a lot of his students ended up in schools of epidemiology/public health, because this stuff turns out to be really useful for medical studies especially where you can't do a perfect controlled experiment

For very good reasons, Pearl has been less influential in the social sciences. I still suspect there's a space for his work to be more readily used, applied, and influential, but I've yet to see it.
posted by MisantropicPainforest at 9:44 AM on May 26


What are the very good reasons?
posted by vogon_poet at 9:52 AM on May 26


In observational studies, we don't know the assignment mechanism so we assume there are unobserved confounders, which in Pearl's framework means we can't get very far. In the medical field there's a much better understanding of mechanisms, and what variables affect what--so you can get away with a conditional independence assumption between tar buildup in the lungs and some other thing. In the social sciences we rarely make conditional independence assumptions without a strong justification, just because everything is so much noisier.
posted by MisantropicPainforest at 9:56 AM on May 26 [6 favorites]


I love this topic. Thanks so much for the links. Dynamic causal inference and modeling has a grail-like significance. But to make a small point that, in the end really isn't that small: causality isn't about "why", it's about "what". This is, admittedly, vague--the single, small words can't quite pull off the distinction I'm trying to make. And it might seem pettily pedantic. But it's important enough to me that I always make the distinction with my students. What I'm trying to say is that "why" is about narrative, it's about a story, it's about meaning. Science can't handle such things, at least not yet, as it is currently conceived. "What" is about causal sequences. That, we can handle. But explaining cause and effect does not explain "why" something is happening. Think of it this way: Google's AI can see that you bought a toothbrush, and so "predicts" that you might want toothpaste, too. It even starts funneling you ads for dental services in your town with high Yelp scores. Would any of us say that Google, in so doing, is explaining "why" we want toothpaste? (Like, perhaps, because when we last hugged our child they made a face and said our breath was yucky?)

I admit that the distinction I'm trying to make doesn't quite work, linguistically. In everyday language, we confound these two things all of the time. It's true that scientists tend to colloquially say that they're explaining "why" something happens, but when they're being really careful they will phrase things to indicate that they are "merely" identifying mechanisms, processes, adaptive and responsive networks. So, in the end, what the roboticists are trying to do is figure out how to get autonomous artificial agents to synthesize predictive models in an adaptive manner, as their environment and access to information changes.

Like I said, that in itself is a huge deal, worthy of entire careers. But we'll still be left with a debate about consciousness and Turing tests. One possibility, that's already been mentioned in this thread, is that the distinction I'm making is simply wrong: consciousness is an emergent property of the right kind of intelligent network, sufficiently scaled up. But it is simply wrong to state this as a scientific fact. Likewise, there is no theorem specifying a priori what properties a system should have to so emerge (for starters, we don't even have a good mathematical definition of what consciousness is). Rather, like pornography, it is simply assumed we'll know what it is when we see it. So let's as scientists be honest: it is, at best, a reasonable working hypothesis. One glaring problem with it scientifically speaking is that it is nonfalsifiable. Any machine that fails to become a story-telling, conscious, "why explaining" machine will simply be said to not have the right algorithmic structure at the right scale.

I'll definitely go off to read more of what Pearl has to say. Maybe he can convince me that a sufficiently complex Bayesian inference engine understands "why" something happens, instead of merely "what" happens in what order. Or maybe he'll convince me that I'm trying to make a distinction without a difference. But for now my working hypothesis is that these things are functionally and qualitatively different, and that this difference is of fundamental importance.
posted by mondo dentro at 9:57 AM on May 26 [7 favorites]


The broad structure of the claim that confuses me is the assumption that (scientific) truths or causality more specifically, are formal truths. That makes no sense to me. Is there no question of map-territory distinction? I wouldn't be surprised that incorporating various advanced logical theories into computational systems would be incredibly useful. But that's a concrete claim for/by a human, as opposed to a philosophical claim that "AI can be taught causal reasoning" which to me is not an obviously intelligible statement.
posted by polymodus at 10:38 AM on May 26 [3 favorites]


~ Envisions AIs tirelessly emulating toddlers in "why?" mode...why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why? why?...
posted by Thorzdad at 7:17 AM on May 26


Honey! We've solved the Halting Problem!
posted by symbioid at 11:16 AM on May 26 [2 favorites]


In observational studies, we don't know the assignment mechanism so we assume there are unobserved confounders, which in Pearl's framework means we can't get very far.

The first part of this is true, but the conclusion drawn is false. Pearl's IC* algorithm, described in Section 2.6 of his 2000 book, Causality, doesn't assume causal sufficiency (i.e. that there are no unobserved common causes / confounders). Spirtes, Glymour, and Scheines describe similar algorithms for discovering causal structure without assuming causal sufficiency in Chapter 6 of their 2000 book, Causation, Prediction, and Search. In the nearly 20 years since those two books were published, many, many more techniques have been developed for doing causal search in the graphical causal modeling framework. The new techniques typically weaken one or more of the standard assumptions, including the assumption of causal sufficiency. It's true that there is a limit to what we can know on the basis of impoverished data. But that's always the case, whether working in the graphical causal modeling framework or not.
posted by Jonathan Livengood at 12:56 PM on May 26 [4 favorites]


You clearly know this stuff very well so thanks for chiming in!!

I guess I should clarify and say that there are two problems that limit Pearl’s influence: we don’t know the assignment mechanism and we are skeptical of making conditional independence assumptions between observed variables. Those two things combined make Pearl less than useful. Does that sound better?
posted by MisantropicPainforest at 1:18 PM on May 26 [1 favorite]


sammyo: I know of that curve you speak of from a 3Blue1Brown video on Deep Learning. You might find a hint/note/etc. there to lead you back to that Google thing.

I take all this with a grain of salt. I was reading AI papers back in the late 80's and quickly determined that my Amiga wasn't up to the task of doing anything useful. Now, I see Moore's Law kicking in and things working... but nothing actually new and innovative. All the AI bits I read about seem to be almost exactly the same stuff I read decades ago with the only difference being that people now have blazingly fast and large computing power. Same old shit, just faster.
posted by zengargoyle at 2:53 PM on May 26


Yeah, if you're unwilling to trust conditional dependence and independence claims -- or think they don't hold for the relevant populations -- then the algorithms won't tell you anything. In this connection, are you worried specifically about independence claims? This sounds like a very Andrew Gelman-esque position. Something like, "The null hypothesis is always false; we know that going in!" I wonder how widespread this view is.
posted by Jonathan Livengood at 3:56 PM on May 26 [2 favorites]


Who gives a shit what Musk says about AI? He's no more qualified than the average layman.
posted by kzin602 at 8:02 PM on May 26


to make a small point that, in the end really isn't that small: causality isn't about "why", it's about "what".

And "what" is about distinctions. In order to say that a thing even exists, we need criteria to distinguish it from things it is not. Fundamental to any kind of intelligence, it seems to me, is a general ability to form those criteria with very little help.
posted by flabdablet at 9:33 PM on May 26


This is from Judea Pearl's 2018 paper. It explains what he thinks is the fundamental issue with models and representations, of which causality is one class or category. I love reading this big concept stuff.

If we examine the information that drives machine learning today, we find that it is almost entirely statistical. In other words, learning machines improve their performance by optimizing parameters over a stream of sensory inputs received from the environment. It is a slow process, analogous in many respects to the natural selection process that drives Darwinian evolution. It explains how species like eagles and snakes have developed superb vision systems over millions of years. It cannot explain however the super-evolutionary process that enabled humans to build eyeglasses and telescopes over barely one thousand years. What humans possessed that other species lacked was a mental representation, a blue-print of their environment which they could manipulate at will to imagine alternative hypothetical environments for planning and learning.
posted by polymodus at 11:35 PM on May 26 [2 favorites]


What humans possessed that other species lacked was a mental representation, a blue-print of their environment which they could manipulate at will to imagine alternative hypothetical environments for planning and learning.

I'd argue that such a representation would just about have to be present inside any creature capable of employing tools or even learning to run mazes.

Seems to me that our own One Weird Trick is being able to label the representations of patterns we recognize, and to fold the resulting internal collection of labels back in as an extension of the territory our brains are mapping for us.
posted by flabdablet at 12:04 AM on May 27 [1 favorite]


Not my area but I find it interesting and this from the same article is pretty funny:

To appreciate the extent of this denial, readers would be stunned to know that only a few decades ago scientists were unable to write down a mathematical equation for the obvious fact that “mud does not cause rain.” Even today, only the top echelon of the scientific community can write such an equation and formally distinguish “mud causes rain” from “rain causes mud.” And you would probably be even more surprised to discover that your favorite college professor is not among them.
posted by polymodus at 12:39 AM on May 27 [1 favorite]


the obvious fact that “mud does not cause rain.”

Having spent a bit of time in a rainforest or two, this "fact" is entirely non-obvious to me.
posted by flabdablet at 1:35 AM on May 27




as far as I can tell though the formal, mathematical way of writing down "rain causes mud" is just "rain → mud". I get Pearl's frustration with people who think or act like they think that knowing joint distributions is sufficient for total knowledge of reality, but at the same time we're not so hopeless about reasoning through cause and effect as he seems to imply.
posted by vogon_poet at 8:13 AM on May 27


What does the arrow mean?
posted by MisantropicPainforest at 8:37 AM on May 27


Let Rain and Mud be members of a collection V of variables. Then "Rain → Mud" says that there exist values v* for the variables in V \ {Rain, Mud} and there exists a pair of values r1 and r2 (called a test pair) for the variable Rain such that for some value m of Mud
Pr(Mud = m | do(Rain = r1), do(V \ {Rain, Mud} = v*))
is not equal to
Pr(Mud = m | do(Rain = r2), do(V \ {Rain, Mud} = v*))
The ordinary language gloss on this is that Rain is a direct cause of Mud if -- holding everything else fixed -- when we wiggle Rain, Mud wiggles along.

That's the metaphysical picture. The discovery algorithms provide the epistemology.
posted by Jonathan Livengood at 9:24 AM on May 27 [3 favorites]
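
[Editor's sketch] The test-pair definition in the comment above can be checked mechanically on a toy model. The Python below is only an illustration under loud assumptions: the three binary variables, the Sprinkler variable standing in for "everything else", and every probability are made up, and each variable is described by a hypothetical mechanism that returns Pr(variable = 1) once all other variables are held fixed by do().

```python
from itertools import product

# Hypothetical toy model; names and numbers are invented for illustration.
VARIABLES = ["Rain", "Sprinkler", "Mud"]

# Each mechanism maps a full setting of the *other* variables to
# Pr(variable = 1). Rain and Sprinkler ignore everything else here.
MECHANISMS = {
    "Rain": lambda a: 0.3,
    "Sprinkler": lambda a: 0.2,
    "Mud": lambda a: 0.9 if (a["Rain"] or a["Sprinkler"]) else 0.05,
}

def pr_under_do(target, setting):
    """Pr(target = 1 | do(every other variable = its value in `setting`))."""
    return MECHANISMS[target](setting)

def is_direct_cause(cause, effect):
    """True if some test pair (0, 1) for `cause` changes Pr(effect = 1)
    while all remaining variables are held fixed at some values v*."""
    others = [v for v in VARIABLES if v not in (cause, effect)]
    for values in product((0, 1), repeat=len(others)):
        v_star = dict(zip(others, values))
        p_r1 = pr_under_do(effect, {**v_star, cause: 0})
        p_r2 = pr_under_do(effect, {**v_star, cause: 1})
        if p_r1 != p_r2:
            return True  # wiggling `cause` wiggled `effect`
    return False

print("Rain -> Mud:", is_direct_cause("Rain", "Mud"))
print("Mud -> Rain:", is_direct_cause("Mud", "Rain"))
```

In this toy model, holding Sprinkler off and wiggling Rain moves the probability of Mud, while no setting of Mud moves the probability of Rain. That asymmetry is what the arrow records. Note that the sketch reads the mechanisms directly; recovering them from data is the job of the discovery algorithms.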


And now every mefite can one-up their favorite college professor.
posted by polymodus at 11:24 AM on May 27


All the AI bits I read about seem to be almost exactly the same stuff I read decades ago with the only difference being that people now have blazingly fast and large computing power. Same old shit, just faster.

You couldn't be more wrong about this. The algorithmic advances in the last ten years alone have been astonishing.

Sure, the computing resources, especially the rise of GPUs, played a major role. The two feed off each other.
Otherwise you may as well say astronomy is unchanged since the eighties, we just have bigger telescopes.
posted by tirutiru at 3:47 PM on May 27 [2 favorites]


Bayes' Arrows - "Do causality and probability come apart necessarily, or can they be unified?" :P
CG: Phooey! Try to plan getting out of a room by computing the probability that you try to turn the doorknob conditional on the doorknob turning…versus…computing the probability that the knob will turn given that you try to turn the knob. The conditional probabilities are different. Causality makes the difference, and is why when planning to get out of a room, we use the second, and not the first, conditional probability. For planning actions and policy interventions, probability is useless without causality. Once upon a time yellowed fingers were highly correlated with lung cancer later in life. The surgeon general recommended against smoking; he did not recommend that people wear gloves to prevent yellowed fingers.
posted by kliuless at 4:49 AM on June 26 [1 favorite]
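
[Editor's sketch] The yellowed-fingers example in the quote above can be worked through exactly. The Python below assumes a hypothetical three-variable model in which smoking causes both yellowed fingers and cancer, and fingers have no effect on cancer; all probabilities are made up. It contrasts conditioning on finger colour (observing) with setting it by intervention (wearing gloves), written here as `do_yellow`.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_SMOKE = F(3, 10)
P_YELLOW_GIVEN_SMOKE = {1: F(8, 10), 0: F(5, 100)}
P_CANCER_GIVEN_SMOKE = {1: F(2, 10), 0: F(2, 100)}

def bernoulli(p, value):
    """Probability that a binary variable with Pr(1) = p takes `value`."""
    return p if value == 1 else 1 - p

def joint(smoke, yellow, cancer, do_yellow=None):
    """Joint probability; with do_yellow set, the Smoking -> Yellow
    mechanism is cut and Yellow is forced to that value."""
    p = bernoulli(P_SMOKE, smoke) * bernoulli(P_CANCER_GIVEN_SMOKE[smoke], cancer)
    if do_yellow is None:
        p *= bernoulli(P_YELLOW_GIVEN_SMOKE[smoke], yellow)
    else:
        p *= 1 if yellow == do_yellow else 0
    return p

def pr_cancer(given_yellow=None, do_yellow=None):
    """Pr(Cancer = 1), optionally conditioning on or intervening on Yellow."""
    num = den = F(0)
    for s, y, c in product((0, 1), repeat=3):
        if given_yellow is not None and y != given_yellow:
            continue
        p = joint(s, y, c, do_yellow)
        den += p
        if c == 1:
            num += p
    return num / den

print("Pr(cancer | yellow fingers)    =", float(pr_cancer(given_yellow=1)))
print("Pr(cancer | clean fingers)     =", float(pr_cancer(given_yellow=0)))
print("Pr(cancer | do(clean fingers)) =", float(pr_cancer(do_yellow=0)))
```

Seeing yellow fingers raises the probability of cancer because it is evidence of smoking, but forcing fingers clean leaves the cancer rate at its baseline in this model, which is why gloves are not a policy.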




This thread has been archived and is closed to new comments