Bayes on the Brain
February 22, 2021 2:42 AM

Some things in life are certain. For example, we all learn to identify colours at a young age, and can thereby say the same word when we point at a green object. Except that's not true - at all. To confirm that some thing is in the world, we can rarely (if ever) perceive it directly. Instead we gather indirect evidence, and hope to identify the presence of a tiger by the rustling of the grass. The mathematical theory of gathering indirect evidence about unknown things is known as Bayesian inference (Veritasium - main link?).

Bayesian inference is not just a theory of reasoning under uncertainty, but is arguably the theory. One argument for this is Cox's theorem, which proves that Bayesian probability is the only measure of uncertainty that consistently extends classical logic. Another is the Dutch Book Argument, which shows that if you (as in you, personally) do not follow the rules of Bayesian inference, then we (or perhaps I) could design a betting game that you would think is fair, and that we would inevitably win.

Bayesian inference is both a set of statistical techniques for weighing different scientific hypotheses, and a model of cognition in general. Many modern theories of neural computation have done away with a view of the brain as a computer, and instead view the brain as a system which seeks equilibrium between the evidence it gathers and its model of the world. Combined with a well-tuned body, we may find the equilibrium we seek at minimal metabolic cost (though please don't just hide in a dark room).

Do you find Bayes' rule confusing? Don't worry, most people do. And sometimes Bayesian thinking cannot solve all our problems. Still, with patience and practice, even babies can understand it.
posted by Alex404 (24 comments total) 36 users marked this as a favorite
 
There is a certain brand of thinkers who have been in the news recently, called "rationalists", who are somewhat obsessed with the idea of "Bayesian thinking". They are typically not familiar with actual Bayesian statistics, because they are trying to apply it in their day-to-day lives without a formal appreciation of what the words "likelihood function" mean.

There was, at a certain point, an amount of conflict between so-called "Bayesian" and "Frequentist" approaches to probability. In practice, most statistics that gets done in the wild is frequentist, partially because frequentist approaches are more embedded and also easier to do. When doing real work, it simply wasn't practical to be Bayesian until computers became powerful enough, because the mathematics was intractable enough that in most scenarios you would need to use estimation techniques which required a great deal of CPU effort. I remember lecturers talking about running code overnight to complete a Bayesian computation. The situation has improved, although this can still be the case with more complex models.

There are certainly situations where a Bayesian analysis is appropriate, but frequently in real settings the challenges which beset a statistician are to do with incorrect assumptions on the model, or biased data collection, none of which are obviously solvable by a Bayesian approach. To an extent, the onset of machine learning from computer science and the contrast with "traditional" statistics has rather overshadowed the conflict between Bayesians and Frequentists; the latter has always been a bit overblown.

The thing that always gets me with so-called Bayesian thinkers is that they will claim to "update their priors", which is such an obviously incoherent thing to say. You update your posterior distribution when your prior information meets real data. But note that all of this approach makes a lot of assumptions about the models you are using.

In your day-to-day life, there are some excellent places to apply Bayes' theorem: correctly calculating the probability of a test working, for instance, or of someone being guilty given a particular piece of evidence. In those the maths is pretty transparent, and Bayes' theorem is simply the correct tool for the job. The problem comes when the situation becomes more complex and someone basically defends their intuitions by using a mathematical theorem they do not actually understand.
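As a rough illustration of the test case, here is a minimal Python sketch of Bayes' theorem applied to a positive test result. The prevalence, sensitivity, and false-positive rate are made-up numbers chosen only for illustration, and `posterior_given_positive` is a hypothetical helper name, not part of any library.

```python
from fractions import Fraction

def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = false_positive_rate * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Assumed, illustrative inputs (not taken from the discussion):
prevalence = Fraction(1, 100)           # 1% of people have the condition
sensitivity = Fraction(9, 10)           # P(positive | condition)
false_positive_rate = Fraction(9, 100)  # P(positive | no condition)

posterior = posterior_given_positive(prevalence, sensitivity, false_positive_rate)
print(posterior)  # 10/109, roughly 9.2%
```

With these assumed numbers, a positive result leaves the probability of having the condition under 10%, because the low prevalence outweighs the fairly accurate test.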
posted by Cannon Fodder at 3:35 AM on February 22 [19 favorites]


Cannon Fodder, you may have a point there about "updating your priors" being a bad way to phrase it, though I wonder whether the intent of "updating priors" is that your posterior prediction becomes your prior for the next round.
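The reading suggested here, where the posterior from one round becomes the prior for the next, can be sketched with a Beta-Binomial model. The model choice, the counts, and the helper names are all assumptions for illustration; the only fact relied on is that a Beta(a, b) prior combined with binomial data gives a Beta(a + successes, b + failures) posterior.

```python
from fractions import Fraction

def update_beta(prior, successes, failures):
    """Return the Beta posterior parameters after observing the given counts."""
    a, b = prior
    return (a + successes, b + failures)

def beta_mean(params):
    """Mean of a Beta(a, b) distribution."""
    a, b = params
    return Fraction(a, a + b)

prior = (1, 1)  # uniform prior (an assumption)

# Round one: the posterior from the first batch of data...
after_round_one = update_beta(prior, successes=8, failures=2)
# ...is used as the prior for the second batch.
after_round_two = update_beta(after_round_one, successes=5, failures=5)

# Updating in two rounds matches updating once on all the data.
all_at_once = update_beta(prior, successes=13, failures=7)
print(after_round_two, all_at_once, beta_mean(after_round_two))  # (14, 8) (14, 8) 7/11
```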

I like the idea of updating your ideas (frequently just a little) as new evidence comes in. So far as I know, it works well for becoming a better predictor of political events (Tetlock's work) but it's hard-- you need a lot of information plus not being attached to your earlier ideas.
posted by Nancy Lebovitz at 4:46 AM on February 22 [4 favorites]


"Updating your priors" is pretty standard talk in cognitive neuroscience; it's sometimes vague but it's also nothing profound. It's typically about time scales, i.e. the short time scale of an immediate task (finding food in a maze) and the longer time scale of learning (getting better at finding the food). The former is Bayesian inference, and the latter is updating your priors.

This isn't a defence of "rationalists" though. I don't think I know anyone who would call themselves a rationalist, but it reminds me of when I studied philosophy as an undergrad, and I'd occasionally meet people who'd claim things like "philosophy and logic have taught me how to win all arguments." In my head I'd always think, "I see that you and I have different definitions of winning."

Edit: Nancy Lebovitz's definition is also valid (according to me anyway).
posted by Alex404 at 5:05 AM on February 22 [2 favorites]


There was, at a certain point, an amount of conflict between so-called "Bayesian" and "Frequentist" approaches to probability. In practice, most statistics that gets done in the wild is frequentist, partially because frequentist approaches are more embedded and also easier to do. When doing real work, it simply wasn't practical to be Bayesian until computers became powerful enough, because the mathematics was intractable enough that in most scenarios you would need to use estimation techniques which required a great deal of CPU effort. I remember lecturers talking about running code overnight to complete a Bayesian computation. The situation has improved, although this can still be the case with more complex models.

This is an important point. The distinction was always a fuzzy one and often driven by practicality. In the absence of workable Bayesian methods to do useful work, what else was there to do but to adopt a "wrong" frequentist approach? The development of efficient Monte-Carlo algorithms like NUTS to solve other-than-toy problems has made much more difference in how widely they are applied on practical problems than philosophy of statistics arguments about which approach is better.

I use the "update priors" expression all the time and I think it's fine. I've only ever read it to mean "use the posterior distribution from this problem to update my priors for another, related problem". Just today I used it that way when referring to how well vaccines are likely to prevent hospitalisation from Covid-19. There are results this morning showing that the Pfizer and AZ vaccines have more than 80% efficacy in reducing hospitalisations in the over-80s. That certainly updates what I think the prior is for how well other vaccines are likely to do that.

It is one of those ideas though that get strangely latched onto by self styled iconoclasts. I wonder if that has also led to the overblown view of conflict between "frequentists" and "Bayesians"? You can't be an iconoclast for adopting an idea that's uncontroversial which is why these people are obsessed with the idea that Bayesian statistics is controversial.
posted by atrazine at 5:25 AM on February 22 [7 favorites]



It is one of those ideas though that get strangely latched onto by self styled iconoclasts


i always imagined a particular subset of these iconoclasts thought something like this:

"YOUR PREJUDICES....errr.....WORLDVIEW.....errrr ASSUMPTIONS...errr PRIORS ARE WRONG AND ILL-THOUGHT OUT, IF THOUGHT OF AT ALL, NOT LIKE MIIIIIIINEEE reeeeeeeeeeeeeeee"
posted by lalochezia at 5:39 AM on February 22 [1 favorite]


if you (as in you, personally) do not follow the rules of Bayesian inference, then we (or perhaps I) could design a betting game that you would think is fair, and that we would inevitably win.

Quite possibly the most succinct account of capitalism I've ever seen.
posted by flabdablet at 6:52 AM on February 22 [8 favorites]


I think where 'rationalists' get into trouble is that they've convinced themselves that they are intellectually required to come up with a reasonable prior probability for any absurd situation anybody proposes. But I think it's perfectly rational to say "this scenario is so absurd that I have no way of meaningfully producing any kind of prior probability for it". You can meta-rationally admit that certain situations are outside of your reasoning system's domain.
posted by Pyry at 7:12 AM on February 22 [3 favorites]


Cannon Fodder's excellent comment reminds me of my favorite erratum ever:

As a result of an error at the Publisher, the term "frequentist" was erroneously changed to "most frequent" throughout the article. IOP Publishing sincerely regrets this error.

Anyway, as a mathematician and probabilist (of sorts), I too have very little patience for the frequent and broad misapplication of Bayesian methods, whether it's in pseudo-'rational' lifestyle blathering, or in current research literature. These mistakes are often tantamount to the error described above, but usually slip through due to lack of authors/readers/reviewers actually understanding the concepts.

Put another way: among ontologies of probability, I prefer frequentist the frequentest.
posted by SaltySalticid at 7:28 AM on February 22 [9 favorites]


Previously.

According to some applications of Bayes' theorem, God exists, but Jesus doesn’t.
posted by No Robots at 8:01 AM on February 22




To me the key concept of Bayesian reasoning as illustrated in machine learning and parts of the human mind is that the probabilistic reasoning and updating of information are part of one iterative process. Updating priors also makes sense because it's about updating prior assumptions used for future decisions, not the ones for decisions that already got made. I am not a big math person so cannot speak to technical details of the probabilities involved.

If "Rationalists" have poisoned the use of "Bayesian" to mean applying a simultaneous probabilistic reasoning and online updating algorithm, is there a better, more generic term to use? Finding easy-to-understand terms in this area has been very hard, and the concept is important.
posted by JZig at 8:17 AM on February 22


I too have very little patience for the frequent and broad misapplication of Bayesian methods... in pseudo-'rational' lifestyle blathering.

On the other hand, what drives me crazy is people who attack folks, especially those in public life, who change their minds as data (or its analysis) improves, calling them "flip-floppers" and scathingly claiming that they lack principles. It's as though the targets had changed their minds for reasons of political expediency. The flip-flopper epithet is a small but dangerous weapon, part of anti-intellectualism and anti-science views, that gains its potency from thinking people's fears about being charged with hypocrisy.

Our society might be better off if the populace absorbed a layman's understanding of Bayesian thinking, even though the inevitable errors and simplifications would make statisticians and mathematicians cringe. For example, during COVID we've mostly benefitted from people gradually absorbing a few bedrock principles of epidemiology; yes, some folks are still screaming about Fauci "flip-flopping" about wearing masks, but for the most part people get it. Moreover, they're open to arguments that invoke epidemiology, vaccine science and, yes, statistics related to transmission rates, etc.

As a society, we need to laud the ability to change one's mind based on new evidence, or a better interpretation of it. Every now and then some statistics puzzle gains currency (e.g., the "Monty Hall problem") and there's a brief moment where progress is possible. If we strove for basic literacy in data analysis, maybe people who clocked "flip-flopper" as an insult would slowly become understood as rejecting the ability to reason.
posted by carmicha at 8:18 AM on February 22 [9 favorites]


Quite possibly the most succinct account of capitalism I've ever seen.

I think the Dutch Book Argument should always be formulated as a threat. Unfortunately, Bayesian probability has not made me rich :(
posted by Alex404 at 8:23 AM on February 22 [1 favorite]


On the other hand, what drives me crazy is people who attack folks, especially those in public life, who change their minds as data (or its analysis) improves, calling them "flip-floppers" and scathingly claiming that they lack principles.

It makes me furious when someone pulls up a 10-year-old Twitter post from someone who has clearly changed a lot since then and claims it is proof of "Hypocrisy" with their current posts. The human brain changes all the time (via a process much like Bayesian reasoning) and it's a good thing! The Internet has destroyed the concept of "the past", so it seems many people conceive of others as some sort of logical construct instead of thinking of them as people who have real experiences and use those to change their minds.

Part of this is also why I hate the term "rationalist" in the first place, because in common usage it means "good at applying boolean logic and making perfect decisions" which is kind of the exact opposite of Bayesian reasoning.
posted by JZig at 8:46 AM on February 22


I thought for sure the "even babies can understand it" would be a link to some of the Griffiths work on how (purely IMO) some of the most interesting language acquisition models are Bayesian processes.

I am adding that book to the list for our two-year-old though-- that series is great, and it contains some of her go-tos for selecting a bedtime book.*

* Including Quantum Entanglement for Babies, which unlike many of the other books, requires introducing people (Alice and Bob) into the mix and ends with the unsatisfying "...and no one knows why!"
posted by supercres at 9:32 AM on February 22 [1 favorite]


I'd kind of forgotten the frequentist vs Bayesian sniping, which seemed so big at the time.

If we strove for basic literacy in data analysis, maybe people who clocked "flip-flopper" as an insult would slowly become understood as rejecting the ability to reason.

The problem is beyond statistics, let alone the Bayesian/frequentist distinction. Anyone who's ever watched a police procedural understands that you need to change your theories as you collect more evidence (usually around the third commercial break.)

Many flip flops in the wild are truly hypocritical (see Republicans on mail in voting, confirming Supreme Court justices right before the elections, and the deficits) and it's better to point them out and treat them as bad faith arguments.

Unfortunately, once an argument is useful its form gets co-opted even when it doesn't apply, out of convenience, laziness or just plain old human fallibility. I don't know how you fix that, but it's not just knowing more statistics or better logic.
posted by mark k at 9:48 AM on February 22 [2 favorites]


Not to abuse the edit window, but on the Bayesian vs. frequentist thing: I saw people who mistrusted Bayesians with their priors, but with so little awareness of the issues that they railed against someone applying Bayes' Theorem to a problem.

Bayes' Theorem is a rule about conditional probabilities that works perfectly well in a classical context and shows up in the first few chapters of any textbook. It basically tells you things like "If you do your heart disease survey from a randomly chosen set of hospital patients, your study is not going to work very well."
posted by mark k at 9:54 AM on February 22 [1 favorite]


This is very interesting. I'm still working my way through it. (My default, arrogant, asshole approach is to claim that if your Bayesian statistics disagree with your frequentist statistics in a meaningful way, there's a very good chance you've posed a bad question in at least one of those cases. Perhaps the way that questions can be bad is different in a way that's useful.)
posted by eotvos at 10:04 AM on February 22 [2 favorites]


The thing is, Bayesians don't have ownership on updating your ideas as new evidence comes in.

Imagine how silly I'd sound if I said I was 'using roofer techniques' in my daily life just because I own a ladder.

Changing conclusions or opinions based on new information is just critical thinking and evidence-based reasoning; nothing Bayesian about it other than the fact that Bayesian methods get a lot of mileage out of that process, just like roofers use ladders to great effect.
posted by SaltySalticid at 10:54 AM on February 22 [7 favorites]


Bayes's Theorem is often stated in the forbidding form P(H|O) = P(O|H)P(H) / [P(O|H)P(H) + P(O|~H)P(~H)], which lends it some of its reputation as "non-intuitive". However, there is a much more natural way to carry out the same calculation using odds.

Odds express the same information as probabilities, just as a ratio. For example, if an event has a 3/5 probability of happening, then it has a 2/5 probability of not happening, so the odds are 3:2 in favor.
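For reference, the odds form follows from the probability form above in one step. The two posteriors P(H|O) and P(~H|O) share the same denominator, so it cancels when their ratio is taken (here \lnot H stands for ~H):

```latex
\frac{P(H \mid O)}{P(\lnot H \mid O)}
  = \frac{P(O \mid H)\,P(H)}{P(O \mid \lnot H)\,P(\lnot H)}
  = \underbrace{\frac{P(H)}{P(\lnot H)}}_{\text{prior odds}}
    \times
    \underbrace{\frac{P(O \mid H)}{P(O \mid \lnot H)}}_{\text{likelihood ratio}}
```

So the posterior odds are the prior odds multiplied by the likelihood ratio.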

Now here's an example of Bayesian reasoning using odds. Imagine I'm anxiously awaiting an important piece of mail. At 3pm, I check my mailbox and there's nothing in it. This could mean one of two things: either the mailman hasn't been to my house yet, or they have been but there was no mail for me. To judge how likely each of these possibilities is, I'll use two historical observations.

On the one hand, let's say I've observed that the mailman comes before 3pm on 60% of days. On the basis of that information alone, the odds that they'd have been here by now are 3:2 in favor. This is my "prior" (sort of a baseline before I checked my mailbox).

On the other hand, the fact that I don't have mail in my mailbox is obviously pertinent. If they had been here, there's a good chance I'd have received mail, so the absence of mail should load the odds in favor of their not having come yet. Here's where I use my second observation, which is that I receive at least one piece of mail on 3 out of 4 days (that aren't postal holidays).* Thus, in a world where the mailman has already come, there's a 1/4 chance my mailbox is empty, but in a world where they haven't come, my mailbox is definitely empty. So the latter possibility now looks four times as likely as before, relative to the former. That is, my original 3:2 odds need to be weighted in the ratio 1:4. Conclusion: The odds that the mailman has been here are (3×1):(2×4) = 3:8, i.e. 3 chances they have to 8 that they haven't. (I still have hope!)
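The mailbox arithmetic can be checked with a short Python sketch. The numbers come from the example above; the function names are hypothetical helpers written for this illustration, and odds are represented as a reduced (for, against) pair.

```python
from fractions import Fraction

def odds_from_probability(p):
    """Convert a probability into reduced (for, against) odds."""
    ratio = Fraction(p) / (1 - Fraction(p))
    return (ratio.numerator, ratio.denominator)

def update_odds(prior_odds, likelihood_if_true, likelihood_if_false):
    """Weight prior odds by how likely the observation is under each hypothesis."""
    odds_for, odds_against = prior_odds
    posterior = (Fraction(odds_for) * Fraction(likelihood_if_true)) / (
        Fraction(odds_against) * Fraction(likelihood_if_false)
    )
    return (posterior.numerator, posterior.denominator)

def probability_from_odds(odds):
    """Convert (for, against) odds back into a probability."""
    odds_for, odds_against = odds
    return Fraction(odds_for, odds_for + odds_against)

# Prior: the mailman has come by 3pm on 60% of days, i.e. odds of 3:2.
prior = odds_from_probability(Fraction(3, 5))

# Observation: an empty mailbox.
# P(empty | mailman came) = 1/4, P(empty | mailman has not come) = 1.
posterior = update_odds(prior, Fraction(1, 4), 1)

print(posterior, probability_from_odds(posterior))  # (3, 8) 3/11
```

In probability terms, odds of 3:8 mean a 3/11 chance (about 27%) that the mailman has already been.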

An important point here is that the result depended on both observations -- the original 3:2 prior and the 1:4 likelihood ratio. Change either of those and you get a different conclusion. The Bayesian criticism of frequentist hypothesis testing is that it essentially uses only the second piece of information. Like, say I can't find my cat, and I suspect this is because she has been eaten by aliens from the planet Melmac. My evidence is that 96% of the time, I can find my cat. So if aliens hadn't eaten her, there would be only a 4% chance I couldn't find her. I reject that as improbable! Hence, aliens!

Whereas to a Bayesian, if the prior credibility of the alien hypothesis was 1:1,000,000, then the cat evidence just re-weights those odds in the ratio 25:1, which leaves them at 25:1,000,000 -- aliens are still not very likely.
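A minimal Python sketch of the cat example, contrasting the two lines of reasoning. The prior and the 4% figure are from the text above; the 5% rejection threshold is an assumption added here to stand in for "I reject that as improbable".

```python
from fractions import Fraction

# Prior odds of the alien hypothesis, 1:1,000,000, written as a single ratio.
prior_odds = Fraction(1, 1_000_000)

# How likely "I can't find my cat" is under each hypothesis.
p_missing_if_aliens = Fraction(1)          # eaten cats are never found
p_missing_if_no_aliens = Fraction(4, 100)  # otherwise missing 4% of the time

# Cartoon hypothesis test: reject "no aliens" if the evidence is improbable under it.
threshold = Fraction(5, 100)  # assumed cutoff, not from the original comment
rejects_no_aliens = p_missing_if_no_aliens < threshold

# Bayesian update: re-weight the prior odds by the likelihood ratio (25:1).
likelihood_ratio = p_missing_if_aliens / p_missing_if_no_aliens
posterior_odds = prior_odds * likelihood_ratio
posterior_probability = posterior_odds / (1 + posterior_odds)

print(rejects_no_aliens, likelihood_ratio, posterior_odds, posterior_probability)
# True 25 1/40000 1/40001
```

The posterior odds of 25:1,000,000 reduce to 1:40,000, so the same evidence that clears the cartoon test's cutoff still leaves the alien hypothesis at roughly 1 chance in 40,000.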

This is sort of a cartoon example -- frequentists are not so unsophisticated as that -- but it illustrates the problem with not having a prior. The problem with having one is, where do you get it? Priors are precisely what probability theory cannot tell you how to calculate. The 1:1,000,000 prior against the Melmac hypothesis is drawn from common sense. And that reservoir, while vital, is also contaminated by all our cognitive biases. My reading of Cox's Theorem is that math can't save us from that problem.

---
* My mail example has a little flaw: if we're waiting for a certain piece of mail, it's probably more likely than usual that we'll receive at least one piece of mail. Kindly ignore that...
posted by aws17576 at 11:05 AM on February 22 [7 favorites]


One further thought: in classical logic, you and I might agree on a statement of the form "If P, then Q", but not agree on whether P is true, and therefore come to different conclusions about whether Q is true.

Cox's Theorem basically says that allowing probabilistic judgment in lieu of binary true/false judgment doesn't make this problem go away.
posted by aws17576 at 11:15 AM on February 22 [5 favorites]


The following is not standard (at least, I don't think it's standard yet), but it absolutely should be, so I'm exhorting everyone here to join me in solving this coordination problem: Use the label "Bayes' Rule" to refer to a norm of rationality, which says that we should update our degrees of belief by conditionalization (set your new degree of belief that h, Cr_new(h), equal to your old conditional degree of belief in h given e, Cr_old(h | e), where e is your new evidence); and use the label "Bayes' Theorem" to refer to the trivial consequence of the axioms of probability and the definition of conditional probability that Pr(h | e) = Pr(e | h) Pr(h) / Pr(e). Do not use the label "Bayes' Rule" when you mean the theorem, and do not use the label "Bayes' Theorem" when you mean to be talking about norms of rational belief update.

Thank you for your attention.
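The distinction drawn above can be sketched in Python: the theorem is an identity that holds inside any single probability function, while the rule is a policy for moving from an old credence function to a new one. The joint credences below are invented for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

# Assumed old credences over the four combinations of (h, e); they sum to 1.
cr_old = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}

def pr_h(cr):
    """Marginal credence in h."""
    return sum(p for (h, _), p in cr.items() if h)

def pr_e(cr):
    """Marginal credence in e."""
    return sum(p for (_, e), p in cr.items() if e)

def pr_h_given_e(cr):
    """Conditional credence in h given e, by the definition of conditional probability."""
    return cr[(True, True)] / pr_e(cr)

def pr_e_given_h(cr):
    """Conditional credence in e given h, by the same definition."""
    return cr[(True, True)] / pr_h(cr)

# Bayes' Theorem: an identity within cr_old, true whatever the numbers are.
theorem_rhs = pr_e_given_h(cr_old) * pr_h(cr_old) / pr_e(cr_old)
theorem_holds = pr_h_given_e(cr_old) == theorem_rhs

# Bayes' Rule (conditionalization): on learning e, set Cr_new(h) = Cr_old(h | e).
cr_new_h = pr_h_given_e(cr_old)

print(theorem_holds, pr_h(cr_old), cr_new_h)  # True 2/5 3/5
```

The theorem check would pass for any coherent set of numbers; the move from a credence of 2/5 to 3/5 is the separate, normative step.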
posted by Jonathan Livengood at 2:46 PM on February 22 [4 favorites]


However, there is a much more natural way to carry out the same calculation using odds.

The medical test paradox: Can redesigning Bayes rule help?
posted by kliuless at 10:28 PM on February 24


I'm late to this discussion, but I thought it would be worth mentioning Andrew Gelman if you're serious about Bayesian analysis. Textbook (PDF, free for non-commercial use), software, blog.
posted by clawsoon at 5:13 PM on February 28

