Discover surprising correlations
July 8, 2011 6:05 PM

Have you ever tried to raise seamonkeys? 54 percent of atheists think people on a date should split the costs, compared with 29 percent of people in general. In general, 62 percent of people like spicy food. But among those who think flag burning should be illegal, 78 percent like spicy food. 61 percent of people who filter their tap water prefer credit cards over debit cards, compared with 43 percent of people in general.
posted by Brian B. (78 comments total) 36 users marked this as a favorite
 
This is some verrrry clever data mining
posted by gonna get a dog at 6:08 PM on July 8, 2011


This is the future. Actuaries will be working for all the major corporations, and will slowly build predictive models for everyone.

"Because he uses Crest toothpaste, is a college graduate, and has never bought a car less than 3 years after paying off the previous one, our model predicts that the subject will be more likely to choose Super Ranch Doritos than Nacho Cheese. Send a coupon via facebook."
posted by chimaera at 6:10 PM on July 8, 2011 [24 favorites]


This is brilliant. I now have answers to "cite plz!" for all sorts of outlandish claims. TROLL ALL THE THINGS!
posted by Dark Messiah at 6:10 PM on July 8, 2011 [37 favorites]


I think of one of my favorite xkcds ever:
I used to think correlation implied causation. Then I took a stats class. Now I don't.

Sounds like the class helped.

Well, maybe.
This is interesting regardless. Especially: 82 percent of people who call cola "pop" rather than "soda" are fluent in only one language, compared with 61 percent of people in general.
posted by King Bee at 6:18 PM on July 8, 2011 [6 favorites]


37 percent of people who fill out online surveys prefer dog racing to the word "cumberbund," while among the general population...aw, crap.
posted by Edgewise at 6:18 PM on July 8, 2011 [1 favorite]


Never heard of this journal before.
posted by Foci for Analysis at 6:19 PM on July 8, 2011 [7 favorites]


!00% of people who favorite this comment are beautiful and charming people who go on to win the lottery.
posted by The Whelk at 6:22 PM on July 8, 2011 [39 favorites]


100% of people who try to capitalize numbers are prone to making typos.
posted by phunniemee at 6:24 PM on July 8, 2011 [94 favorites]


I wanted to be a part of that !00%!!
posted by King Bee at 6:25 PM on July 8, 2011 [7 favorites]


100% are American?

Also, is it just me or are there no links to sources?

Or is that the joke?
posted by ODiV at 6:26 PM on July 8, 2011 [1 favorite]


Each day, a new survey is posted — something like "Do you like math?" or "Have you ever gone on a blind date?"

At the end of the day, the results of the survey are compared with the results of all previous surveys, and the two outcomes with the strongest link are highlighted.


...

Suppose that 40 percent of users respond "Yes" to the survey question "Have you ever done a keg stand?"

We then go through all of the previous survey questions and see how that subgroup, the people who have done a keg stand, answered.

Let's say that 65 percent of the subgroup responded "Yes" to the survey question "Do you subscribe to any magazines?" compared with 70 percent of people in general. That's a difference of 5 percentage points.

And let's say that 40 percent of the subgroup responded "Yes" to the survey question "Do you have curly hair?" compared with 20 percent of people in general. That's a difference of 20 percentage points.

We can therefore infer that having curly hair is more closely correlated with having done a keg stand than subscribing to a magazine is.


So the source is the site.
posted by Phyltre at 6:29 PM on July 8, 2011
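
A minimal sketch of the comparison the site describes in the quoted passage, assuming the "strongest link" is simply the largest absolute gap in percentage points between a subgroup and people in general. The function and variable names are hypothetical; the figures are the ones from the quoted keg-stand example.

```python
from typing import Dict, Tuple


def strongest_link(subgroup_yes: Dict[str, float],
                   overall_yes: Dict[str, float]) -> Tuple[str, float]:
    """Return the question whose subgroup 'Yes' rate differs most from the overall rate."""
    # Gap in percentage points for every earlier survey question.
    gaps = {q: abs(subgroup_yes[q] - overall_yes[q]) for q in subgroup_yes}
    best = max(gaps, key=gaps.get)
    return best, gaps[best]


# Percent answering "Yes", taken from the quoted example.
keg_stand_group = {"subscribes to magazines": 65, "has curly hair": 40}
everyone = {"subscribes to magazines": 70, "has curly hair": 20}

question, gap = strongest_link(keg_stand_group, everyone)
print(f"{question}: {gap} percentage points")
```

Note that this only ranks gaps; it says nothing about whether a gap of that size could arise by chance.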


0% of odivs read the entire site before posting a comment.
posted by ODiV at 6:30 PM on July 8, 2011 [9 favorites]


phunniemee... I try to capitalize numbers all the time! Y'know, when they're important or need emphasis.
posted by nile_red at 6:35 PM on July 8, 2011 [1 favorite]


This is kind of fun, but (as with all surveys) it is not obvious how the responses here generalize to the population at large. These surveys are answered by English-speaking, web-savvy people, who comprise only a (small?) subset of the world as a whole.

I was thinking that it'd be nice to see p-values on some of these statistics, but with n > 100, I imagine that these correlations are statistically significant.
posted by tickingclock at 6:36 PM on July 8, 2011


I thought pop used for soda was a regional thing. Cola is of course a type of soda, as is baking soda, but I use baking soda to get rid of tummy ache brought on by soda, the pop kind.
posted by Postroad at 6:39 PM on July 8, 2011 [2 favorites]


73.4% of all statistics are completely made up.
posted by briank at 6:41 PM on July 8, 2011


I thought pop used for soda was a regional thing. Cola is of course a type of soda, as is baking soda, but I use baking soda to get rid of tummy ache brought on by soda, the pop kind.

I default to "coke", mostly for esoteric reasons. Frankly, I could care less what you mix my rum with.
posted by Dark Messiah at 6:47 PM on July 8, 2011


Interestingly, causation.org appears to be unclaimed at the moment. Someone should buy it and point it at correlation.org. That'll show 'em.

I thought pop used for soda was a regional thing. Cola is of course a type of soda, as is baking soda, but I use baking soda to get rid of tummy ache brought on by soda, the pop kind.

Your dad gives you tummy aches?
posted by ChurchHatesTucker at 6:50 PM on July 8, 2011 [15 favorites]


This is about as significant as Wii online surveys.
posted by anigbrowl at 6:50 PM on July 8, 2011 [1 favorite]


It only matters because the stakes are so low.
posted by Ideefixe at 6:56 PM on July 8, 2011 [2 favorites]


98% of atheists take the stairs; up from 32% two weeks ago.
posted by schmod at 7:05 PM on July 8, 2011 [9 favorites]


You're saying you have proof atheists do Ascend? That's gonna piss off a lot of people.
posted by Greg_Ace at 7:07 PM on July 8, 2011 [5 favorites]


I thought pop used for soda was a regional thing.

It is a regional thing. And in my experience the term "pop" seems to trend away from urban areas. "Soda" is almost always used by people on coastlines near urban areas. I personally find that urban areas have a higher population of people with passports, particularly near coast lines.

Just sayin'.
posted by Johnny Hazard at 7:08 PM on July 8, 2011 [3 favorites]


If I invite you, I pay. If you invite me, you pay. If we've mutually agreed to meet somewhere, we split the check.

It's pretty simple, really.
posted by BitterOldPunk at 7:08 PM on July 8, 2011 [2 favorites]


This is fantastic. Now, when I am describing to my students what it means for a correlation to be large but unimportant, I can point them to the correlation between being double-jointed and having testified in court.
posted by arcticwoman at 7:12 PM on July 8, 2011 [2 favorites]


90% of everything is bullshit

obligatory
posted by philip-random at 7:23 PM on July 8, 2011


I have a lawnmower. Do you like grapes? This follows the logic pattern of: If A=B, and B=C, does X=Y? Of course!!
posted by rhythim at 7:31 PM on July 8, 2011 [1 favorite]


Cabbages.
posted by flabdablet at 7:32 PM on July 8, 2011 [2 favorites]


This is wonderful. I'm keeping this in my back pocket for the next time I need to explain statistics to someone.
posted by pemberkins at 7:46 PM on July 8, 2011


33 percent of people who are double-jointed have testified in court, compared with 14 percent of people in general.


Those twisty bastards!
posted by ian1977 at 7:56 PM on July 8, 2011 [7 favorites]


74 percent of people who bite their nails understand HTML, compared with 58 percent of people in general.

Their general sample set is obviously anything but "average". It makes me question the usefulness of the other "people in general" stats.
posted by CaseyB at 7:59 PM on July 8, 2011 [2 favorites]


74 percent of people who bite their nails understand HTML, compared with 58 percent of people in general.

Geeks tend to be more nervous -- and therefore more nail-biting -- and also more likely to understand HTML, whatever that means.

In general, 24 percent of people belong to a credit union. But among people who refuse to tip a server if they receive poor service, only 8 percent belong to a credit union.

Credit-uniony-people are more likely to have sympathy for their fellow human beings.

72 percent of people who prefer the aisle seat on a plane have gotten a speeding ticket, compared with 55 percent of people in general.

People who get speeding tickets drive fast; they're trying to get where they're going faster. They also like being able to get off the plane faster.

Only 35 percent of people with a landline phone have worn braces, compared with 48 percent of people in general.

People with landline phones are older than average; braces have gotten more popular in recent times.

In general, 71 percent of people describe themselves as bad dancers. But among people who have never taught a class, 87 percent say they're bad dancers.

Teaching is sort of performing. So is dancing.

83 percent of people who have snuck into a movie without paying know how to tie a necktie, compared with 67 percent of people in general.

This is clearly backwards. The sort of people who sneak into movies without paying are bums, and when was the last time you saw a bum wearing a tie?
posted by madcaptenor at 8:00 PM on July 8, 2011 [1 favorite]


This seems interesting on the face of it, until you stop to think that they might be choosing for weird statistical correlations that happen by chance.

If you poll 500 people on a wide body of knowledge, there are going to be freak correlations with no meaningful basis. The more things you ask them about, and the fewer people in your dataset, the more likely those correlations become.

It's like, if you keep asking a bunch of people questions like, what's your favorite color, was your first pet a cat or a dog, do you prefer paper or plastic, etc., if you ask enough questions you'll eventually find an apparently-meaningful statistical connection between two of your questions. And just like the chances of two people in a group having the same birthday is wildly lower than intuition tells us, these correlations, too, could seem nonsensical.

To be certain of relevant correlation, you should run an independent poll that tests the data.
posted by JHarris at 8:03 PM on July 8, 2011 [4 favorites]
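
A rough simulation of the point about freak correlations, assuming every answer is an independent coin flip so that no real link exists between any two questions. The function name, the 50/50 answer rate, and the sample sizes are all illustrative assumptions, not anything taken from the site.

```python
import random


def max_chance_gap(n_people: int, n_questions: int, seed: int = 0) -> float:
    """Largest subgroup-vs-overall gap (percentage points) among purely random answers."""
    rng = random.Random(seed)
    # answers[q][i] is person i's yes/no answer to question q, chosen at random.
    answers = [[rng.random() < 0.5 for _ in range(n_people)]
               for _ in range(n_questions)]
    target = answers[0]  # "today's" question defines the subgroup
    subgroup = [i for i in range(n_people) if target[i]]
    if not subgroup:
        return 0.0
    best = 0.0
    for other in answers[1:]:
        overall = 100 * sum(other) / n_people
        within = 100 * sum(other[i] for i in subgroup) / len(subgroup)
        best = max(best, abs(within - overall))
    return best


for n_questions in (10, 50, 200):
    gap = max_chance_gap(n_people=500, n_questions=n_questions)
    print(f"{n_questions} questions: biggest chance gap = {gap:.1f} points")
```

Because the same seed is reused, the larger runs contain the smaller runs' questions, so the biggest gap can only grow as more questions are added.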


This is clearly backwards. The sort of people who sneak into movies without paying are bums, and when was the last time you saw a bum wearing a tie?

Well, to be fair it said know how to tie a tie. Not to be wearing one.
posted by ian1977 at 8:04 PM on July 8, 2011 [1 favorite]


until you stop to think that they might be choosing for weird statistical correlations that happen by chance.

They freely admit that they're doing that.

Also, countries with higher fluoridation rates have higher rates of death from cancer, and children with bigger feet spell better.
posted by madcaptenor at 8:07 PM on July 8, 2011


what kind of crazy person prefers lemons over limes? limes are tiny green miracles.
posted by neuromodulator at 8:08 PM on July 8, 2011 [16 favorites]


is askmefi a lime?
posted by madcaptenor at 8:11 PM on July 8, 2011 [2 favorites]


You can use statistics to prove anything. 38% of people know that.
posted by Gilbert at 8:15 PM on July 8, 2011 [1 favorite]


Wait, this isn't another OK Cupid post?
posted by R. Mutt at 8:21 PM on July 8, 2011


R. Mutt, that's actually what I thought it was going to be.
posted by madcaptenor at 8:23 PM on July 8, 2011


JHarris : This seems interesting on the face of it, until you stop to think that they might be choosing for weird statistical correlations that happen by chance.

Absolutely - But many of them make you at least stop and wonder if it really happens by chance, or if perhaps they have some common cause. For example, spicy food and flag burning... The further South you go, the spicier the food, and the more "patriotic" you find the locals. These may have no connection whatsoever beyond geography, but that doesn't mean they don't say something about each other.


And just like the chances of two people in a group having the same birthday is wildly lower than intuition tells us, these correlations, too, could seem nonsensical.

I'll pretend you said "higher" there. :) But that happens because you don't have independent selections in the birthday situation... So we have to ask, do flag burning and spiciness really count as totally independent?


Something most people miss about the classic "correlation does not equal causation" argument - You very rarely need to rigorously, mathematically prove causation to make use of the fact. If you notice that a particular blackjack dealer blinks more when he has a 20, it doesn't matter whether that happens because he has a nervous twitch, or because the face cards have a glossier surface and he has a light behind him at just the right angle - You gonna get rich at that table. Unless he moves to another table or changes decks, then it might matter - Or it might not.
posted by pla at 8:26 PM on July 8, 2011


Capitalizing numbers is the 3! problem I have using the shift key.
posted by maryr at 8:28 PM on July 8, 2011 [8 favorites]


And if the spicy food/flag burning correlation went the other way, you could tell a just so story about how that would be so: spicy foods tend to be "ethnic", "ethnic" foods and "ethnic"-food-likers are concentrated in urban areas, and people in urban areas are less patriotic and therefore less likely to oppose flag burning. (I was toying with this as an explanation until I actually compared the numbers.)
posted by madcaptenor at 8:29 PM on July 8, 2011 [1 favorite]


maryr, you have ^ problems?
posted by madcaptenor at 8:29 PM on July 8, 2011


No, I have 6 problems, obviously.
posted by maryr at 8:31 PM on July 8, 2011 [1 favorite]


If I invite you, I pay. If you invite me, you pay. If we've mutually agreed to meet somewhere, we split the check.

I would think there's always a mutual agreement to meet somewhere; one person didn't drag the other person there.
posted by John Cohen at 8:33 PM on July 8, 2011 [2 favorites]


Aw, people can come up with statistics to prove anything, Kent. Forfty percent of all people know that.
posted by Earthtopus at 8:41 PM on July 8, 2011 [1 favorite]


"100% of people enjoy answering questions." I wonder how they came up with that one.
posted by maryr at 8:43 PM on July 8, 2011


Coefficients or it didn't happen.
posted by aaronetc at 8:59 PM on July 8, 2011 [3 favorites]


Multiple hypothesis correction or it didn't happen.
posted by en forme de poire at 9:02 PM on July 8, 2011 [1 favorite]


JHarris "This seems interesting on the face of it, until you stop to think that they might be choosing for weird statistical correlations that happen by chance."

Sorry for linking in a second xkcd here, but I can't help but feel that it's appropriate. Particularly given how this site creates the correlations... I can't think of a better explanation for why this site says nothing, than this comic.
posted by Arandia at 9:28 PM on July 8, 2011 [2 favorites]


"what kind of crazy person prefers lemons over limes? limes are tiny green miracles."

The fruit that most of us know as a lime "is usually sold quite green, although it yellows as it reaches full ripeness."

All my life I've been told this is because lemons and limes would be too easily confused at the cash register.

I do agree that they are miraculous. And delicious.
posted by bilabial at 9:28 PM on July 8, 2011


"... 54 percent of atheists think people on a date should split the costs, compared with 29 percent of people in general. In general, 62 percent of people like spicy food. But among those who think flag burning should be illegal, 78 percent like spicy food. 61 percent of people who filter their tap water prefer credit cards over debit cards, compared with 43 percent of people in general."

So, what?
posted by paulsc at 10:07 PM on July 8, 2011


In general, 31 percent of people can ski. But among people who are more scared of spiders than snakes, only 16 percent can ski.
I'd like to see a snow snake, it would add extra excitement as I'm throwing myself down the mountain.
posted by arcticseal at 10:07 PM on July 8, 2011 [1 favorite]


People who write about uneventful baseball games do the same thing.
posted by madcaptenor at 10:33 PM on July 8, 2011


33 percent of people who say they're absent-minded have worn braces, compared with 47 percent of people in general.

Eh, they've probably worn braces as much as everyone else, they just forgot.
posted by I've a Horse Outside at 11:18 PM on July 8, 2011 [2 favorites]


72 percent of people who prefer the aisle seat on a plane have gotten a speeding ticket, compared with 55 percent of people in general.

See, I read this as being age correlated. The older you are the more likely you've had a speeding ticket (longer driving record), and the more likely you are to be incontinent...
posted by kaibutsu at 2:19 AM on July 9, 2011 [1 favorite]


what kind of crazy person prefers lemons over limes?

The kind of person who wants to maximize their resistance to scurvy?
So when the Admiralty began to replace lemon juice with an ineffective substitute in 1860, it took a long time for anyone to notice. In that year, naval authorities switched procurement from Mediterranean lemons to West Indian limes. The motives for this were mainly colonial - it was better to buy from British plantations than to continue importing lemons from Europe. Confusion in naming didn't help matters. Both "lemon" and "lime" were in use as a collective term for citrus, and though European lemons and sour limes are quite different fruits, their Latin names (citrus medica, var. limonica and citrus medica, var. acida) suggested that they were as closely related as green and red apples. Moreover, as there was a widespread belief that the antiscorbutic properties of lemons were due to their acidity, it made sense that the more acidic Caribbean limes would be even better at fighting the disease.

In this, the Navy was deceived. Tests on animals would later show that fresh lime juice has a quarter of the scurvy-fighting power of fresh lemon juice.
posted by Kirth Gerson at 5:10 AM on July 9, 2011 [5 favorites]


I love this stuff.
posted by callmejay at 6:11 AM on July 9, 2011


These results probably are significant at p<.05, but the analyses fail to correct for the multiplicity of comparisons tested each day. If there are 100 previous surveys, they should only accept p<.05/100 as statistically significant. As currently presented it's all utterly meaningless Type 1 error.
posted by roofus at 6:20 AM on July 9, 2011 [3 favorites]
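
The arithmetic behind that correction can be sketched as follows, assuming 100 comparisons per day as in the comment and, for the second function, that the tests are independent and every null hypothesis is true. The function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance cutoff after a Bonferroni correction."""
    return alpha / n_tests


def chance_of_any_false_positive(alpha: float, n_tests: int) -> float:
    """P(at least one p < alpha) when all nulls are true and the tests are independent."""
    return 1 - (1 - alpha) ** n_tests


print(bonferroni_threshold(0.05, 100))
print(chance_of_any_false_positive(0.05, 100))
```

With uncorrected testing at 0.05 across 100 comparisons, at least one "significant" result is close to guaranteed under these assumptions, which is why the stricter cutoff matters.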


I prefer the aisle seat because it is easier to get to the bathroom from the aisle seat. I have never driven, hence no speeding tickets.
posted by Katjusa Roquette at 6:27 AM on July 9, 2011


"73.4% of all statistics are completely made up."

Cite?
posted by Eideteker at 7:04 AM on July 9, 2011


Sure!

Kane, B 2011, 73.4% of statistics are completely made up, throwaway comment, 9 July, Metafilter, viewed 10 July 2011, <http://www.metafilter.com/105357/Discover-surprising-correlations#3804881>.
posted by flabdablet at 10:33 AM on July 9, 2011 [2 favorites]


JHarris: And just like the chances of two people in a group having the same birthday is wildly [higher] than intuition tells us

pla: But that happens because you don't have independent selections in the birthday situation...

What do you mean, the birthday situation doesn't involve independent selections? The setup is exactly N independent selections from 365 possibilities.
posted by stebulus at 10:55 AM on July 9, 2011


The explanation I've always heard for why the birthday paradox is counterintuitive is that we're all self-centered fucks. So when we're asked how many people there need to be for two of them to have the same birthday, we instead think of how many people there need to be for one of them to have the same birthday as us.
posted by madcaptenor at 12:58 PM on July 9, 2011
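
To put numbers on the two questions being confused here, a small sketch assuming 365 equally likely birthdays and independent people (function names are hypothetical):

```python
def p_shared_birthday(n: int, days: int = 365) -> float:
    """P(at least two of n people share a birthday)."""
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct


def p_someone_matches_me(n_others: int, days: int = 365) -> float:
    """P(at least one of n_others people shares *my* birthday)."""
    return 1 - ((days - 1) / days) ** n_others


print(f"any two of 23 share a birthday: {p_shared_birthday(23):.4f}")
print(f"one of 22 others matches mine:  {p_someone_matches_me(22):.4f}")
print(f"one of 253 others matches mine: {p_someone_matches_me(253):.4f}")
```

Under these assumptions, 23 people are enough for better-than-even odds of some shared birthday, while it takes roughly 253 other people before someone is likely to share yours.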


Geeks tend to be more nervous -- and therefore more nail-biting

62.8318 percent of geeks consume over a half-gallon of coffee a day, compared to roughly 15 percent of people in general.
posted by Twang at 1:17 PM on July 9, 2011 [1 favorite]


If I invite you, I pay. If you invite me, you pay. If we've mutually agreed to meet somewhere, we split the check.

So how does one distinguish a suggestion to meet somewhere from an invitation? Serious question.
posted by Serene Empress Dork at 2:20 PM on July 9, 2011


Post to AskMe, then DTMFA. Probably safe to eat after that.
posted by maryr at 6:12 PM on July 9, 2011 [3 favorites]


correlation is not castration
posted by Eideteker at 11:00 PM on July 9, 2011


that makes this research paper SUPER AWKWARD
posted by The Whelk at 11:04 PM on July 9, 2011


love it.
posted by Dillonlikescookies at 3:40 AM on July 10, 2011


The site founder responded to critics yesterday.

Fun sucking aside, I would also add that this method would be ideal for medical research, via random emails, where it could be used to modify treatments or discover higher risks of disease. It could go a long way to lowering demand for healthcare.
posted by Brian B. at 8:36 AM on July 10, 2011


5% of people who know what a p value is think possibly spurious correlations are interesting, as opposed to 85% of the people who don't know what a p value is. Based on a sample of 100 people who have a clue and a literally infinite supply of those who don't.

I wonder what the correlation is between understanding stats and working for a newspaper?
posted by Mr. Gunn at 11:32 AM on July 10, 2011


His response to critics doesn't even mention the multiple testing problem, which is by far the most misleading omission of the site. At least give us a false discovery rate or something, dude.
posted by en forme de poire at 10:17 AM on July 11, 2011
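
For readers wondering what reporting a false discovery rate might involve, here is a minimal sketch of the Benjamini-Hochberg procedure, one standard way to control it. This is not something the site does, and the p-values below are made up purely for illustration.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> List[int]:
    """Return indices of the hypotheses rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value sits under its stepped threshold.
        if p_values[i] <= q * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical p-values for ten comparisons (illustrative only).
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example))
```

Five of these made-up p-values fall below 0.05, but only the two smallest survive the procedure.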


until you stop to think that they might be choosing for weird statistical correlations that happen by chance.

They freely admit that they're doing that.


Which makes this site the exact equivalent of saying, "The other night I was at a restaurant with two guys who had both tried to raise seamonkeys. Freaky, huh?"

Or, more precisely, "The other day we got 20,000 people to answer some questions and there were a whole bunch of them who both liked spicy food and thought flag burning should be illegal. Freaky, huh?"

Or this XKCD comic. For all we know, this list is entirely composed of the sort of spurious correlations you'll get 5% of the time when you report correlations where p<0.05.
posted by straight at 10:17 AM on July 12, 2011 [1 favorite]


Brian B : Fun sucking aside

Y'know, I spent a good two or three minutes trying to figure out what you meant by that spoonerism. Then I realized, you meant it as written, not humorously. :I


stebulus : What do you mean, the birthday situation doesn't involve independent selections? The setup is exactly N independent selections from 365 possibilities.

The selections don't occur independently because in a group of N people, you can only make floor(N/2) pairs without reusing people.
posted by pla at 5:43 PM on July 12, 2011


Y'know, I spent a good two or three minutes trying to figure out what you meant by that spoonerism. Then I realized, you meant it as written, not humorously. :I

I should have linked it. On his response to critics page, he summarized by directing people to this page.
posted by Brian B. at 7:47 PM on July 12, 2011


The selections don't occur independently because in a group of N people, you can only make floor(N/2) pairs without reusing people.

Oh, I see, you mean selecting the pairs of birthdays. Sure, the {N choose 2} pairs of birthdays are not independent variables.

But then I don't understand how this explains the unintuitively high probability of a match. If we instead produce {23 choose 2} = 253 pairs of days independently (and uniformly among the 365^2 pairs), the probability of a match is 1 - (364/365)^253, about 0.5005. The difference between this number and the one in the actual scenario, about 0.5073, hardly seems enough to satisfy intuition.

(I took ordered pairs to get that 365^2. If you choose unordered pairs independently and uniformly, it's easier to get matches, and the probability of a match in {23 choose 2} attempts is about 0.7519, which is in the wrong direction entirely.)
posted by stebulus at 11:40 PM on July 12, 2011
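
Written out, the two quantities compared in the comment above look like this, assuming 365 equally likely days and ordered pairs for the independent case (the subscripts are labels introduced here, not notation from the thread):

```latex
% 253 independently drawn ordered pairs, each matching with probability 365/365^2 = 1/365
P_{\text{indep}} = 1 - \left(\frac{364}{365}\right)^{\binom{23}{2}}
                 = 1 - \left(\frac{364}{365}\right)^{253} \approx 0.5005

% the actual birthday problem with 23 people
P_{\text{actual}} = 1 - \prod_{k=0}^{22} \frac{365 - k}{365} \approx 0.5073
```

The two values differ by less than one percentage point, which is the basis for the remark that dependence between pairs does little to explain the surprise.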




This thread has been archived and is closed to new comments