Comments on: A critical moment in statistics
http://www.metafilter.com/102421/A-critical-moment-in-statistics/
Comments on MetaFilter post "A critical moment in statistics"
Mon, 11 Apr 2011 13:01:29 -0800
A critical moment in statistics
http://www.metafilter.com/102421/A-critical-moment-in-statistics
<a href="http://en.wikipedia.org/wiki/Statistical_hypothesis_testing">Statistical hypothesis testing</a> with a <a href="http://en.wikipedia.org/wiki/P-value">p-value</a> of less than 0.05 is often used as a gold standard in science, and is required by peer reviewers and journals when stating results. Some statisticians argue that this indicates a <a href="http://www.johndcook.com/blog/2008/12/03/the-cult-of-significance-testing/">cult of significance testing</a> using a <a href="http://en.wikipedia.org/wiki/Frequentist_inference">frequentist</a> statistical framework that is <a href="http://news.sciencemag.org/sciencenow/2009/10/30-01.html">counterintuitive and misunderstood</a> by many scientists. Biostatisticians have <a href="http://www.annals.org/content/130/12/995.abstract">argued</a> that the (over)use of p-values comes from "the mistaken idea that a single number can capture both the long-run outcomes of an experiment and the evidential meaning of a single result" and identify <a href="http://www.johndcook.com/blog/2008/11/18/five-criticisms-of-significance-testing/">several other problems with significance testing</a>. <a href="http://www.xkcd.com/882/">XKCD demonstrates</a> how misunderstandings of the nature of the p-value, failure to adjust for <a href="http://en.wikipedia.org/wiki/Multiple_comparisons">multiple comparisons</a>, and the <a href="http://en.wikipedia.org/wiki/Publication_bias">file drawer problem</a> result in likely spurious conclusions being published in the scientific literature and then being distorted further in the popular press. <a href="http://www.jerrydallal.com/LHSP/multtest.htm">You can simulate a similar situation yourself.</a> John Ioannidis uses problems with significance testing and other statistical concerns to argue, controversially, that "<a href="http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124">most published research findings are false</a>."
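In the spirit of the simulation link above, here is a minimal Python sketch (my own, not from the post) of how testing many true-null hypotheses at p &lt; 0.05 reliably manufactures "significant" results:

```python
import math
import random

def two_sided_p(a, b):
    """Approximate two-sided p-value for a difference in means, using a
    normal approximation to Welch's t statistic (reasonable for n ~ 30)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return math.erfc(abs(z) / math.sqrt(2))  # = 2 * P(Z > |z|)

def count_false_positives(n_tests=1000, n=30, alpha=0.05, seed=42):
    """Run n_tests comparisons where the null is true by construction
    (both groups drawn from the same distribution) and count how many
    nonetheless reach 'significance' at the given alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        if two_sided_p(a, b) < alpha:
            hits += 1
    return hits

# With 1000 tests of a true null at alpha = 0.05, roughly 5% come out
# "significant" -- every one of them spurious.
```

Run twenty comparisons like the jelly-bean comic does and a hit is close to a coin flip; nothing here depends on fraud, only on volume.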
Will the use of <a href="http://www.annals.org/content/130/12/1005.abstract">Bayes factors</a> replace classical hypothesis testing and p-values? Will something else?
posted by grouse at Mon, 11 Apr 2011 12:56:19 -0800
tags: significance, pvalue, bayes, statistics, stats, biostatistics, biostats, xkcd, publicationbias, multipletesting, research, science
By: strixus
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631240
As a statistics nerd in a field outside the general sciences, this is an AWESOME post.
<small><em>... but minecraft!</em></small>
posted by strixus at Mon, 11 Apr 2011 13:01:29 -0800
By: gurple
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631241
I enjoyed that XKCD a lot. I enjoy most XKCDs that aren't fixated on the author's achingly awkward stance toward women.
In my line of work we often tend to calculate <a href="http://www.ncbi.nlm.nih.gov/pubmed/12883005">q-values</a> as a correction for multiple hypothesis testing. They're kind of fun, but they're not perfect and certainly not a universal solution.
posted by gurple at Mon, 11 Apr 2011 13:01:42 -0800
By: atrazine
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631243
I can't be the only one who wants to see a duel fought over Bayesian vs. Frequentist interpretations. Are you listening, Heidelberg?
posted by atrazine at Mon, 11 Apr 2011 13:02:49 -0800
By: theodolite
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631259
Half the problem (at least on the science-journalism side of things) is that the statistical meaning of "significant" has no resemblance to the English word "significant," which unfortunately has the same spelling and pronunciation.
posted by theodolite at Mon, 11 Apr 2011 13:07:55 -0800
By: knile
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631266
See also <a href="http://www.indiana.edu/~kruschke/AnOpenLetter.htm">John Kruschke's Open Letter</a> and <a href="http://www.indiana.edu/~kruschke/publications.html">sundry other publications</a>, including <a href="http://www.indiana.edu/~kruschke/DoingBayesianDataAnalysis/">a textbook combining the MetaFilter favorites of "R" and "puppies"</a>.
posted by knile at Mon, 11 Apr 2011 13:10:07 -0800
By: 0xFCAF
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631281
Daryl Bem's use of p-values <a href="http://www.metafilter.com/99324/Should-have-seen-this-one-coming-too">to show 'proof' of retroactive human psychic powers</a> is another good example of misuse. There's a great <a href="http://dl.dropbox.com/u/1018886/Bem6.pdf">refutation (pdf)</a> of this showing how to use Bayesian models to correctly interpret the evidence.
My question for Bayesian testing in the sciences is: how do we assign prior probabilities? The author of the above refutation uses a value of <i>0.00000000000000000001</i> as his estimate of the chance that psychic powers exist, so that Bem's p < 0.05 result updates his posterior probability only to <i>0.00000000000000000019</i>. With the same logic, we could assign a prior probability of 0.0000000001 that the luminiferous aether doesn't exist, and say the Michelson-Morley experiment isn't convincing at all because it only updated our probability to 0.00001. Bem's definitely a quack, but pulling prior probabilities like that implies that it's almost literally impossible to convince you that your belief is wrong, which isn't a strong foundation for science.
posted by 0xFCAF at Mon, 11 Apr 2011 13:14:04 -0800
By: Blasdelb
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631285
<a href="http://forwearemany.wordpress.com/2009/12/28/mexican-lemons/">I'm just going to drop this in here</a>, my favorite graph in all of science.
posted by Blasdelb at Mon, 11 Apr 2011 13:16:25 -0800
By: jasper411
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631290
In his article "The Earth Is Round (p<.05)" Jacob Cohen (the seminal figure in power analysis) proposed renaming "null hypothesis significance testing" to "Statistical Hypothesis Inference Testing," so that it would have an appropriate acronym from his perspective.
posted by jasper411 at Mon, 11 Apr 2011 13:19:15 -0800
By: Nelson
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631302
This topic seems terribly important to me, particularly as we enter a new era of science that can easily comb through giant datasets for patterns. Back when I was a student experimental scientist the problem was always that statistics is hard, and boring, and you don't want to understand it. So you feed your data into the magic p-value machine and it comes out <0.05 and voila! you're done! Important scientific papers that rely on statistical analysis really should have a proper statistician as a co-author, not just someone who pasted some spreadsheet data into SPSS.
posted by Nelson at Mon, 11 Apr 2011 13:20:46 -0800
By: Mental Wimp
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631314
<em>With the same logic, we could assign a prior probability of 0.0000000001 that the luminiferous aether doesn't exist, and say the Michelson-Morley experiment isn't convincing at all because it only updated our probability to 0.00001.</em>
I'm not sure I follow this. There were <a href="http://en.wikipedia.org/wiki/Michelson%E2%80%93Morley_experiment">many replications</a> of the Michelson-Morley experiment. Are you suggesting that each experiment be considered on its own? Because Bayesians would cumulate them.
posted by Mental Wimp at Mon, 11 Apr 2011 13:25:05 -0800
By: Mental Wimp
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631321
<em>This topic seems terribly important to me, particularly as we enter a new era of science that can easily comb through giant datasets for patterns.</em>
There are many frequentist and non-frequentist solutions proposed for this. One is the "<a href="http://en.wikipedia.org/wiki/False_discovery_rate">false-discovery rate</a>", which controls the expected proportion of false positives among the findings you report, similar to the way a p-value controls type-I error.
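For concreteness, here is a short sketch (my own, not from the comment) of the Benjamini-Hochberg step-up procedure, the standard way of controlling the false discovery rate:

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure: return the (sorted) indices
    of hypotheses rejected while controlling the FDR at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * q ...
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # ... and reject every hypothesis ranked at or below it.
    return sorted(order[:k_max])

# Ten illustrative p-values: two strong signals among several near-misses.
ps = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
# Naive per-test p < 0.05 would "discover" five of these;
# BH at q = 0.05 keeps only the first two.
```

The per-test thresholds ramp up from q/m to q, so the procedure is more forgiving than Bonferroni while still bounding the expected fraction of false discoveries.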
But this is, in fact, a big problem with big science. The dynamic tension between brute force methods and targeted, prior-knowledge-based methods revolves around exactly this problem. I, personally, am more sympathetic to the targeted approach, as I believe it is more likely to extend our knowledge than combing through random findings that are most likely nothing.
posted by Mental Wimp at Mon, 11 Apr 2011 13:28:56 -0800
By: pucklermuskau
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631345
Bayesian reasoning is great if you have good priors, but if you don't I really don't see it as any better than existing frequentist methods. What it comes down to is this: statistics are just a good, consistent way of telling you what you know/have observed. It's never going to tell you anything more than what you have observed, and should not be regarded as anything more than a way of describing the data you have collected (which is still a powerful tool, but it's not a means of deriving 'truth', at all).
posted by pucklermuskau at Mon, 11 Apr 2011 13:38:13 -0800
By: blahblahblah
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631423
Right, but good methods help solve these problems, at least in social science.
First, we have hypothesis testing. If researchers are sufficiently open and clear about the nature of their hypotheses, and they are not on a data-mining expedition, then frequentist methods make a lot more sense. Even better if they can use experiments or quasi-experiments (exploiting a change in the law or some other external event) to make sure they know what they are measuring.
Second, any good (social) scientist is going to do robustness checks (resampling, using other methods or other measures) that should show if a result indicates something that is persistent, or just a statistical fluke.
Third, a researcher should have a sense of what their data means, which is why simply using large data sets results in little intuition. Explanations need to be tied to reality, not just statistics.
Finally, good research considers effect size, not just statistical significance. Small effects should not be taken as seriously as large ones.
Obviously, fakers gonna fake and data miners gonna mine, and some people never learn statistics, but it is possible to do really high quality work with p-values, if you want to do so.
posted by blahblahblah at Mon, 11 Apr 2011 14:07:28 -0800
By: Xoebe
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631460
Obligatory <a href="http://benfry.com/writing/archives/274">FSM link</a>.
<small>added bonus: values decrease in the X-axis</small>
posted by Xoebe at Mon, 11 Apr 2011 14:16:59 -0800
By: SNWidget
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631567
As someone whose background is in education and music and who is currently immersing himself in statistics for a PhD, I find this extremely relevant and interesting. In my field, there's rampant misuse of statistics, especially when it comes to data mining, leaving off effect sizes, and not correcting for multiple comparisons. My adviser is basically as conservative as it comes with number play, and we go through articles in our field over and over to find things that are wrong, fudged, or done poorly. It's rather eye-opening to see so much research that has so many false assumptions based on poor data manipulation.
Although my PhD work is in music education, my related field is research design and statistics. After reading all of the poor research, I felt like I could make a change within the field. We'll see if that comes to pass, because let's be honest, I'm just a first year PhD student.
Like I said - I'm a musician and a teacher who's getting into the research side. I don't have a dog in this fight yet, but I guess I'll just keep reading until I do.
posted by SNWidget at Mon, 11 Apr 2011 14:57:57 -0800
By: mixing
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631607
I've been using Bayesian methods for my own data analysis for about 10 years now, but most of my undergraduate teaching focuses on classical frequentist methods. When I first made the switch to Bayes, I thought to myself "this is revolutionary, it's going to change everything". And it might, I suppose. Bayesian methods do seem less prone to pathologies, on the whole. But to be honest, in my experience only a tiny fraction of the statistical mistakes that I find in papers (both undergrad and academic) is caused by the flaws in frequentist methods. The vast majority arise because the user doesn't understand the tool that they're applying: and so they're testing the wrong hypotheses and misinterpreting the results. A large-scale switch to Bayesian methods won't fix this: my suspicion is that, at present, we don't see as many egregious screw-ups with Bayesian statistics only because Bayesian scientists are a self-selected group of extremely statistically savvy users. If forced to become frequentists they wouldn't be making a lot of mistakes either. Maybe I'm just feeling especially old and cynical this morning, but I really think a lot of people are hoping Bayes will be a magic bullet. In the end, I reckon there's nothing for it except to have a lot more stats classes.
posted by mixing at Mon, 11 Apr 2011 15:17:48 -0800
By: Fraxas
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631650
Mixing, you mean there's nothing for it except to have a lot more <em>good</em> stats classes, right? I found my undergrad stats classes to be tedious, and it wasn't until later that I learned that the tedium wasn't the fault of the material, but the instruction.
posted by Fraxas at Mon, 11 Apr 2011 15:40:13 -0800
By: oneswellfoop
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631671
I thought the most important factor in research was the WP-factor, WP standing for Who's Paying (for it).
posted by oneswellfoop at Mon, 11 Apr 2011 15:48:49 -0800
By: effugas
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631725
Someone want to post a quick comparison between Bayesian methods and Frequentist methods?
posted by effugas at Mon, 11 Apr 2011 16:20:29 -0800
By: stratastar
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631741
R... and PUPPIES?! MINDSPLOSION!!!!!
posted by stratastar at Mon, 11 Apr 2011 16:27:54 -0800
By: Blazecock Pileon
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631745
The Bayesian approach calculates a posterior probability, or what statistician <a href="http://en.wikipedia.org/wiki/Ronald_Fisher">R. A. Fisher</a> called the "inverse probability", by multiplying a "normalized" likelihood with the prior probability of the null hypothesis, via Bayes' theorem:
P(θ|X) = P(X|θ) P(θ) / P(X)
where θ refers to the parameters (mean, variance, etc.) of the null hypothesis, given the empirical, or observed, data set ("X").
In English, what's the probability that what you're looking at has a certain characteristic of interest, given the data set?
As a basic example, Bayes' theorem allows us to state the probability of the fairness of a given coin pulled from a bag of coins, both before and after any tosses, if we make <i>reliable</i> assumptions about the fairness of the coins within the bag, prior to taking any out. That prior assumption is where P(θ) comes in.
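The coin-from-a-bag example can be made concrete in a few lines of Python (a sketch under assumed numbers: a bag that is 90% fair coins and 10% coins biased 3:1 toward heads; neither figure comes from the comment):

```python
from fractions import Fraction

def posterior_fair(prior_fair, p_heads_biased, heads, tails):
    """P(coin is fair | tosses), straight from Bayes' theorem:
    P(theta|X) = P(X|theta) P(theta) / P(X).
    Pass Fractions to keep the arithmetic exact."""
    like_fair = Fraction(1, 2) ** (heads + tails)
    like_biased = p_heads_biased ** heads * (1 - p_heads_biased) ** tails
    numerator = like_fair * prior_fair          # P(X|fair) P(fair)
    evidence = numerator + like_biased * (1 - prior_fair)  # P(X)
    return numerator / evidence

prior = Fraction(9, 10)   # assumed: 90% of coins in the bag are fair
bias = Fraction(3, 4)     # assumed: biased coins land heads 3/4 of the time
# Before any tosses, the posterior is just the prior: 9/10.
# After 8 heads in 10 tosses, belief in fairness drops to about 0.58.
```

Note how everything hinges on the prior P(θ): a different assumption about the bag turns the same ten tosses into a different posterior, which is exactly the point of contention discussed below.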
A frequent complaint about the Bayesian approach is that it makes subjective, sometimes contentious assumptions about the prior nature of the data being observed. The source of this philosophical contention in the use of Bayesian over classical frequentist testing stems from where (or, perhaps, with whom) these prior probability distributions originate.
So-called "non-informative" priors, such as the <a href="http://en.wikipedia.org/wiki/Jeffreys_prior">Jeffreys' prior</a>, express vague or general information about the parameter, to try to make as few "subjective" assumptions about the data as possible.
Informative priors use previous experience or information to set parameter values in the prior: e.g., a simple guess of the expected temperature at noon tomorrow could be calculated from today's temperature at noon, plus or minus normal, day-to-day variance in observed noon-time temperatures.
So-called "conjugate priors" are used when the prior distribution takes on the same form as the posterior distribution and when the mean and variance are usually dependent, and are therefore often used in analysis of empirical data, which is usually constrained by dependence.
The "empirical Bayes" approach was <a href="http://links.jstor.org/sici?sici=0373-1138(1963)31%3A2%3C195%3ATEBATT%3E2.0.CO%3B2-0">introduced by Herbert Robbins</a> as a way to infer how accident-prone someone is, given the observed fractions of accidents already suffered by the larger population. The objectivity of this testing stems from using observed, "empirical" data to generate informative priors used in Bayesian inference.
Many empirical Bayesian techniques have been applied in various areas within the field of <a href="http://en.wikipedia.org/wiki/Systems_biology">systems biology</a>, which are data-rich and analysis-poor. In particular, Brad Efron at Stanford is one of the big names in this field, and <a href="http://www-stat.stanford.edu/~ckirby/brad/papers/2005NEWModernScience.pdf">wrote a fun, straightforward paper</a> that bridges Bayesian and Frequentist modes of thinking. Other papers of his on the subject of empirical Bayesian testing <a href="http://www-stat.stanford.edu/~ckirby/brad/papers/">can be found here</a>.
posted by Blazecock Pileon at Mon, 11 Apr 2011 16:29:59 -0800
By: mixing
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631785
<a href="http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631650">></a> <i>Mixing, you mean there's nothing for it except to have a lot more good stats classes, right?</i>
Oh yes. A million times yes. But that's a rant for a different thread.
posted by mixing at Mon, 11 Apr 2011 16:47:41 -0800
By: klausman
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631797
Here's another <a href="http://jrh794.wordpress.com/2011/02/01/feedback-from-the-real-world-the-fate-of-the-p-value/">post</a> concerning this.
I think about this every time we begin "hypothesis testing" in an introductory stats course, and I try to mention to students how overly simplified the process appears. But then, I end up bashing so much math done in classes up to that point (no straight lines in nature, no easy integrals in practice, etc.). I often don't feel like I'm selling the right product. How do folks here think that an introductory statistics course should address significance tests? What should we be focusing on?
posted by klausman at Mon, 11 Apr 2011 16:53:13 -0800
By: TwelveTwo
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631807
<em>Will the use of Bayes factors replace classical hypothesis testing and p-values? Will something else?</em>
I don't know! Quit asking me!
posted by TwelveTwo at Mon, 11 Apr 2011 16:59:09 -0800
By: DataPacRat
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631809
<a href="http://yudkowsky.net/rational/bayes">An Intuitive Explanation of Bayes' Theorem</a> by Eliezer S. Yudkowsky.
<a href="http://commonsenseatheism.com/wp-content/uploads/2011/01/An-Intuitive-Explanation-of-Bayes-Theorem-1-4-2011.pdf">An Intuitive Explanation of Eliezer Yudkowsky's Intuitive Explanation of Bayes' Theorem</a> by Luke Muehlhauser.
posted by DataPacRat at Mon, 11 Apr 2011 17:00:15 -0800
By: Pinback
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631926
effugas<a href="http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631725">:</a> One of the best layman-ish explanations I've seen is the paper <a href="http://arcue.botany.unimelb.edu.au/files/bayesian/PriorInformation.pdf">"Profiting from prior information in Bayesian analyses of ecological data"</a> which, even if you ignore the (minimal) maths and have no understanding of ecology, does a fairly good job of explaining where & how Bayesian stats differs from more familiar frequentist methods.
posted by Pinback at Mon, 11 Apr 2011 18:07:11 -0800
By: solipsophistocracy
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631946
It is a heinous bummer that more people aren't up on their stats, but the fact of the matter is, it's about 1000 times easier to get a paper published with that magic <em>p</em> even if the effect is minuscule. Got a robust effect and a confidence interval you can live with? Good luck publishing it if your hypothesis test comes back at .051 or above.
"But wait," you (or Rosnow & Rosenthal, back in '89) say, "Surely, God loves the .06 nearly as much as the .05."
God? Probably yes. Reviewers who don't understand any statistical procedure more nuanced than a t-test? Unfortunately, probably not. (There are more of them than any of us would like to admit).
posted by solipsophistocracy at Mon, 11 Apr 2011 18:19:07 -0800
By: milestogo
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631967
<em>Mixing, you mean there's nothing for it except to have a lot more good stats classes, right?</em>
I think that at a certain point of complexity, the nuances of avoiding statistical pitfalls can just be too damn hard for every person involved in research to be expected to be able to steer clear of them.
The math is hard, and one or two classes taken during a graduate degree isn't typically enough to impart the careful thinking required to produce valid results. Certainly, better education is part of the answer, but (as mixing hints) perhaps the best solution is to employ expert statisticians to "certify" statistical results. I'm imagining something akin to 'professionals in the scientific method' - perhaps employed by journals, perhaps separate.
This might smack of elitism, but statistics doesn't seem to be like calculus, where you can study for a few terms and solve <em>all</em> of a broad, applicable class of problems. Each real-world situation requires lots of careful study, with a mind honed by thinking about these issues.
posted by milestogo at Mon, 11 Apr 2011 18:28:48 -0800
By: straight
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631968
Here's the link you wanted for <em><a href="http://www.phdcomics.com/comics/archive.php?comicid=1174">and then being distorted further in the popular press</a></em>
posted by straight at Mon, 11 Apr 2011 18:29:52 -0800
By: pla
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3631995
Please don't twist Munroe's sublime sense of humor into yet another anti-intellectual pile of dog turd.
Read the alt text on that comic: "So, uh, we did the green study again and got no link. It was probably a--" "RESEARCH CONFLICTED ON GREEN JELLY BEAN/ACNE LINK; MORE STUDY RECOMMENDED!"
Even <i>he</i> makes fun of how the likes of Jenny McCarthy abuse false positives. Most importantly, a study means nothing unless someone else can <b>reproduce it</b>. And not <i>just</i> one reproduction - if one person gets similar results but 100 refute it, you consider the findings refuted. Yes, it may well warrant further study if you think some ambiguity in the methodology led to different outcomes, but we don't just start dealing cards and write a paper when someone deals the ace of spades (p<.02).
posted by pla at Mon, 11 Apr 2011 18:42:04 -0800
By: milestogo
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632010
<em>If one person gets similar results but 100 refute it, you consider the findings refuted</em>
How many people do you think would spend the time (and how many organizations the money) to reproduce multi-year jellybean* studies? This isn't measuring the charge of the electron, where every interested scientist runs back to their lab to set up the experiment. A lot of the research being discussed isn't easily reproducible, for financial reasons or because it just might not be interesting enough for people to spend their time <em>confirming</em>.
posted by milestogo at Mon, 11 Apr 2011 18:47:24 -0800
By: milestogo
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632011
*lots of stand-ins for jellybeans here. Studies that aim to show causal links in long term causes and effects across broad populations.
posted by milestogo at Mon, 11 Apr 2011 18:49:24 -0800
By: pla
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632065
<b>milestogo</b> : <i>How many people do you think would spend the time ( and how many organizations the money) to reproduce multi-year jellybean* studies?</i>
You jest? You've just described the dream-career of many academics. 30 years of funding, with no publish-or-perish threat to their tenure, to study jelly beans? Sweet (no pun intended)!
Now, the fact that Ric Romero may somehow get a copy of the original write-up and send the public on a mad rampage against green jellybeans has little bearing on the scientific community. You can read about trivial-but-long-term things proven or disproven (sorry - "rejecting or accepting the null hypothesis") on a weekly basis in every major journal on the market.
Just because most people don't care about the mating habits of the three-toed sloth doesn't mean you don't, <i>somewhere</i>, have a dozen biologists trying to find a way to strap electrodes to the poor bastards' genitalia at any given time.
posted by pla at Mon, 11 Apr 2011 19:11:25 -0800
By: milestogo
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632104
<em>You can read about trivial-but-long-term things proven or disproven...on a weekly basis in every major journal on the market.</em>
In Ioannidis' paper <a href="http://jama.ama-assn.org/content/294/2/218.full.pdf">"Contradicted and Initially Stronger Effects in Highly Cited Clinical Research"</a>, he claims that "Of the 45 highly cited studies with efficiency claims...11 (24%) had remained largely unchallenged" (page 3 of the PDF, middle column). The table later in the article gives specifics. Sorry, I'm on my phone and can't be more clear, but those are the kinds of things I was thinking of when I wrote my previous comment.
Just because someone publishes a paper claiming something doesn't mean someone else will try to reproduce it, three-toed sloths notwithstanding (I agree they are awesome). There is always <em>some</em> least interesting study.
posted by milestogo at Mon, 11 Apr 2011 19:40:31 -0800
By: chortly
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632133
As we all heard last week (with the Tevatron particle), 5 sigmas is what it usually takes in physics. Is there any reason just setting the traditional p-value threshold to 5 sigmas wouldn't clear up the problem of false positives (at least for usual science, if not the genome-wide data-mining stuff) without need for any complicated Bayesianism? True, 99% of social science and psychology would go poof, but that seems like a small price to pay for confidence. Or is it? What's so bad about 99% of published stuff being likely wrong, as long as it gets righter over time? (For non-medical, non-life-threatening research, that is.)
posted by chortly at Mon, 11 Apr 2011 20:02:02 -0800
By: storybored
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632198
<em>Will the use of Bayes factors replace classical hypothesis testing and p-values?</em>
I'm 95% sure it won't.
posted by storybored at Mon, 11 Apr 2011 21:10:45 -0800
By: sebastienbailard
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632260
<em>Is there any reason just setting the traditional p-value to 5 sigmas wouldn't clear up the problem of false positives?</em>
It would mean testing a prospective medication on 10,000 human subjects. And then having to pass the costs onto the patient ...
posted by sebastienbailard at Mon, 11 Apr 2011 22:15:50 -0800
By: pla
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632400
<b>chortly</b> : <i>True, 99% of social science and psychology would go poof, but that seems like a small price to pay for confidence.</i>
I think most of us will agree that the "soft" sciences have drastically less rigor than particle physics. You can smash two particles together under the same conditions and get the same outcome; you can't stick two people in a room together and predict, with confidence, how they will behave toward each other based on what color they wore that day.
Unfortunately, the soft sciences also have the annoying potential to answer some of our deepest questions about our existence. Saying that group X tends to stay married 4.7 years longer than group Y, and do so because of increased "happiness", means a lot to the average Joe. The mass of the Higgs Boson, not so much.
posted by pla at Tue, 12 Apr 2011 03:41:27 -0800
By: ROU_Xenophobe
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632521
<i>True, 99% of social science and psychology would go poof</i>
No reason it should. Getting t/z statistics beyond 5 is hardly rare, even with small datasets. I just got a t statistic of almost 20 with an N of 25. It was for something more or less blindingly obvious that for inexplicable reasons hadn't seen a formal test yet, but still.
<i>I think most of us will agree that the "soft" sciences have drastically less rigor than particle physics. You can smash two particles together under the same conditions and get the same outcome; You can't stick two people in a room together and predict, with confidence, how they will behave toward each other based on what color they wore that day.</i>
Even if the first were true, the second part of your statement wouldn't be a good example of that. Where social science is nonrigorous is where it has slapdash, crappy theories of what's going on. The fact that social science doesn't deal with mere lumps of dumb matter smacking into each other or just sitting there gravitating doesn't make it nonrigorous, just very difficult and unlikely to lead to precise predictions.
posted by ROU_Xenophobe at Tue, 12 Apr 2011 06:46:50 -0800
By: AndrewKemendo
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632526
All that for an xkcd post?
posted by AndrewKemendo on Tue, 12 Apr 2011 06:49:55 -0800

By: roystgnr
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632659
<blockquote>With the same logic, we could assign a prior probability of 0.0000000001 that the luminiferous aether doesn't exist, and say the Michelson-Morley experiment isn't convincing at all because it only updated our probability to 0.00001.</blockquote>
This corresponds to a likelihood ratio p(MM|~LE)/p(MM|LE) of (1e5-1e-5)/(1-1e-5).
Working through the arithmetic, the first time that experiment was independently replicated it would update our probability again, not to 0.00002, but to approximately .5. The next such replication would update us to approximately .99999, and with just four completely independent experiments we get to .9999999999.
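The arithmetic above can be sketched in a few lines: start from the deliberately awful prior, apply the same likelihood ratio once per independent replication, and watch the posterior recover (the exact likelihood ratio of 1e5 is an approximation taken from the comment above):

```python
def posterior_after(prior, likelihood_ratio, n):
    """Posterior probability after n independent replications,
    each contributing the same likelihood ratio for the hypothesis."""
    odds = prior / (1 - prior)        # probability -> odds
    odds *= likelihood_ratio ** n     # each replication multiplies the odds
    return odds / (1 + odds)          # odds -> probability

prior = 1e-10   # awful prior that the aether doesn't exist
lr = 1e5        # approximate likelihood ratio per Michelson-Morley-style result
for n in range(1, 5):
    print(n, posterior_after(prior, lr, n))
# 1 -> ~0.00001, 2 -> ~0.5, 3 -> ~0.99999, 4 -> ~0.9999999999
```

Because each independent replication multiplies the odds by the same factor, four experiments move the posterior from one-in-ten-billion to near certainty.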
So yeah, lousy priors can be a problem, but Bayesian updating accumulates evidence linearly on the log-odds scale, so (as long as you never have a prior of 0 or 1) you can still get to the truth surprisingly fast.
posted by roystgnr on Tue, 12 Apr 2011 08:44:55 -0800

By: winna
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3632732
<em>Important scientific papers that rely on statistical analysis really should have a proper statistician as a co-author, not just someone who pasted some spreadsheet data into SPSS.</em>
I a little bit hate SPSS. I'm in a marketing research class right now and we're using SPSS. Every time I ask for an explanation of the math behind what we're doing, I'm told 'Don't worry about it, just follow the steps.'
So I'm having to do all this extra research on the side to understand what we're doing, and it's not even my primary focus. But I hate <em>hate</em> when people tell me just to follow the steps.
From the reactions of my classmates whenever I ask for additional detail, I'm the only one who cares about it, though. Good thing most of them aren't going into research.
posted by winna on Tue, 12 Apr 2011 09:32:45 -0800

By: solipsophistocracy
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3633156
<em>I think most of us will agree that the "soft" sciences have drastically less rigor than particle physics.</em>
This may be true of particle physics, but in psychology (which, I feel obligated to point out, is a discipline under whose broad umbrella research ranges from molecular to cultural levels of analysis), the 'softer' end of the spectrum often has far more rigorous statistical requirements for publication.
For example, I read a great paper in which the researchers took single cell recordings from humans while they were playing a game (that's pretty much like Crazy Taxi) in order to study spatial navigation. The stats were pretty minimal. They reported means and standard deviations for each of the conditions, and did a half-assed chi-square. You tend to be able to get away with stuff like this when you can make an argument like "we COUNTED the damn spikes, ok? You can SEE it."
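The "we counted the spikes" style of analysis is about as minimal as statistics gets. A sketch of such a chi-square, with hypothetical counts rather than anything from the paper described above:

```python
# Hypothetical spike counts in two conditions (not from the actual paper):
# a one-way chi-square against a uniform expectation.
observed = [120, 80]
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)  # 8.0, above the 3.84 critical value for 1 df at alpha = 0.05
```

When the dependent variable is a raw count you can see in the recording, this is often all reviewers ask for, which is the commenter's point.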
To get a paper into JPSP (Journal of Personality and Social Psychology, probably the preeminent one in these fields), it is almost a prerequisite these days that your manuscript include a mediation analysis.
The less abstract the variables you're trying to measure, the more you can get away with minimal statistical testing. I'm not saying I think that this is the way things should be, but from my perspective, it's the way things are.
posted by solipsophistocracy on Tue, 12 Apr 2011 13:02:49 -0800

By: chortly
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3634008
I might (belatedly; sorry) add that I was speaking a bit facetiously in my earlier post. I guess my point was just that if you're super-worried about false positives, you arguably don't need Bayes, just a higher sigma level. But in social science and medicine, false negatives are a serious problem too: you don't want to miss out on an important social or medical intervention that might save lives because you are excessively worried about the danger of statistical missteps. Bayes doesn't really solve this; you just have to accept that you'll find a certain number of false positives but that the cost -- in the form of unnecessary medicines and social policies and misled future researchers -- is less than the benefit of saving lives that a more cautious approach might have missed by years or decades. I suppose in the end it matters what exactly the false positive rate in publication is: if it's 99.99%, that's a problem, but if it's merely 90%, that might be worth it for the true positives (and their social benefits) that are being turned up.
posted by chortly on Tue, 12 Apr 2011 19:15:49 -0800

By: mrzer0
http://www.metafilter.com/102421/A-critical-moment-in-statistics#3639621
Did you know there's a direct correlation between the decline of spirograph use and the rise of gang activity? Think about it!
posted by mrzer0 on Fri, 15 Apr 2011 12:02:26 -0800