July 13, 2005 5:56 PM

People often say 90% of statistics are made up on the spot. This probably isn't true, but according to this scientific paper about a third of scientific papers turn out to be wrong. Perhaps we shouldn't be so quick to take published research at face value. (research applies to medical research, not other fields of science, as far as I can tell)
posted by delmoi (33 comments total) 1 user marked this as a favorite
Rutgers University says this is, among young white Canadian men ages 18-34, considered to be approximately 80% (3% +/-) bull puckey.
posted by DeepFriedTwinkies at 6:07 PM on July 13, 2005

Well, is this really that strange? First of all, science is not so much error free as error correcting.

Second, the high-velocity, market-driven nature of contemporary research makes it all the more likely that shoddy work gets published at first. Researchers just don't have as much time to do experiments several times to chase down mistakes, double check assumptions, etc. It's go go go, beat the other guy to press so that our group can get more $$$. Very American. Unfortunately. But it all gets sorted out eventually. That's how science works.

So, no biggy, as far as I can tell. Not optimal, but then, what is?

And if you compare it to, say, religion...

posted by mondo dentro at 6:08 PM on July 13, 2005

It's only 78.4% of numbers that are pulled out of people's asses.
They're the only figures that would advance the theory.

So really, statistics scores a C+.
posted by Balisong at 6:08 PM on July 13, 2005

How do we know that THIS scientific study isn't one of the wrong ones?
posted by clevershark at 6:14 PM on July 13, 2005

research applies to medical research, not other fields of science

The take-home message, which is that it's really, really important to have independent replication of experimental studies, pretty much applies to all fields. This isn't really surprising, though. The reasons why any particular result may not be bulletproof vary by field, but they almost always exist. Scientific results are contradicted or overturned every day. Most scientists know this, although the occasional reminder is good. If only this wasn't so surprising to everyone else.

Science is not exact science.
posted by 김치 at 6:15 PM on July 13, 2005

It's research. It's trying to map the unknown. Results are bound to come out different until you are reasonably sure of your technique, your theory, or your results (at which point, researchers move on to something else).
posted by carter at 6:19 PM on July 13, 2005

problem is drugs get approved based on these studies. look what happened to vioxx and bextra. advil and aleve recently got beat up. black box suicide warning for prozac et al. use in children (and now they're considering the same black box warning for adults). baycol, a statin to lower cholesterol, was pulled from the market a few years ago and other statins are under similar scrutiny. several asthma drugs are currently being reviewed by the FDA as being dangerous. it's all about getting drugs approved faster to make more money.
posted by brandz at 6:34 PM on July 13, 2005

it's all about getting drugs approved faster to make more money.

I don't doubt that's true in part, but don't forget that the FDA has an incentive to get new drugs out (especially new categories of drugs) since they do save lives and all. A drug still in testing can't help anyone.
posted by thedevildancedlightly at 6:38 PM on July 13, 2005

Really now, I think we should be patting ourselves on the back for the finding that 70% of these papers are on the money, while only 30% are later tempered or reversed by further research.

I always found it funny when reading papers that the standard cutoff for significance is placed at p=.05, which is totally reasonable, but when you look at it, it guarantees a sizable number of studies that are simply wrong, or at least pure chance (~5% of the annual crop, theoretically). So taking into account bias, bad design, that significance threshold and all that, I have to say that having even half our papers come out right is quite a feat.
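That ~5% figure is easy to check with a quick simulation (a hypothetical sketch: every simulated "study" tests a true null hypothesis, so any "significant" result is pure chance):

```python
import random
import math

random.seed(42)

def fair_coin_pvalue(n=500):
    """Two-sided p-value for 'is this coin fair?' from n flips of a truly fair coin."""
    heads = sum(random.random() < 0.5 for _ in range(n))
    z = (heads - n * 0.5) / math.sqrt(n * 0.25)  # normal approximation to the binomial
    # two-sided tail probability from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

studies = 2000
false_positives = sum(fair_coin_pvalue() < 0.05 for _ in range(studies))
print(false_positives / studies)  # roughly 0.05: "significant" by chance alone
```

Even with nothing at all going on, about one study in twenty clears the p<.05 bar.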
posted by BlackLeotardFront at 6:48 PM on July 13, 2005

The post is deceptively worded. Take a look at the results from the JAMA paper that it cites:

49 papers were examined.
45 claimed that an intervention was effective.
7 (16%) were contradicted (not proven wrong) by subsequent studies.
7 (16%) had subsequent studies find weaker effects
20 (44%) were replicated
11 (24%) remained largely unchallenged
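Running the arithmetic on those counts (as quoted above) shows exactly where the "one third" came from:

```python
# Counts as quoted from the JAMA paper summary above
total_claiming_effect = 45
contradicted = 7
weaker_effect = 7
replicated = 20
unchallenged = 11

# Sanity check: the categories account for every paper claiming an effect
assert contradicted + weaker_effect + replicated + unchallenged == total_claiming_effect

challenged = contradicted + weaker_effect
print(round(100 * challenged / total_claiming_effect))    # 31: "a third" only if contradicted and weaker are lumped together
print(round(100 * contradicted / total_claiming_effect))  # 16: outright contradicted alone
```

Only by pooling "contradicted" with "weaker than first reported" do you get anywhere near a third.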

Remember: a subsequent study challenging the results of a previous one doesn't necessarily "prove it wrong." And a subsequent study that confirms an effect but finds it to be weaker most certainly doesn't prove it wrong.

What is interesting here is the way that the conclusions of the study have travelled:

1) Study itself: 16% of highly-cited research is challenged in subsequent research, with an additional 16% confirmed with modification.

2) CNN.com: "Third of study results don't hold up."

3) Mefi FPP: "third of scientific papers turn out to be wrong."

Perhaps we shouldn't be so quick to take FPPs at face value.
posted by googly at 6:52 PM on July 13, 2005

Using your own FPP to point out a (hypothetical) weakness in another FPP is just wrong. Why didn't you post this as a comment in the other thread?
posted by mlis at 6:53 PM on July 13, 2005

Oh, you just want to keep driving while talking on the phone, delmoi.
posted by interrobang at 6:59 PM on July 13, 2005

problem is drugs get approved based on these studies.

This is absolutely true. It's a serious ethical concern. Everyone on campuses everywhere should be seriously looking at the conflict of interest problem. Unfortunately, money talks. And it's not just drugs, but new technologies of other types as well (like genetically modified whatsits, nanothingies, etc.)

On the other hand, the science bashing is occurring in a political context that should make us cautious about the motives of all concerned. (I'm thinking here about the "it's just a theory" crowd of medievalists we keep having to fend off.)

There's a lot at stake.

posted by mondo dentro at 7:13 PM on July 13, 2005

It would be interesting to see this kind of analysis taken a lot deeper on a much larger scale (49 is a pretty small group). It would be interesting to know how these numbers panned out on a truly representative scale, and it would also be interesting to find out more about what exactly is the issue with specific examples of contradicted findings or those which appear to be reported too strongly in original papers. Error in science runs the gamut from poor research design to sincere error to misinterpretation of valid data to genuine scientific fraud. Unfortunately the information provided doesn't really justify much by way of conclusions about significance one way or another.
posted by nanojath at 8:05 PM on July 13, 2005

a lot of the problem does have to do with this overreliance on the p-value, which comes from graduate statistics courses that assume we're all physicists and chemists, scientists who can control their experiments rigorously.

in many sciences, it's not appropriate, ethical, or even possible to control the subject studied as required by inferential or classical stats. also, p-value stats were developed with the idea that your result was either A or B, but in many sciences good results read like 30% A, 60% B. There are multiple causes acting in concert, and we usually want to know how relatively important they are, given our data.

when testing among multiple competing hypotheses, a lot of the practical statistics we're taught is about patching up / correcting for assumptions (normality, constant variance) that will not hold true--but scientists need them to hold true in order to use the statistical formulas in stats software packages. in the end, the whole process can be very confusing, to the point where people are using statistics in ways they probably shouldn't, too often.

new methods, like bayesian inference, are being developed to solve a lot of these problems--the general philosophical differences among scientific disciplines, the problematic techniques used, and the hazy understanding, even within the scientific community, of the resulting analyses. Bayesian inference is in many ways more logically sound, but in turn more computationally intensive. Hopefully, the development of bayesian inference will lead to more lucid results, even if the methods demand high performance microcomputers.
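As a toy illustration of the Bayesian alternative (a hypothetical example with made-up numbers, not drawn from any of the studies discussed here): a conjugate Beta-Binomial update yields a full posterior distribution over the unknown rate, rather than a single accept/reject p-value.

```python
import math

# Prior Beta(1, 1): uniform belief over the unknown success rate
prior_a, prior_b = 1.0, 1.0

# Hypothetical data: 30 successes in 100 trials
successes, trials = 30, 100

# Conjugacy: the posterior is again a Beta distribution, updated in closed form
post_a = prior_a + successes
post_b = prior_b + (trials - successes)

posterior_mean = post_a / (post_a + post_b)
posterior_sd = math.sqrt(post_a * post_b / ((post_a + post_b) ** 2 * (post_a + post_b + 1)))

print(round(posterior_mean, 3))  # 0.304: point estimate, shrunk slightly toward the uniform prior
print(round(posterior_sd, 3))    # about 0.045: uncertainty reported directly, not as a p-value
```

The conjugate case is cheap to compute; the heavy computation eustatic mentions shows up when the model has no closed-form posterior and you need sampling methods instead.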
posted by eustatic at 8:15 PM on July 13, 2005

In related news, Richard Smith, former editor of the British Medical Journal, has written a highly critical article about the relationship between pharmaceutical companies and medical journals: Medical Journals Are an Extension of the Marketing Arm of Pharmaceutical Companies. He stops a little short of accusing the pharmaceutical companies of generalized scientific fraud. What he says is that the companies are able to obtain the results they want by asking the "right" questions and engineering the trials so the results will always be positive. Once the results are obtained, they are partitioned (by region, by age group, by ethnic group, etc.) and then transformed into as many articles for as many journals as possible, allowing the companies to create an impression of massive scientific approval.

For instance, Paula Rochon and others examined in 1994 all the trials funded by manufacturers of nonsteroidal anti-inflammatory drugs for arthritis that they could find [7]. They found 56 trials, and not one of the published trials presented results that were unfavourable to the company that sponsored the trial. Every trial showed the company's drug to be as good as or better than the comparison treatment.

This has serious implications, especially for pharmaceutical companies, who are always quick to point out how their gigantic profits are necessary to ensure future funding for the enormous amount of research needed to discover and market new drugs and treatments.

In recent interviews (I wasn't able to find a link in English), Dr. Smith has gone as far as proposing a moratorium on all medical research publishing until the whole problem is reviewed and sorted out and journals can guarantee the quality of the papers they publish.
posted by nkyad at 8:24 PM on July 13, 2005

approving drugs doesn't necessarily save lives. see my previous post. many people die or are somehow injured by taking newly approved drugs. i don't believe the drugs are studied long enough before approval. there are studies called post-marketing studies which are never carried out by BigPharma. this is a huge problem within the industry.
posted by brandz at 8:53 PM on July 13, 2005

Using your own FPP to point out a (hypothetical) weakness in another FPP is just wrong. Why didn't you post this as a comment in the other thread?

It was new and novel information which, while relating to an FPP from the other day, stood on its own as postable material. Certainly better than "fixr"

approving drugs doesn't necessarily save lives. see my previous post. many people die or are somehow injured by taking newly approved drugs. i don't believe the drugs are studied long enough before approval.

Well, if you have a drug approval process that saves more lives going forward than it costs, then it's a good thing. In other words, if one person is killed by drug A for every two people saved by drug B, both of which are approved, then the process is doing a net good in the world. The trick is to tweak the process to improve the ratio as much as possible.

Certain AIDS drugs were approved in days, for example. More scrutiny needs to be placed on drugs which have a nebulous, slow-acting benefit (like cholesterol lowerers, or crap like Viagra), while less scrutiny should be placed on drugs which could immediately save lives (like the AIDS stuff).

Additionally, cost is a factor. If too much money is spent on exhaustively testing every drug, many drugs that might help fewer people will languish as FDA time is taken up on drugs that help larger groups.

It's all a balancing act, IMO.
posted by delmoi at 9:21 PM on July 13, 2005

The ultimate gatekeeper of science is neither peer reviews, nor referees, nor replication, nor the universalism implicit in all three mechanisms. It is time. In the end, bad theories don't work, fraudulent ideas don't explain the world so well as true ideas do. The ideal mechanisms by which science should work are applied to a large extent in retrospect... Time and the invisible boot that kicks out all useless science are the true gatekeepers of science. But these inexorable mechanisms take years, sometimes more than a millennium, to operate. During the interval, fraud may flourish, particularly if it can find shelter under the mantle of immunity that scientific elitism confers. Betrayers of the Truth (1982) p.106

Self-deception is a problem of pervasive importance in science. The most rigorous training in objective observation is often a feeble defense against the desire to obtain a particular result. Time and again, an experimenter's expectation of what he will see has shaped the data he recorded, to the detriment of the truth. This unconscious shaping of results can come about in numerous subtle ways. Nor is it a phenomenon that affects only individuals. Sometimes a whole community of researchers falls prey to a common delusion, as in the extraordinary case of the French physicists and N-rays, or -- some would add -- American psychologists and ape sign language.

Expectancy leads to self-deception and self-deception leads to the propensity to be deceived by others. The great scientific hoaxes, such as the Beringer case and the Piltdown man discussed in this chapter, demonstrate the extremes of gullibility to which some scientists may be led by their desire to believe. Indeed, professional magicians claim that scientists, because of their confidence in their own objectivity, are easier to deceive than other people. Betrayers of the Truth (1982) p.108
posted by bevets at 9:40 PM on July 13, 2005

clevershark and googly win!

"If your experiment needs statistics, you ought to have done a better experiment". Ernest Rutherford (1871-1937).

Although this quotation might be a bit of an exaggeration, I do believe that statistical techniques are greatly misused even nowadays. And not only in the medical field but also in other sciences like... (cough!) physics. I mean, in order to perform a proper statistical analysis you have to have a pretty good idea of the system itself. Which of course you usually don't, at least initially. Most of the studies overturned are found inadequate precisely because people realize there are aspects/characteristics/interdependencies which should have been taken into account but, due to oversight, poor experiment planning, or just pure lack of knowledge, were not.
posted by carmina at 9:52 PM on July 13, 2005

bevets : "Time and the invisible boot that kicks out all useless science are the true gatekeepers of science. But these inexorable mechanisms take years, sometimes more than a millennium, to operate."

How do you know it works, then?
posted by Gyan at 9:55 PM on July 13, 2005

We didn't even have to use the E-word...
posted by drpynchon at 9:55 PM on July 13, 2005

I'm with googly.

Also, medicine isn't physics or chemistry - there are so many things that one cannot control for ("properly").

Take one population - the results will definitively say one thing. Take another population - not so much. From my experience in clinical testing of novel antibiotics, efficacy and side effects will differ if the trial is done in Europe compared to the same trial with the same protocol done in Africa.

Another thing that scientists can't control is how medical professionals administer treatments and/or diagnose patients. It can be really frustrating when some of the clinical people just say "fuck it" because they're either too busy, don't care, or don't understand why the protocol is designed the way it is - not to mention the quality of the feedback that scientists sometimes get from some of them - and don't get me started on the "we're MDs, we should be first author" attitude.


I feel that the contradictions of subsequent studies validate science rather than weakening it. If something is wrong, or inaccurate, it gets fixed.

Also, this is a study done on "high profile" publications - there's something at stake for both the original study authors and the people who come to different conclusions.
posted by PurplePorpoise at 10:18 PM on July 13, 2005

The Smith article is excellent; thanks, nkyad. And eustatic is correct regarding stats; the BMJ did an excellent series of articles exploring their use and limitations in study design. And yes, this post is loaded and inaccurate, but that in itself makes it a good springboard to consider the weaknesses of scientific literature and popular reportage, as people already have (well done also, googly).

A couple of other weaknesses not explored here:

Peer review. It's only as good as the society/publisher backing the journal, and then only as good as the section editors and reviewers have expertise and time to devote. Most journals rely on a volunteer pool, certainly for peer reviewers, and often for editorial boards members; the editor(s)-in-chief and section editors are generally the only staff compensated, and only the most prominent journals can afford to pay well. Scientists are supposed to participate for the good of science, with the side benefit of vitae enhancement or influence within their discipline. Since most publications want prominent, active scientists staffing their journals, you can see where this leads: very busy people with not a lot of time doing highly demanding, exhausting but ill-recompensed work.

Everyone wants studies out faster. It takes time to do lengthy, thorough peer review, but it also takes time to produce the material: to copyedit, typeset, proofread, review, and correct it prior to publication. Everyone in the science publishing industry is looking for the magic bullet to decrease turnaround time: online-only review and publication, shaving time off the many production processes, eliminating whole steps just to shove the stuff out there before it becomes irrelevant or gets scooped. Publication errors can be later corrected through errata, but given the speed at which marketing moves, results can be quite widely disseminated and then filtered through popular media long before anyone notices their flaws.

However, one obvious point to bear in mind is that it's called scientific literature for a reason. It's supposed to aggregate over time (though really, I don't have millennia in mind); no one study or small group should ever be considered authoritative, no matter how well designed and written, until it has been repeated and reported widely. That's why popularly reported science is such a frustrating, enormous mess: take a few studies, mix with the widespread scientific illiteracy and profit motives of popular media, filter it yet further through the general scientific illiteracy of the public, and whatever responsible science if any was there to begin with is now drunk, wearing way too much makeup, and shouting "Hey, look at these!" to anyone with the cash to buy it a jello shot. (Sorry, the prose was getting awfully dry.)
posted by melissa may at 10:21 PM on July 13, 2005

problem is drugs get approved based on these studies.

Well, no one is forcing you to take those drugs or see a doctor.
posted by c13 at 11:58 PM on July 13, 2005

we just haven't had enough time (or aren't smart enough) to find out what's wrong with the other two thirds...
posted by muppetboy at 12:02 AM on July 14, 2005

I'm glad that research papers are contradicted by subsequent research papers. Research should be carefully scrutinized and widely publicized within the scientific community. If some research claims fall apart after subsequent study, people are doing their jobs and we are learning something.

If there's anything wrong, it is that the news media turn early but promising medical research into this week's miracle cure. Then menopausal women, for example, start pressing doctors to give them hormone pills to protect them from heart attacks, before the next study shows that hormone pills actually increase their risk of heart attack. Doctors need to be careful that they don't respond to pressure caused by early research. Otherwise, all is well.
posted by pracowity at 12:26 AM on July 14, 2005

c13: wrong, sickness forces you to go see a doctor and take medications according to the doctor's prescriptions, or die.

Therefore, why does sickness hate freedom so much ?
posted by elpapacito at 5:17 AM on July 14, 2005

Just goes to show that we need more money for science. We need people to test other people's work, but nobody gets grants for that, so they have to find excuses to justify doing the tests.

It also shows that we need some nonprofit drug development. How do we talk Gates into blowing off Africa (or wherever) and funding a non-profit medical research organization / patent house?
posted by jeffburdges at 5:27 AM on July 14, 2005

googly made the statistics up.
posted by Pollomacho at 6:12 AM on July 14, 2005

This will probably turn out to be part of the one-third of studies that are wrong. Or half of studies that are re-examined as suggested above. Or else if this study is untrue it might be because all studies are untrue.

A goodly amount of skepticism should be used when examining a study. A former mentor of mine told me to first read the methodology section of a research paper. Methods equal results (all right, minus bias, statistical infractions, and conflicts of interest). The results section and the abstract, which the media tend to glom on to, are interpretations, usually generalizations, of the results beyond what the methodology can support. The problem with a given study is usually hidden somewhere inside it.
posted by dances_with_sneetches at 6:35 AM on July 14, 2005

new methods, like bayesian inference, are being developed to solve a lot of these problems--the general philosophical differences among scientific disciplines, the problematic techniques used, and the hazy understanding, even within the scientific community, of the resulting analyses. Bayesian inference is in many ways more logically sound, but in turn more computationally intensive

Many people are EXTREMELY dubious about some of the ethical issues surrounding the use of bayesian analysis within the context of clinical trials. Changing the rules of the trial mid-stream may constitute a reasonable logical inference based on early results, but it is often very difficult to discern the longer-term effects of a given treatment upon subjects, and such changes could have potentially harmful consequences. As far as I know there are still only a few champions of this approach working on clinical trials.
posted by Dr_Johnson at 6:45 AM on July 14, 2005

This is interesting, coming from a perspective of political science or media studies. I think, perversely, that it's often easier to evaluate literature in the "softer" sciences because the biases of methodology seem much more prominent. That could be, of course, due to my limited knowledge of hard science, but I find the credibility of a paper pretty easy to assess when you look at its methodology compared with its results. Did the study confirm a bias? Were the methods more or less likely to confirm that bias?
(Wow, Bevets almost has something to contribute here. Too bad it's not his own words...)
posted by klangklangston at 7:18 AM on July 14, 2005


This thread has been archived and is closed to new comments