The Reproducibility Initiative
August 14, 2012 11:40 AM

The Reproducibility Initiative "Here’s how it is supposed to work. Let’s say you have found a drug that shrinks tumors. You write up your results, which are sexy enough to get into Nature or some other big-name journal. You also send the Reproducibility Initiative the details of your experiment and request that someone reproduce it. A board of advisers matches you up with a company with the experience and technology to do the job. You pay them to do the job...and they report back whether they got the same results."
posted by dhruva (69 comments total) 17 users marked this as a favorite

A number of other journals have also agreed to add a badge to all papers that have been replicated through the Reproducibility Initiative.* Think of it as a scientific Good Housekeeping seal of approval. It shows that you care enough about the science—and are confident enough in your own research—to have someone else see if it holds up.

That gives everyone a big incentive to make sure the same results are reproduced, but where's the incentive for finding results that differ from the original result? In the current system there are various reasons why an independent team of researchers would want to prove some incorrect published research wrong, but I don't see much of that translating to a company being paid by the people who most want the results to hold up.
posted by burnmp3s at 11:51 AM on August 14, 2012 [4 favorites]

Sounds like this would make private companies the arbiters of scientific accuracy. There's nothing wrong with corporations being a part of the enterprise of science, as they of course are and have been for some time, but why make them categorically the gatekeepers of legitimacy?
posted by clockzero at 11:53 AM on August 14, 2012 [6 favorites]

Shouldn't the journals do that work? They certainly fucking have the profit margins to do so.
posted by TwelveTwo at 11:55 AM on August 14, 2012 [9 favorites]

From the second link FAQ:

A board of advisers matches you up with a company with the experience and technology to do the job. You pay them to do the job...and they report back whether they got the same results.

This still seems like a potential conflict to me, as the for-hire companies doing the validation will still have far more incentive to corroborate your results (since you're paying for them) and very little upside to failing to reproduce the study, especially since, from the second link FAQ, "You only have to pay upon completion of the replication study by the provider,...". So if I disagree with the results of the reproducibility study on my work, I can lodge a complaint with the initiative and refuse to pay until my dispute has been resolved, while the company has already completed the work and would have to wait before seeing any money for it. It seems to me like this isn't really a suitable replacement for independent verification of results. Not to mention, 10% of the cost of the original study is not necessarily peanuts, especially for smaller labs that might not be swimming in extra research money.
posted by aiglet at 11:56 AM on August 14, 2012 [13 favorites]

You pay them to do the job...

This is only a possibility for industry research, which usually has a big pile o' dollars for commercialization. University research would not be able to do this. Besides, there is already a big incentive in most areas of biomedical research to attempt to replicate a new, potentially important finding. In some areas, it's almost required for acceptance.
posted by Mental Wimp at 11:59 AM on August 14, 2012 [1 favorite]

Shouldn't the journals do that work? They certainly fucking have the profit margins to do so.

I didn't think this was the case. Profit margins aside, I thought journals were supposed to be number checkers and to do a once or twice over to check for obvious errors or falsification. I did not think it was their role to replicate experimental results themselves.
posted by RolandOfEld at 11:59 AM on August 14, 2012 [1 favorite]

Aiglet has a good point. This really needs to be changed, so the payment for replication is done in advance of the results of the replication study being known. I think this isn't a bad idea overall, since there is so little incentive to do independent reproduction of studies as it is, but it has to be extremely carefully designed to avoid bias and corruption, or it's worthless.
posted by Mitrovarr at 12:07 PM on August 14, 2012

This is a good idea, but implementation needs to be fixed. As usual, insightful discussion on In the Pipeline.

Incidentally, most publications in science don't require external verification of lab results (expense, bias, difficulties of setting up some experiments, timeliness of publication, dangers of being scooped, etc.).

One of the few that does is Organic Syntheses, which has been doing so for a long time. (Interesting history here).

"Org Syn" procedures are pretty much the gold standard for chemical reactions that work.
posted by lalochezia at 12:18 PM on August 14, 2012

What if we encourage journals to publish research replicating previously published research, so that scientists who wouldn't be paid by the original researchers could provide a potentially unbiased check and still receive the academic credit of publication?
posted by ChuraChura at 12:27 PM on August 14, 2012

Having worked with a local big-name college research department years ago, the "Outright fraud probably accounts for a small fraction of such failures" is wishful thinking. As long as the research dollars kept coming in, and published papers kept going out (no matter if right or not), nobody above the lab techs seemed to care.
posted by Old'n'Busted at 12:38 PM on August 14, 2012

ChuraChura: What if we encourage journals to publish research replicating previously published research, so that scientists who wouldn't be paid by the original researchers could provide a potentially unbiased check and still receive the academic credit of publication?

You still have to find someone to pay for it. There isn't much financial incentive to reproduce someone else's work (they'll still own the rights to any application, after all).

Also, it won't ever look as good from a career standpoint to reproduce other people's research as it would to conduct your own.
posted by Mitrovarr at 12:44 PM on August 14, 2012

No, this is a terrible, terrible idea. (I have some expertise in research funding) Here's why:

1. It will add an INSANE amount to university research expenses - that means either that a $250,000 NIH grant will suddenly be $150,000 for the actual research and $100,000 for the replication or else what used to be a $250,000 grant will need to be a $300,000 grant. Or else the schools can eat the cost - from where? Tuition? Slashing jobs?

2. It will steer research away from publicly funded places and into corporate hands even more than it is already. Why? Because here at Large Landgrant University, we have to pay everyone - even the lowliest dishwasher - a living wage with benefits. And we can't say "oh, Dr. Desperate Postdoc, you are willing to work for $25,000/year even though all the other postdocs make $40,000? Guess we'll hire you for that, then." No, we have to pay Dr. Postdoc the same as all the other postdocs - lord knows that's not lavish and people would love to make more, but at least it's sort of fair.

But if reproducibility costs are driving up the cost of research, there's a huge incentive for:

1. Reproducibility labs to be run as sweatshops in order to keep costs down and win contracts
2. The infrastructure of sweatshop labs will lure contracts away from universities - those sweatshop labs will be gunning for more contracts and will work hard to undercut universities.

If we want this "reproducibility" stamp on research, there needs to be a public source of revenue and public control over the reproducibility labs. Run the whole thing via the NIH as a jobs-creation program, maybe. Funded by, for preference, taxing the rich.
posted by Frowner at 12:54 PM on August 14, 2012 [13 favorites]

Also, it won't ever look as good from a career standpoint to reproduce other people's research as it would to conduct your own.

Why hasn't someone(s) decided to make their career as the fraudbusters that expose shitty science through careful lab work? Surely that would be worth a few book deals and tenure somewhere.

I wonder what hit rate (replication attempts per fraud/faults detected) would be necessary to support that kind of long-term effort. One dodgy experiment out of every 5 tested? Every 10? I also wonder how difficult it would be to come up with a system for determining which papers to review that got results within the necessary range.
posted by jsturgill at 12:54 PM on August 14, 2012

I've been on both sides of this issue. An experiment that I did in 2002 was replicated recently and thankfully the results turned out to confirm my earlier study. Replication of studies is generally fraught with difficulties in my field (behavioural ecology) since there are numerous ways in which it could go wrong. On the other hand, in my current research I repeated a very famous experiment and did not get results that agreed with the earlier study. It's been hell trying to publish it; first I always get the what-is-novel-about-this-research question from reviewers and then since the earlier study is very famous there seems to be a lot less willingness to give my results a chance. I don't think the model proposed by the Reproducibility people will work across the board, especially in cases where, as aiglet and Mental Wimp pointed out, 10% of the budget is still a huge amount, but at least it emphasises the idea that reproducing experiments is a valid part of science.
posted by dhruva at 1:01 PM on August 14, 2012 [1 favorite]

Iorns estimates the bill for replication will be about 10 percent of the original research costs

How the heck is that supposed to work? The only way I can see it costing less to reproduce the result than to obtain the result in the first place, is if you just slavishly copy the experimental methods of the original lab, skipping the parts that turned out to be dead ends, and maybe borrow a bunch of their equipment too -- or anyway, duplicate it as exactly as possible so that the same experimental techniques and data analysis tools work. Once you do that, it is hardly "independent" verification. The lab might as well just repeat its own experiment -- which I would expect to be cheaper than paying someone else to do it, but still more than 10%.
posted by OnceUponATime at 1:04 PM on August 14, 2012

Somehow I think it timely and apropos to link to this Feynman lecture on Cargo Cult Science here in this thread.
posted by infini at 1:05 PM on August 14, 2012

The problem they are attempting to address is real but this approach will go nowhere. This is such a bad idea that one wonders if it's just a plan by this person to promote her scientific services directory.

No PI will want to spend tens of thousands of dollars paying a random company to replicate their own studies for some gold star. Think of it this way: Would you rather devote your finite resources to cutting edge science or replicating your own work?
posted by euphorb at 1:10 PM on August 14, 2012 [1 favorite]

jsturgill: Why hasn't someone(s) decided to make their career as the fraudbusters that expose shitty science through careful lab work? Surely that would be worth a few book deals and tenure somewhere.

A few reasons.

First of all, once again, who is going to pay for it? Government agencies and universities aren't huge fans of funding replication studies; it isn't glamorous and doesn't resonate with the taxpayers. Corporations don't see where the profit comes from if someone else already owns the discovery.

Secondly, finding a study that can't be replicated is rarely an exciting fraudbusting moment. Usually, it means that you get results that fail to show statistical significance, indicating the need for someone (preferably a third party) to replicate the study again... and after all that, usually what you'd find is that the original results were either bad luck (about 5% of studies of an effect that doesn't exist will produce a false positive under the commonly used p<0.05 threshold, and you are yourself vulnerable to a type II error, failing to show statistical significance when a real effect exists), or forgivable errors (things like statistical mistakes, faulty equipment, bad data hygiene, etc. that anyone might make by accident). A good portion of the time, you will never figure out why the original study had a positive result.

Finally, it's really hard to get failure-to-replicate results accepted. People have a tendency to believe the first result, or disbelieve both results. It's even harder to get people to accept that fraud occurred; scientists are highly unwilling to see this kind of fault in their colleagues, and you have to prove it with absolutely irrefutable evidence.
posted by Mitrovarr at 1:11 PM on August 14, 2012 [2 favorites]
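
A rough sketch of the two error rates Mitrovarr describes, using only the Python standard library. Everything specific here is an assumption made for illustration: a one-sample z-test with known standard deviation, 20 subjects per study, and a true effect of half a standard deviation. Real studies in any given field will differ.

```python
import random
from statistics import NormalDist


def two_sided_p(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: population mean is 0 (sigma assumed known)."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, n, trials=20000, alpha=0.05, seed=1):
    """Fraction of simulated studies that reach p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if two_sided_p(sample) < alpha:
            hits += 1
    return hits / trials


# No real effect: every "significant" study is a false positive.
false_positive_rate = rejection_rate(true_mean=0.0, n=20)

# A real but modest effect (hypothetical size 0.5): some honest studies still miss it.
power = rejection_rate(true_mean=0.5, n=20)

print(f"false positives when no effect exists: {false_positive_rate:.3f}")
print(f"real effect detected (power):          {power:.3f}")
print(f"real effect missed (type II error):    {1 - power:.3f}")
```

Under these assumptions the first rate should land near the 0.05 threshold by construction, while the type II rate depends entirely on the assumed effect size and sample size. That is why a single failed replication says little by itself about fraud.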

(And perhaps the reluctance of people to accept this sort of research, which is supposed to be built into the way we do science, points to the need for some cultural changes in the sciences)
posted by ChuraChura at 1:23 PM on August 14, 2012

Funded by, for preference, taxing the rich.

That doesn't make sense. Tax revenue is fungible, since money is fungible. Anything that's publicly funded is funded by taxes across different income levels, not just the rich.
posted by John Cohen at 1:32 PM on August 14, 2012

Reproducible knowledge is an important element of science. One way to do it is to repeat the same experiment twice. A better, more informative way of getting reproducible knowledge is by finding multiple kinds of experiments that point at the same underlying fact. This is usually what journals and reviewers insist on for surprising results.

There is, unfortunately, a finite amount of money to pay for research, and any amount spent on paying for-profit enterprises to repeat experiments identically is money that won't be spent on other avenues of looking for the facts, even the same facts. For that reason, I too think this is a horrible idea.

I would, however, be very much in favor of a strict requirement of reproducible analysis. For any published scientific result, anyone should be able to download an archive or virtual machine image that contains the original raw data and all scripts, programs, and other files necessary to get from that to all the figures, tables, and statistics used in the article.
posted by grouse at 1:37 PM on August 14, 2012 [3 favorites]

Sounds like the death of science as we know it. And I don't feel fine.
posted by Twang at 1:39 PM on August 14, 2012

grouse: You'd probably like the field I work in right now, phylogenetics - to publish a phylogenetic analysis, all of the genes must be placed in GenBank for public access, and the alignments, etc. must be placed in another online database, TreeBASE. Between that and the methods section of the paper you should be able to completely reproduce any analysis (sometimes even random number seeds are given, although this is not always feasible).
posted by Mitrovarr at 1:42 PM on August 14, 2012
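
For what grouse and Mitrovarr describe (raw data plus scripts plus random number seeds), a minimal sketch of the idea in Python. The data values, the seed, and the manifest layout are all made up for illustration; this is not the format used by GenBank, TreeBASE, or any journal.

```python
import hashlib
import json
import random


def sha256_of(data: bytes) -> str:
    """Fingerprint of the raw data, so a reader can check they have the same file."""
    return hashlib.sha256(data).hexdigest()


def bootstrap_mean_ci(values, seed, resamples=2000):
    """95% percentile bootstrap interval for the mean, driven by an explicit seed."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(resamples))
    return means[int(0.025 * resamples)], means[int(0.975 * resamples) - 1]


# Hypothetical raw measurements standing in for a real data file.
raw = b"4.1,3.9,5.2,4.8,4.4,5.0,3.7,4.6\n"
values = [float(x) for x in raw.decode().strip().split(",")]

# Everything needed to rerun the analysis; the seed is an arbitrary choice.
manifest = {"raw_sha256": sha256_of(raw), "seed": 20120814, "resamples": 2000}

ci_first = bootstrap_mean_ci(values, manifest["seed"], manifest["resamples"])
ci_again = bootstrap_mean_ci(values, manifest["seed"], manifest["resamples"])

print(json.dumps(manifest, indent=2))
print("interval:", ci_first, "identical on rerun:", ci_first == ci_again)
```

With the seed recorded, the resampling step gives the same interval on every run, and the checksum lets anyone confirm they started from the same raw data. Note that this reproduces the analysis only; it says nothing about whether the experiment itself would replicate.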

They do have an impressive list of advisers.
posted by grouse at 1:45 PM on August 14, 2012

An article in Nature, "Independent labs to verify high-profile papers" makes it seem like this is not really intended for all research, only high-profile (and probably surprising) results. It might not be a horrible idea in certain circumstances.
posted by grouse at 1:51 PM on August 14, 2012

It would be nice if, all other goals aside, it could simply change the structure of science reporting from "Scientists discover faster than light particles: Einstein and everything we know wrong?" to "Scientists discover faster than light particles, claim being verified by XYZ." They'll certainly still screw it up, but maybe they will screw it up in ways that do not directly undermine confidence in the scientific method.
posted by feloniousmonk at 2:40 PM on August 14, 2012

It would be nice if, all other goals aside, it could simply change the structure of science reporting from "Scientists discover faster than light particles: Einstein and everything we know wrong?" to "Scientists discover faster than light particles, claim being verified by XYZ."

Even if this service had a 100 percent adoption rate, there is no chance of this change in reporting happening.
posted by grouse at 2:43 PM on August 14, 2012 [1 favorite]

One can hope.
posted by feloniousmonk at 2:44 PM on August 14, 2012

How about research labs having to be held financially responsible for the cost of reproducibility plus a large bounty/penalty if the findings are not reproduced? This is how it would work. Any lab anywhere - we can call them Bounty Labs - is allowed to take any study published in certain journals and attempt to reproduce it at their own cost. If they cannot, an independent Verification Commission investigates - if it transpires that indeed it is not possible to duplicate the study, then the research lab that originated the study has to pay the Bounty Lab the cost of the attempt to reproduce plus a bounty of X dollars; additionally, the original lab has to pay the Verification Commission for the cost of the verification investigation if the study indeed is found not to be reproducible (if it is reproducible, and the Bounty Lab messed up, then it is the Bounty Lab that pays the cost).

Basically, the cost of research would only rise slightly - because I suppose labs would have to become insured against fraudulent/incompetent payouts, sort of like malpractice insurance. But, otherwise, the cost would be minimal and would be a substantial motivator for good research. Bounty Labs would scour the literature for iffy research - that's a huge service to science right there - the more bad research they find, the more money they make through bounties. There would be no incentive for Bounty Labs to falsely attack good studies, because they have to reproduce the study at their own (speculative) upfront cost and if it pans out, they lose the money they sank into reproducing - that keeps the Bounty Labs honest. It keeps the research labs honest because they know there are sharks out there circling the waters waiting to pounce on any bad research, and that any of their own lab techs who sees anything funny going on, can pick up the phone and alert a Bounty Lab and share in the bounty. The Verification Commission sees to it that there's no "he said / she said". Everybody wins. /fantasy/
posted by VikingSword at 2:57 PM on August 14, 2012 [1 favorite]

I think it would be a great exercise for graduate students and advanced undergrads - you've got the methods, you've got the expected results, and all the theory behind it. Of course I understand that this is probably not a practical solution, but neither is having the original researchers pay some third party for replication.

It's a practical solution in some domains (particularly in the behavioral sciences, where many studies are one- or two-session with convenience populations), and where practical it's absolutely fantastic. I've participated in courses designed around this idea, and it's a great learning experience for all involved. (And more work for all involved, but the trade-off is a good one.) Graduate and most undergrad degrees in my field have a 'methods' course hanging around in them - I've reproduced rock-solid techniques from the 50s and I've worked on replications of data from last year, and I can tell you very clearly which one results in better student engagement and understanding of the scientific process.

I wonder what would happen if we had both outlets for the replication/nonreplication of existing results, and a tenure/promotion/granting system that recognizes the importance of this kind of work...call it something else than 'research' if you need to preserve the distinction, call it 'scientific service' or whatever you want, but get it into the system. Actually, maybe giving it its own name would help raise its profile? The thing is, labs are already doing a lot of this work (as people upthread have pointed out), because they have to. Doing the replication is often a necessary step to doing the 'next thing', because you use it as a check that your equipment and protocols are working (especially if you're working off of another lab's result.) In the best case, it (eventually!) works and you're off to the races...but it doesn't always get formalized or published in a coherent way. And if it never works after 6 months of tinkering, you're in the position of choosing between scrapping the project, and really-and-truly convincing yourself into the ground that the nonreplication is real (and you didn't just screw something simple/mysterious up), when there's no payoff in the current academic system for doing so.
posted by heyforfour at 2:57 PM on August 14, 2012 [2 favorites]

plus a large bounty/penalty if the findings are not reproduced?

Eh, not a fan, and I'd like to think it's more than just self interest :) Because of the way statistics work, there will always be good, rigorously conducted, well-motivated, properly-analyzed, incorrect results. Some nonreplications are fraud, some are honest human error, some are casualties of probability.

If your system were in place today, I think that it would cause most practicing scientists to refuse to publish anything that they didn't replicate in-house first. That's a very laudable goal, but Organic Syntheses's approach is probably a less punitive way to reach it.
posted by heyforfour at 3:04 PM on August 14, 2012

I think this snippet from that write-up in Nature tells you all you need to know

Elizabeth Iorns, chief executive of Science Exchange — a commercial online portal that matches scientists with experimental service providers — noticed that a number of drug companies were employing researchers to validate published results. It prompted her to develop the Reproducibility Initiative, a mechanism to replicate research results, with a particular focus on preclinical biological studies.
...
The Reproducibility Initiative will work through Science Exchange, which is based in Palo Alto, California.

posted by euphorb at 3:15 PM on August 14, 2012 [4 favorites]

If your system were in place today, I think that it would cause most practicing scientists to refuse to publish anything that they didn't replicate in-house first.

Well, it's probably not workable for all sorts of reasons, and also you're probably right. But. The culture of scientific research is not set in stone. Scientists do research in a variety of environments, with a variety of incentives - they've done research under Communism, for the government in free societies, and for private institutions for private gain large and small. It is my suspicion - nearing certainty - that if a new crop of graduate students were introduced to an environment like I outlined, they would quickly accept that as the default status - you would see no fewer scientists. People work under conditions that are presented, period. The bigger question is whether on the whole it would be of benefit to Science or not - whether it would result in a net increase of knowledge or inhibit the process. The answer to that - the most crucial - question, I don't know. I can only speculate. So your speculation - that it would be a net negative - might be entirely right. But it might also lead to fewer but better studies. Given the flood of poorly designed studies, it seems tempting to me to cut down on the quantity if it results in higher quality (s/n ratio) - but who knows.
posted by VikingSword at 3:19 PM on August 14, 2012

There would be no incentive for Bounty Labs to falsely attack good studies, because they have to reproduce the study at their own (speculative) upfront cost and if it pans out, they lose the money they sank into reproducing - that keeps the Bounty Labs honest.

Whoa, this would be a terrible idea in fields that commonly use significance tests (e.g. psychology). You could collect bounties in this system by doing things to minimize the likelihood of a significant result: using a small enough sample size to hurt power, controlling the time at which you stop collecting data to ensure a non-significant result, or just inadequately controlling conditions to boost error variance.
posted by a snickering nuthatch at 3:22 PM on August 14, 2012
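
To put rough numbers on the sample-size point above, here is a small sketch of how the chance of a "failed" replication changes with sample size even when the original effect is perfectly real. It assumes a two-sided one-sample z-test and a hypothetical standardized effect of 0.5; the function name is mine, not from any library.

```python
from statistics import NormalDist


def power_two_sided_z(effect_size, n, alpha=0.05):
    """Probability that a two-sided z-test reaches p < alpha for a given true effect."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = effect_size * n ** 0.5        # how far the test statistic is shifted
    return (1 - std.cdf(z_crit - shift)) + std.cdf(-z_crit - shift)


# Same real effect, different replication sample sizes.
for n in (10, 20, 50, 100):
    p = power_two_sided_z(0.5, n)
    print(f"n={n:>3}  replicates: {p:.2f}  'fails to replicate': {1 - p:.2f}")
```

Under these assumptions, a replicator who quietly picks a small sample fails to reach significance most of the time without doing anything that looks like misconduct. Noisier conditions would have the same effect, since they shrink the standardized effect size.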

[...]You could collect bounties in this system by doing things to minimize the likelihood of a significant result[...]

Well, no. Because a Bounty Lab merely not being able to reproduce a study is not enough to get them the bounty - nobody takes their word for it "yep, we weren't able to reproduce, so pay up". Not being able to reproduce is just the first step. Then comes the next step - it goes to the independent Verification Commission which examines the study methodology used by the Bounty Lab, and is also able to hear a rejoinder from the original lab. And the Verification Commission decides if the study passes muster or not. Think of it like Police+Prosecution and the Judiciary. Bounty Labs are the cops and prosecutors who investigate and present evidence of a "crime" before the panel of judges (Verification Commission) and the original lab is allowed to defend themselves. The loser pays the costs - keeps both sides honest. The Bounty Lab pays the cost of investigation out of pocket and if wrong also for the Verification Commission, and collects nothing (if wrong). But if they are right, then the original lab pays for Bounty Lab's cost, the Verification Commission's cost and a penalty/bounty. The Bounty Labs would not do well to go on unfounded fishing expeditions, because they'd lose money.
posted by VikingSword at 3:36 PM on August 14, 2012 [1 favorite]
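
The loser-pays rules above can be turned into a quick back-of-the-envelope calculation, which also speaks to jsturgill's earlier question about what hit rate a fraudbuster would need. This is only a sketch of the scheme as described in the comment: the dollar figures are invented, and it assumes the Verification Commission always rules correctly.

```python
def bounty_lab_expected_profit(p_upheld, replication_cost, commission_cost, bounty):
    """Expected profit per challenge for a Bounty Lab under the loser-pays rules.

    If the challenge is upheld, the original lab reimburses the replication cost
    and pays the commission, so the Bounty Lab nets the bounty. If it is not
    upheld, the Bounty Lab loses its replication cost and pays the commission.
    """
    gain_if_upheld = bounty
    loss_if_rejected = replication_cost + commission_cost
    return p_upheld * gain_if_upheld - (1 - p_upheld) * loss_if_rejected


def break_even_probability(replication_cost, commission_cost, bounty):
    """Share of challenges that must be upheld for expected profit to be zero."""
    loss = replication_cost + commission_cost
    return loss / (loss + bounty)


# Hypothetical figures, in dollars.
cost, commission, bounty = 50_000, 20_000, 100_000

threshold = break_even_probability(cost, commission, bounty)
print(f"break-even hit rate: {threshold:.1%}")
for hit_rate in (0.1, 0.2, 0.5):
    profit = bounty_lab_expected_profit(hit_rate, cost, commission, bounty)
    print(f"hit rate {hit_rate:.0%}: expected profit per challenge {profit:,.0f}")
```

With these made-up numbers a Bounty Lab that is right one time in five loses money on average, so the bounty would have to be large relative to replication costs for the scheme to attract anyone. The sketch also ignores the statistical noise discussed upthread, where an honest study can fail to replicate by chance.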

What a great idea! Except if you understand statistics. I expect many scientists to be in favour of this.
posted by srboisvert at 3:52 PM on August 14, 2012

I cannot help but think this bounty scheme rests on the assumption that there is a significant amount of bad work out there, or, at the very least, that results that can't be reproduced are thus either because the original researchers were committing fraud or were so incompetent that it's obvious the experiment should never have produced those results in the first place.

Fair enough - there is that assumption. If that assumption is wrong, then no harm no foul - what are we even discussing? What's the point of this FPP's approach? If this is not a significant problem, then we can have a big laugh and go home.

However, I do have an interest in this topic, and I have done several FPPs about it. For example:

Bombshell investigation reveals vast majority of landmark cancer studies cannot be replicated. In a shocking discovery, C. Glenn Begley, former researcher at Amgen Inc, and a team working with him, has found that 47 out of 53 so called "landmark" basic studies on cancer -- a high proportion of them from university labs -- cannot be replicated, with grim consequences for producing new medicines in the future. These were papers in top journals, from reputable labs, which achieved landmark status with frequent citations.

More global look at the problem - this FPP:

'Much of what medical researchers conclude in their studies is misleading, exaggerated, or flat-out wrong.' Dr. John P. A. Ioannidis, adjunct professor at Tufts University School of Medicine is a meta-researcher. 'He and his team have shown, again and again, and in many different ways, that much of what biomedical researchers conclude in published studies—conclusions that doctors keep in mind when they prescribe antibiotics or blood-pressure medication, or when they advise us to consume more fiber or less meat, or when they recommend surgery for heart disease or back pain—is misleading, exaggerated, and often flat-out wrong. He charges that as much as 90 percent of the published medical information that doctors rely on is flawed. His work has been widely accepted by the medical community; it has been published in the field’s top journals, where it is heavily cited; and he is a big draw at conferences.'

And you can hardly turn around and not bump into yet another expose of bad faith research done by various pharma companies.

Now, maybe all this is a lot of smoke and no fire, but I wonder. People who have done research in this field (Dr. Ioannidis, above) seem to think the situation is dire, so there's that.

If that's the case, surely that level of incompetence should fail peer review?

What peer review? How many studies have been even glanced at, let alone peer reviewed? From what I've seen a ton of studies are read all the way through, only by the authors - not even the editors who subsequently agree to publish them in various journals. You can say about peer review what Gandhi was supposed to have said about Christianity (paraphrased): "sounds like a good idea!".

Also, aren't there fields where there'd be no one qualified to sit on this 'verification commission'?

Absolutely. Three points. One - this proposal is not going to be a cure-all, inevitably there will be fields where it's not practical and indeed whole fields and classes of studies where this approach is completely not applicable (how do you replicate a 30 year observational study of 30,000 patients?). Two - the fields which are most crucial and of immediate importance are unlikely to be so abandoned or obscure that there are only 2 people to speak competently to the issue - and if the field is that obscure, then it is less crucial that we verify everything. Three - the incentive to examine such studies could actually train scientists in various fields by having them examine other studies in depth, so that problem - of competence - might find its own solution, by raising the overall level of competence.
posted by VikingSword at 4:34 PM on August 14, 2012

In a shocking discovery, C. Glenn Begley, former researcher at Amgen Inc, and a team working with him, has found that 47 out of 53 so called "landmark" basic studies on cancer -- a high proportion of them from university labs -- cannot be replicated, with grim consequences for producing new medicines in the future. These were papers in top journals, from reputable labs, which achieved landmark status with frequent citations.

Right -- but that doesn't indicate that the authors of those studies were fraudulent, or incompetent, or somehow deserve to be punished. That's the point.

That doesn't make non-reproducibility any less significant of a problem. But it does affect the means you should use to deal with it.
posted by escabeche at 5:52 PM on August 14, 2012 [5 favorites]

Right -- but that doesn't indicate that the authors of those studies were fraudulent, or incompetent, or somehow deserve to be punished. That's the point.

True. But perhaps they do need an incentive to publish studies which can be replicated, if that's the nature of the study, because, well, that's how science works. There are certainly studies where reproducibility is not the point. But we are addressing ourselves exclusively to a class of studies where reproducibility is at the heart of the methodology (hence the FPP). If you cannot have such studies reproduced, then what is the point of having them published? I can publish a study on cold fusion tomorrow, with the disclaimer that it cannot be replicated - what is the value of that? Perhaps it would be better not to publish such a study. The goal here would be to strengthen the methodology and description, which would have great benefits overall. If the study cannot be reproduced due to insufficient detail provided, but can be replicated once that information is available, then it is still possible to reproduce in principle, and therefore would be cleared by the Verification Commission, once additional information has been furnished. But if it cannot be replicated, no matter what, then it's a badly designed study and should not have been published; under this system, such studies would be disincentivised, and it is hard to see what is lost with such studies not being published.
posted by VikingSword at 6:08 PM on August 14, 2012

Why hasn't someone(s) decided to make their career as the fraudbusters that expose shitty science through careful lab work?

Because they usually wind up looking like bullies and buffoons?
posted by Kid Charlemagne at 6:36 PM on August 14, 2012

If you cannot have such studies reproduced, then what is the point of having them published?

How are you supposed to know whether they can be reproduced until another lab tries to reproduce them?
posted by escabeche at 6:45 PM on August 14, 2012

Here's my own personal story on this sort of thing.

1) I developed an assay where a protein composed of two different peptides (which I will call A and B) is captured by an antibody against peptide A and then bound by a labeled antibody against peptide B to show that the vast majority of our product is AB and not AA or BB. (This is absolutely not a bold new discovery here. These were off the shelf antibodies and a 40 year old technique.) Result - a microtiter plate with absorbances corresponding to analyte concentration, good precision, etc.

2) We sent it off to Austria. They ran it ten or so times. If you squinted just right you could kind of tell where their high standard and blank were...sometimes.

3) I got put on a plane. Sat there, jet lagged as hell, watching a lady who wasn't quite fluent in English (but was, thankfully about 20 times better at it than I am in German) perform what looked like a flawless assay using both her materials and reagents and plasticware that I shipped to Austria.

4-7) Repeated the same thing for four days only with me running a parallel assay, tweaking things left and right trying to get the assay to behave and all of our results looking just as crappy as the day I arrived.

So, obvious fraud, right? I mean we were paying them for this and I, soulless pawn of big pharma, was there actually helping them get "the right answer" and we still couldn't do it.

8) The next week I had her repeat the assay in a biosafety cabinet (which, as it turns out, was in a building where our compound had never been present). Her standard curve was as precise and had as high a correlation coefficient as anything I'd ever produced.
posted by Kid Charlemagne at 7:03 PM on August 14, 2012 [1 favorite]

OnceUponATime: Well, my answer is from chemistry, but in chem a huge chunk of the time is designing and testing the reaction. For example, I've spent the last 3 weeks trying to make a new compound. Nothing ground breaking in terms of synthesis, but tweaking each step to get it right, doing each step on a small scale first to make sure it works before committing chemicals, etc, takes time. If I was doing it again, using my notes to duplicate just the steps that worked, I could do it in well under half the time, probably less, and waste a lot fewer chemicals along the way. 10% does seem a bit low, but at the same time, if you really focused on fast turnover, damn the yield, assembly line reactions...it might be possible.

Doubly so if the research group supplied the verification lab with precursors, protein, etc. Making the stuff you need to get to the interesting chemistry can take several weeks, and doing stuff other people have done a dozen times before isn't the part of the paper that needs verification.
posted by Canageek at 8:00 PM on August 14, 2012

How the heck is that supposed to work? The only way I can see it costing less to reproduce the result than to obtain the result in the first place, is if you just slavishly copy the experimental methods of the original lab, skipping the parts that turned out to be dead ends, and maybe borrow a bunch of their equipment too -- or anyway, duplicate it as exactly as possible so that the same experimental techniques and data analysis tools work.

Yeah, I think that's the point.

Once you do that, it is hardly "independent" verification. The lab might as well just repeat its own experiment

They are verifying that the experiment has the results they say it does, the point is to avoid dishonestly (or erroneously) claiming results that don't exist. Re-running the experiment might help with the errors, but won't help with the dishonesty.

Any lab anywhere - we can call them Bounty Labs - is allowed to take any study published in certain journals and attempt to reproduce it at their own cost. If they cannot, an independent Verification Commission investigates - if it transpires that indeed it is not possible to duplicate the study, then the research lab that originated the study has to pay the Bounty Lab the cost of the attempt to reproduce plus a bounty of X dollars

That just seems ridiculously complicated, and adds risk for no reason. For one thing, it makes costs impossible to calculate. If you're 95% sure your result is right, you can assume a 5% chance the result might not work. But you wouldn't be able to figure the cost that these 'bounty labs' might charge.

Plus, in the case of public sector work, generally it's funded by a fixed grant. So if you get $200k, you spend $200k on your research, and a bounty labs group finds a mistake, there won't be any grant money left to pay them anything.
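
A rough way to put numbers on the pricing problem described above. This is only a minimal sketch: the $200k grant and the 5% chance of failure come from the comment, while the replication cost, bounty, and Verification Commission fee are hypothetical figures picked purely for illustration (the thread gives none).

```python
def expected_bounty_liability(p_fail, replication_cost, bounty, commission_fee):
    """Expected payout if a failed replication triggers all three charges."""
    return p_fail * (replication_cost + bounty + commission_fee)

GRANT = 200_000   # from the comment: a fixed $200k grant, fully spent
P_FAIL = 0.05     # from the comment: 95% sure the result is right

# Hypothetical figures, not from the thread.
REPLICATION_COST = 50_000
BOUNTY = 10_000
COMMISSION_FEE = 5_000

expected = expected_bounty_liability(P_FAIL, REPLICATION_COST, BOUNTY, COMMISSION_FEE)
worst_case = REPLICATION_COST + BOUNTY + COMMISSION_FEE

print(f"expected liability: ${expected:,.0f}")
print(f"worst-case liability: ${worst_case:,} ({worst_case / GRANT:.1%} of the grant)")
```

The expected value may look modest, but the lab would have to cover the worst case out of a grant it has already spent, and the replication cost is set by whoever decides to attempt the replication, which is the unpredictability being objected to here.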

Furthermore, how does that even fit with the voluntary framework? With this, the extra verification just gets you a little stamp on your paper. With your solution you would what? Sign a contract saying you'll pay this bounty if it turns out you're wrong? How will people know if you'll be able to make good on the claim?

Your proposal might mean that people who do things properly pay less money, but it's so totally random that you end up with all kinds of extra risks that need to be priced in. It also won't even prove that the thing is replicable, because some papers might never get replicated anyway.
posted by delmoi at 8:04 PM on August 14, 2012

In any field science, replication is enormously costly. Things like climate change measurements, species surveys, ecosystem mapping, environmental contaminant measuring---to replicate the work, the study has to be duplicated entirely.

It's not even clear how to do that in many cases. If one study counts polar bears in a particular area, for instance, how does one repeat that? Does one go back the next season and hope that weather patterns, human pressures and prey species migrations are exactly the same? How would one account for differences?

I've been involved with a number of large survey projects (and run a couple). Sampling variations are enormously important. I had one project whose results changed year to year based on politics---a UN export ban meant a whack of my samples were unavailable some years.

However, even if that is possible to control, the expense of running a verification test, contracting dive teams or renting helicopters, lab work, analysis, is essentially identical to running the original study. That's sometimes worth it for things like climate change studies for example, but most of the time granting bodies aren't willing to duplicate work.

One thing that is often worth considering, however, is replicate analysis of a sampling program. It's common practice now to have a main lab for most of the samples, but to send a few out to a second lab for verification. This doesn't do anything for sampling variability, but it at least gives confidence in the lab results. For that study the UN monkeyed with, I had two labs doing replicates (double blind) and a third providing verification on 10% of the samples.

The gold standard for verification/validation studies, is round-robin testing. Five labs all doing the work independently is considered the minimum level of effort required---this is used for new method development, particularly for those scientific methods that become regulations ("measure this thing that way").

Replication, while really useful in some areas of study, is very hard and/or expensive to do well in others, and old hat in a few rare contexts. It's not blanket desirable in all science. Its worth should be judged selectively, by context and by practicality. Not every study published in Nature should or even could be duplicated.
posted by bonehead at 9:54 PM on August 14, 2012

LOL. This is a non-solution to a non-problem (in the vast majority).

The only way this might make sense is if a competitor wants to smear you and pays some lab to do it. In the academic setting, there's no way there's funds to do this.

I've been semi-intimately embroiled in a results controversy. A rival group actually ganged up and wrote a multi-lab paper that they submitted to a really low impact journal (this is a high impact field) to contradict one big career facet of one of the professors in our department. 3, 4 years later, he published a rebuttal paper that ... mostly ... defends his position. But his post doc who did the work for the rebuttal has given up defending his principal investigator.

The publication industry is messed up, yes, but because people involved at all the different levels aren't idiots it mostly self corrects. There are many better suggestions on publication reform that address real problems than what this initiative wants to tackle.

Kudos to them if they get a big old venture capital infusion, though.
posted by porpoise at 10:38 PM on August 14, 2012

Sorry - the controversy was the reproducibility of the results. In the case that they couldn't be reproduced. The rebuttal was demonstrating how the subject was prepared in a different manner, and then showed a molecular mechanism for why the different preparations yielded different results.
posted by porpoise at 10:40 PM on August 14, 2012

Gah, anyway, high impact research is self-governing. The peer review process is strong enough at the higher levels that bs will be called out; peer reviewers are peers - people doing similar work, and have probably run similar, if not "identical," experiments and know what their own results are.

If you're claiming something very different, the reviewers will ask you to doubly prove it with another level of evidence. Even if your results are "as expected," higher impact factor journal reviewers will ask for additional evidence using different methods, or additional experiments to prove that your experiments are yielding valid results. It's a bit ridiculous, but it's a 'secret handshake' PLUS 'do one job for us' verification.

One reason why impact factors are important, but then we run right back into how the rating system for impact factor is a mess.
posted by porpoise at 10:48 PM on August 14, 2012

Yeah, impact factor is really just a rough surrogate for how widely a paper is read. The more people read it and talk about it, the more likely someone is to actually go through the annoying slog of rebutting it (remember #arseniclife? yeah about that, [1] [2] [3]). There's also the (sort of self-selecting) correlation between journal sexiness and novelty/surprisingness of the conclusions - and of course surprising conclusions are more likely a priori to be wrong (cf. Ioannidis). Put those factors together and even without invoking fraud or sloppiness it seems sort of unavoidable that fancy pants papers will end up having a higher retraction rate.
posted by en forme de poire at 6:05 AM on August 15, 2012

... if it transpires that indeed it is not possible to duplicate the study, then the research lab that originated the study has to pay the Bounty Lab the cost of the attempt to reproduce plus a bounty of X dollars; additionally, the original lab has to pay the Verification Commission for the cost of the verification investigation if the study indeed is found not to be reproducible (if it is reproducible, and the Bounty Lab messed up, then it is the Bounty Lab that pays the cost).

Please note that of all the studies where the null hypothesis is true, using a p-value threshold of 0.05 means that 1 out of 20 will show positive results that cannot be duplicated. This is a powerful incentive to not do studies where the outcome is not known in advance. In other words, you would kill science.
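
The 1-in-20 figure is easy to check by simulation. What follows is a minimal sketch, assuming a simple two-sided z-test on normally distributed data with known unit variance; the thread does not name any particular test, and the sample size, trial count, and seed are arbitrary choices made for illustration.

```python
import math
import random

def z_test_p_value(sample):
    """Two-sided p-value for H0: mean == 0, assuming known unit variance."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rate(trials=20000, n=30, alpha=0.05, seed=0):
    """Fraction of experiments with no real effect that still come out 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The null hypothesis is true by construction: the true mean is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            hits += 1
    return hits / trials

rate = false_positive_rate()
print(f"false positive rate under the null: {rate:.3f}")
```

With the threshold at 0.05 the rate should land close to 5%, and each of those spurious positives would have only about a 5% chance of passing an independent rerun at the same threshold.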
posted by Mental Wimp at 12:12 PM on August 15, 2012 [1 favorite]

From the link: Bombshell investigation reveals vast majority of landmark cancer studies cannot be replicated

"The scientific community assumes that the claims in a preclinical study can be taken at face value," Begley and Lee Ellis of MD Anderson Cancer Center wrote in Nature. It assumes, too, that "the main message of the paper can be relied on ... Unfortunately, this is not always the case."

Neither of these two assertions is true. In fact, a great deal of scientific effort goes into proving that the previous findings are false. Falsification is at the basis of scientific theories and is the main stuff of scientific endeavor. Besides, who decided they were "landmark" studies, rather than ones that looked interesting because, if true, they could lead to big bucks for Amgen?

Another important point to be made: these were laboratory studies. This is exactly why such laboratory-based science needs to be tested in human trials before being recommended as interventions in the population. This is not new to anyone in the field.
posted by Mental Wimp at 12:22 PM on August 15, 2012

If your results aren't reproducible, they aren't valuable. Large amounts of money aren't being spent for social rank or ink on paper. They're spent to enhance our knowledge of the universe -- our predictive knowledge, quite specifically.

Fewer papers that are true are vastly more important than more papers with irreproducible fever dreams. Every actual scientist should be ecstatic about the concept that an organized force for assigning credibility to reproducibility is coming. There have been perverse incentives here and it's about time we see pushback against them.
posted by effugas at 3:00 PM on August 16, 2012

(That being said, I agree, the bounty concept is not good. Though 1 in 20 levels of confidence may not be high enough either.)
posted by effugas at 3:01 PM on August 16, 2012

In fact, a great deal of scientific effort goes into proving that the previous findings are false.

Falsification isn't reproduction. You falsify things you think are false. You reproduce things you think are true. If nobody's actually reproducing things we think are true, we never actually find out if things we think are true, are actually false.

To put it another way, there's some massive selection bias going on in terms of what's actually "put to the test".

Falsification is at the basis of scientific theories and is the main stuff of scientific endeavor.

It should be. But it hasn't been. It's not falsify or perish, or reproduce or perish. It's publish or perish. The system is optimized for getting new results out there, true or otherwise.

Besides, who decided they were "landmark" studies, rather than ones that looked interesting because, if true, they could lead to big bucks for Amgen?

Amgen's access to the actual researchers was contingent on them not identifying the papers. If there was a wide supply of reproduction going on we could trivially discount Amgen's work. But then there's not much else competing, is there?

There is high demand for curing cancer, and why shouldn't Amgen make big bucks meeting that demand? Would you rather they make another pill for erectile dysfunction (not that there's anything wrong with that)? Would you rather they avoid all (expensively funded) public research, and instead pursue homeopathic solutions? Absolutely, industry should be able to follow the lead of widely respected medical researchers.

If it turns out those researchers aren't actually doing useful work, what separates them from the homeopathists?
posted by effugas at 5:06 PM on August 16, 2012

In fact, a great deal of scientific effort goes into proving that the previous findings are false.

Falsification isn't reproduction. You falsify things you think are false. You reproduce things you think are true. If nobody's actually reproducing things we think are true, we never actually find out if things we think are true, are actually false.

This is incoherent. The attempt to falsify is an attempt to reproduce. You say agent X cures cancer, and I try to falsify that by trying to reproduce it. That's not the only kind of falsification going on in science, but it is a large part of the activity. Because a drug company eager to jump on a potentially profitable technology is stupid enough to believe the first paper suggesting a finding and discovers that they can't reproduce it (i.e., they falsify it), that does not mean there is something fundamentally wrong with the scientific enterprise. In fact, that's how it is supposed to work.
posted by Mental Wimp at 5:13 PM on August 16, 2012 [1 favorite]

Mental,

"You say agent X cures cancer, and I try to falsify that by trying to reproduce it." -- you're missing a step. What's actually happening here is:

"You say agent X cures cancer, and I don't believe you so I try to falsify that by trying to reproduce it and expecting the reproduction to fail."

Yes. That totally happens. What doesn't happen nearly enough is that you believe me and you try to reproduce my results anyway, just to be sure. In fact the system is set up quite thoroughly to discourage such behavior. And so what we found, when someone actually did do a second phase of validation, was that the results were astoundingly bad.

Reproduction is part of how science is supposed to work. An initiative supporting reproduction should be seen as an excellent thing. Even if things were already good, this would be even better.
posted by effugas at 5:38 PM on August 16, 2012

Reproducible knowledge is an important element of science.

No, reproducible knowledge is science. Irreproducible knowledge is fiction.
posted by effugas at 10:54 PM on August 16, 2012 [1 favorite]

and reproducible fiction is franchise.
posted by TwelveTwo at 11:14 PM on August 16, 2012 [2 favorites]

"You say agent X cures cancer, and I try to falsify that by trying to reproduce it." -- you're missing a step. What's actually happening here is:

"You say agent X cures cancer, and I don't believe you so I try to falsify that by trying to reproduce it and expecting the reproduction to fail."

Nope. It doesn't matter whether you're saying "I don't believe you," or not, science is the process of taking a theory and its predictions and trying to see if you can falsify them. If not, you accept the theory for the time being, as long as there isn't a simpler, still useful theory that also makes the same predictions. A good scientist should always be agnostic about what the result will be and design the experiment to fail if the theory isn't true. Beyond that, what you believe is immaterial to the process.
posted by Mental Wimp at 9:07 AM on August 17, 2012 [1 favorite]

Every actual scientist should be ecstatic about the concept that an organized force for assigning credibility to reproducibility is coming.

Rather than get hung up on duplication, which can be impossible for many studies, as I've tried to explain above, I would much rather see efforts put toward confirmation.

For something like a synthesis or an analytical method, duplication, that is replication, doing the same steps in the same way, is how to show reproducibility. Simple. And guess what, surprise!, the EPA (for one) already does this for environmental test methods, to the extent of requiring a minimum of 5 replicates not just two.

The problem is, much of science can't be done that way. How do you duplicate a century of weather station data? Do you have a second Lucy in your back pocket? Or do you just want to give another arbitrary club for denialists to beat up climate change and evolution researchers with?

Much of large, costly, survey data isn't replicable because of time or sample limitations. One simply can't get the same samples or starting conditions to duplicate an experiment. A smart scientist will therefore do a confirmatory study rather than a replication, look at carbon sequestration by plants over the same historical record to give a second, comparable data set to the historical weather station data, or search for hominid species before and after Lucy. And, surprise!, this is already what happens in a lot of biological, geological and ecological science, it just doesn't get fancy gamified badges in flash journals.

Duplication is a particular solution for a relatively narrow set of problems, which seem to me to be largely confined to the drug and health sectors. It makes a lot of sense in those applications, and I applaud this initiative on those terms. However, to blindly apply these standards to other parts of science is trying to shoe not just goats, but fish and dinosaurs as well. Not relying on single studies, asking for confirmation is fine, and part of the usual scientific process. Blindly imposing an arbitrary, impossible standard, however, is not constructive or conducive to good science.
posted by bonehead at 9:28 AM on August 17, 2012 [3 favorites]

(and don't get me started on existing standards of practice that are prescriptive and procedurally-based rather than performance-based).
posted by bonehead at 9:38 AM on August 17, 2012

If it turns out those researchers aren't actually doing useful work, what separates them from the homeopathists?

Well, for starters, even an 11% rate of new, correct results is still [Err: Divide by zero] times better than homeopathy.
posted by en forme de poire at 12:33 PM on August 17, 2012

en forme--

Nobody's going to consider investing a few billion dollars into homeopathy. Remember, it's not that 11% of the results are useful, it's that you don't know *which* 11%.

bonehead--

You make an excellent point. Not all data is of the type that it can be duplicated; in that context, orthogonal confirmation is what we've got (and can be pretty compelling). Science shouldn't be barred from examining non-duplicatable datasets. That being said, I think the majority of applied science is fundamentally duplicative, i.e. "if you do X, Y will happen". Obviously climate change is an exception, but yeah, I don't exactly see a shortage of studies where the underlying experiment could have been replicated, but wasn't.

Mental Wimp--

A good scientist should always be agnostic about what the result will be and design the experiment to fail if the theory isn't true. Beyond that, what you believe is immaterial to the process.

You're coming dangerously close to a No True Scotsman fallacy. Experiments should be agnostically designed, but hypothesi never are. After all, a randomly selected hypothesis is likely to be false, and false in a way that does not illuminate the subject. The problem is that reproducibility is not being seen as a confirmation of the original hypothesis; it's being seen as a new hypothesis ("this previous experiment is or isn't repeatable").

An unreproduced experiment is, in a very real way, a sample size of one.
posted by effugas at 5:24 PM on August 17, 2012

s/duplicative/duplicatable/
posted by effugas at 5:25 PM on August 17, 2012

Experiments should be agnostically designed, but hypothesi never are.

The plural is 'hypotheses'. Sorry, but 'hypothesi' is making me twitch.
posted by hoyland at 5:57 PM on August 17, 2012 [1 favorite]

hoyland-- oops :) Thanks for the correction.
posted by effugas at 12:20 AM on August 18, 2012

What doesn't happen nearly enough is that you believe me and you try to reproduce my results anyway, just to be sure.

If you strike "just to be sure" and replace it with "as a foundation for some of your own work", that happens all the time. I read a lot of journal articles in my time in big pharma, occasionally out of curiosity but mostly because I was looking for insight on a problem of my own.
posted by Kid Charlemagne at 11:42 AM on August 22, 2012 [2 favorites]

This thread has been archived and is closed to new comments

MetaFilter
