There is no cost to getting things wrong
October 22, 2013 2:17 PM   Subscribe

Trouble at the lab: Scientists like to think of science as self-correcting. To an alarming degree, it is not
"Academic scientists readily acknowledge that they often get things wrong. But they also hold fast to the idea that these errors get corrected over time as other scientists try to take the work further. Evidence that many more dodgy results are published than are subsequently corrected or withdrawn calls that much-vaunted capacity for self-correction into question. There are errors in a lot more of the scientific papers being published, written about and acted on than anyone would normally suppose, or like to think."
posted by andoatnp (60 comments total) 34 users marked this as a favorite
 
A response.
posted by Artw at 2:24 PM on October 22, 2013 [11 favorites]


I'm so utterly unsurprised to see The Economist discuss academic work in terms which suggest that academics are mostly frauds whose outrageous folly isn't costing us enough. Because we're a bunch of free-loaders living the good life on taxpayer dollars while publishing nonsense.

I mean, it's as if scientists are taken seriously, paid well and rewarded with wide deference from policy-makers despite being wrong about all of the most important facts. Oh, wait, I'm describing economics!
posted by clockzero at 2:38 PM on October 22, 2013 [43 favorites]


Welp, time to abandon science then.
posted by cmoj at 2:38 PM on October 22, 2013 [7 favorites]


You really have to wonder why the Economist has such a high-brow reputation, given that they endorsed Bush in 2000. Talk about "no cost to getting things wrong."
posted by artichoke_enthusiast at 2:40 PM on October 22, 2013 [17 favorites]


On the one hand, there is a lot of solid discussion in this article of statistical issues that make inference about scientific problems very difficult, and deserved praise for things like the Reproducibility Initiative that aim to improve obvious problems.

On the other hand, the piece doesn't even try to make the case that science is not self-correcting, relative to other processes that claim to produce knowledge about the world. To paraphrase the old saying, it's the worst except for all the others.
posted by escabeche at 2:45 PM on October 22, 2013 [24 favorites]


Science is a tool, not a solution. People who write about science as if it is supposed to be an answer instead of a process, or better somehow than the people who use it, infuriate me. Also infuriating me today: people who believe that things happening "over time" means things happening "in their lifetimes."
posted by It's Raining Florence Henderson at 3:05 PM on October 22, 2013 [3 favorites]


This is such clumsy, ham-fisted, ignorant reportage, aside from the loathsome implications of its initial claims and phrasing. For instance:

The idea that the same experiments always get the same results, no matter who performs them, is one of the cornerstones of science’s claim to objective truth.

That trait is called external reliability, and it is a feature of good empirical work, not a claim that "science" itself makes about its value in society.

In a larger sense, though, the thesis is incoherent because irreproducibility is itself a finding, and it demonstrates that the original findings may be false, which is something worth knowing. This is a part of what science is supposed to do. If we find lots of irreproducibility, then we know that we must continue evaluating claims carefully.

In the production of knowledge, extensive education, peer review and deliberative consideration of claims work very well, and that is true in part because nothing else is a good alternative. They work much better than any other program that has been attempted in empirical investigation. If there are lots of irreproducible findings in a universal and systemic sense, it's more likely due to neoliberal market pressures (of which The Economist is an enthusiastic handmaiden, let's not forget) distorting research and career advancement processes than to fundamental flaws in the incentivization structure of the profession, to put it in terms the Economist's writers should be able to understand. Their agenda fully emerges at the end:

And scientists themselves, Dr Alberts insisted, “need to develop a value system where simply moving on from one’s mistakes without publicly acknowledging them severely damages, rather than protects, a scientific reputation.” This will not be easy. But if science is to stay on its tracks, and be worthy of the trust so widely invested in it, it may be necessary.

Dr. Alberts is not quite right; failing to acknowledge mistakes should certainly be discouraged, as his specific recommendations suggest, but the ultimate goal is a knowledge-production system that gives us reliable facts and findings, not an academic value system that "damages" the right reputations. The conclusion itself reads like neoliberal propaganda about a manufactured need to condescendingly hold scientists accountable to an unreasonable and irrational standard in return for maintaining funding levels. People should not be punished for making mistakes in research; that's utterly outrageous and contrary to the basic idea of the scientific process. They should be punished for lying or misrepresenting, but not for having been proven wrong. That regime would stop serious review completely. It's only when scientists and researchers aren't finding lots of mistakes that we should conclude that something is wrong.
posted by clockzero at 3:08 PM on October 22, 2013 [13 favorites]


Here's a nice example of where it was self correcting recently: http://narrative.ly/pieces-of-mind/nick-brown-smelled-bull/
posted by Joe Chip at 3:08 PM on October 22, 2013 [1 favorite]


The Economist isn't a very good source of analysis, but the basic issues and questions surrounding the way science is actually practiced are interesting and relevant. There exists thoughtful and valid criticism (some written by scientists themselves) of how science as an industry is not free of problematic aspects. Practicing scientists do not generally spend their time working out these meta-concerns, but the epistemological stance that modern science culture seems to adopt—e.g., what really is meant by "self-correcting", and at what scale?—makes these conversations all the more important.
posted by polymodus at 3:29 PM on October 22, 2013 [6 favorites]



I'm so utterly unsurprised to see The Economist discuss academic work in terms which suggest that academics are mostly frauds whose outrageous folly isn't costing us enough. Because we're a bunch of free-loaders living the good life on taxpayer dollars while publishing nonsense.


Note the article concentrates on corporate science. Give The Economist some credit and the Devil his due.
posted by ocschwar at 3:31 PM on October 22, 2013


This fits with another line of evidence suggesting that a lot of scientific research is poorly thought through, or executed, or both. The peer-reviewers at a journal like Nature provide editors with opinions on a paper’s novelty and significance as well as its shortcomings. But some new journals—PLoS One, published by the not-for-profit Public Library of Science, was the pioneer—make a point of being less picky. These “minimal-threshold” journals, which are online-only, seek to publish as much science as possible, rather than to pick out the best. They thus ask their peer reviewers only if a paper is methodologically sound. Remarkably, almost half the submissions to PLoS One are rejected for failing to clear that seemingly low bar.
Oddly enough, given a random paper in PLoS One and a random paper from Nature, I'd be slightly more inclined to trust what's going on in the PLoS One paper, as there is less incentive for fraud.

I think the fundamental flaw of this piece is that it treats scientific manuscripts as little parcels of truth. I don't know any scientists who treat them that way; everybody I know treats them as hints at the truth. So when doing background research, one goes and looks at what's out there, and if you see a trend of repeated themes from different research groups, then that is getting towards "truth". And if you see other research groups coming to similar conclusions as your own when you write a paper, you cite them both to bolster your case and to connect their paper to yours, leaving a trail for future researchers.

And I was also always taught that the conclusions in papers are not what's valuable. In most scientific papers what's valuable is the data itself. Interpretations, which are what get treated as the revealed "truth" from a science paper, will come and go as we come up with new theories and reinterpret what the data is telling us. But in the end, it's only the data that is likely to make a lasting contribution. This is also less important in fields where most of the interesting hypotheses have been wrung to their death, and there are only a few bits left to fill in of a nearly complete picture. But I tend not to be too interested in nearly finished fields; to me it's far more interesting to work on the unknown.

So supposing that, say, 80% of papers are "false" in that the conclusions they draw are completely inappropriate, and will never be validated or repeated. In such a situation, we can start to use citations to find out if others found them useful or replicated their results and added to them. If a paper is in a high-profile journal like Nature it may get cited simply because the paper's publicity influenced the way people think in the field. If a paper is in PLoS One and gets cited, it's most likely because others are coming to similar conclusions and the PLoS One paper was found through a search of some sort. These citations act as a second level of review, a 'field' review on top of peer review at publication time. A wrong, uninteresting paper will get no citations. A paper with a fair number of citations is either right or interesting, or perhaps even that rare combination of right and interesting.

And this is how it actually works in science. When you come into a new area, you start with reading review articles, where somebody familiar with the research area has actually done a lot of the work of looking through many papers and finding the recurring themes. From there, you can get oriented enough to start finding hints of truth elsewhere. This is not like high school physics, where there are clearly correct and clearly wrong answers; it's actual, real science, which is all about feeling around in the dark and turning things over until you've figured out what it is that's in your hands. We are working on trying to see, and while (at least in biology) there's often a tempting "conclusion" figure that serves as a model upon which to drape bits of data, and as a hypothesis for future research, it's probably not the "truth." At least it's not the truth until your press department gets their hands on it...

Compare a body of scientific literature where 80% of things are false but data gets published within a few years to one where 5% of things are false but data only gets published within a few decades, and the 80%-false literature is far, far more useful, and accelerates discovery far more. It's also on a human-sized time scale, one that lets people get acknowledged for the work they have performed.

On the positive side of this essay, it's important for people to understand that peer-review in a respectable journal is not an adequate truth filter. But overall, this essay rates a solid C. Lots of interesting data points, lots of good discussion (I'm a huge fan of Ioannidis, but the sky-is-falling schtick for getting attention is getting old), but mostly unsupported conclusions and link bait.
posted by Llama-Lime at 3:35 PM on October 22, 2013 [22 favorites]


Here's a nice example of where it was self correcting recently: http://narrative.ly/pieces-of-mind/nick-brown-smelled-bull/

This is an excellent example of how false ideas get embedded in the hierarchy of a discipline and become particularly difficult to remove. The role of editors, journals, and orthodoxy are all nicely illustrated in the efforts of three outsiders to uncover an at best mistaken and most likely fraudulent theory that enamored a lot of psychologists. My impression is that this is not an isolated example in psychology or in many other scientific areas.
posted by Mental Wimp at 3:44 PM on October 22, 2013 [2 favorites]


Llama-Lime's comment is important. Papers are units of the ongoing conversation in a particular field, and it's the direction of this conversation that hopefully tends towards the right answer, not the individual papers themselves. A lot of non-scientists, even those writing articles about how science should behave, tend not to realize the fluidity of that conversation.
posted by kiltedtaco at 3:54 PM on October 22, 2013 [5 favorites]


Alhazen is dead!
posted by Apocryphon at 4:04 PM on October 22, 2013


I think the fundamental flaw of this piece is that it treats scientific manuscripts as little parcels of truth. I don't know any scientists who treat them that way; everybody I know treats them as hints at the truth.

This is a great point that I think gets missed a lot. You see it even in "smart" discussion forums (like here) where people get a hardon for Peer Review and tend to uncritically parrot anything from PubMed that supports their argument. Whereas in my office we generally assume any single paper is probably wrong and are much more interested in the "how" and the "why" until a proper review paper is written to sort everything out. It can be hard to convince people that contradictory results are not a fundamental flaw in the scientific method, though.

it's important for people to understand that peer-review in a respectable journal is not an adequate truth filter.

Agreed. I like to think of peer review as a minimum threshold rather than a stamp of approval from the scientific community.
posted by no regrets, coyote at 4:08 PM on October 22, 2013 [8 favorites]


This kind of essay is the direct result of being raised to believe the universe owes you absolute truths. That you are owed the truth, or the belief that some single truth even exists, is a bizarre and dangerous kind of privilege common to the financially comfortable and people who tend to stay awake at night worrying. What's more, it's a gateway belief, and many go on to experiment with more dangerous beliefs like the One Right Way, Common Sense Solutions, Market Driven Solutions or even the existence of a benevolent and all-knowing God who will take care of you if you're good. And finally the belief that there are humans on this planet who know more about these absolute truths than you do and that you should follow them based on faith.

As a scientist I live in a world where things are unknowably complex and there are no final answers to anything. I know I'll die long before we figure out half the stuff I work on, and then it'll turn out there was more to it anyway and my contributions will become unimportant or forgotten. That may sound scary or pointless compared to seeking well-ordered and unimpeachable truths, but since those don't exist, it's what we've got.

Besides, economists can talk. At least my graphs have units on the axis.
posted by fshgrl at 4:29 PM on October 22, 2013 [18 favorites]


Oh man, the final nail into the coffin of science! It's been a long time coming, and we've all been rooting for it, and finally The Economist has saved the day!

Go to business school and eat more refined carbohydrates sheeple!

How come there's no author attached to that article? Is that a done thing?
posted by turbid dahlia at 4:30 PM on October 22, 2013 [3 favorites]


How come there's no author attached to that article? Is that a done thing?

The Economist never lists authors.
posted by dirigibleman at 4:37 PM on October 22, 2013


Interesting. Very...interesting.
posted by turbid dahlia at 4:39 PM on October 22, 2013


It can be hard to convince people that contradictory results are not a fundamental flaw in the scientific method, though.

Thinking about this a little bit, I can see where a lot of the confusion comes from. From elementary school up through high school, students are told about this thing called "the scientific method", where everything starts with a "hypothesis", then an experiment happens, and then you know what the right answer is. We basically straight-up tell them this is how science works, but this is just so far removed from how science actually operates that it seems more like a fairy tale. I guess it's just hard to convey that even with very smart people working very hard, we spend a ton of time in various long-term misunderstandings of the problem we're working on, many of which get published and are actually important contributions to eventually remedying that misunderstanding. And this process is actually the best way to get to the right answer! This process is not just antithetical to the "scientific method", it's antithetical to the picture we give in math and science classes where here's a problem (a weird derivative, a chemical reaction, a mechanics problem), here's the tool that solves that problem. Turns out the business of science is much more: here's a problem, you don't know it yet but that's not even the right problem, nobody's ever had that problem before so you can't ask anyone, and you have to invent the tool that solves that problem because that's the whole point.



At least my graphs have units on the axis.
UGHHH. THIS. A THOUSAND TIMES THIS.
posted by kiltedtaco at 5:18 PM on October 22, 2013 [7 favorites]


So supposing that, say, 80% of papers are "false" in that the conclusions they draw are completely inappropriate, and will never be validated or repeated. In such a situation, we can start to use citations to find out if others found them useful or replicated their results and added to them...These citations act as a second level of review, a 'field' review on top of peer review at publication time. A wrong, uninteresting paper will get no citations. A paper with a fair number of citations is either right or interesting, or perhaps even that rare combination of right and interesting.

But... isn't that the point of stuff like the Amgen study, mentioned at the top of the Economist piece? The papers they tried to replicate are described as "landmark studies of the basic science of cancer" --- my assumption would be that this means they were widely cited? And yet only 6 were reproducible. A red herring doesn't stink less because three bloodhounds follow its trail instead of one.
posted by Diablevert at 5:22 PM on October 22, 2013


Weird that so many folks seem to be attacking the source on this. We dealt with many of the statistical claims here.
posted by klangklangston at 5:30 PM on October 22, 2013 [2 favorites]


This kind of essay is the direct result of being raised to believe the universe owes you absolute truths. That you are owed the truth, or the belief that some single truth even exists, is a bizarre and dangerous kind of privilege common to the financially comfortable and people who tend to stay awake at night worrying.

And here I just thought that it'd be nice to be sure that this $20,000 a pop experimental cancer treatment is actually more effective than sugar pills.
posted by Diablevert at 5:30 PM on October 22, 2013 [1 favorite]


Diablevert, it's worth noting that the institutions that invested the time/effort/money to try to fully replicate those landmark experiments are, in fact, pharma companies.

One of the things I learned pretty early on when I made the jump from academia to pharma was that the pressure to get the science right was orders of magnitude higher in pharma, because people's health and also major business outcomes depend on it. In other words, in pharma, shit gets real, and the whole enterprise can rise or fall based on how well the science is done.
posted by Sublimity at 5:44 PM on October 22, 2013 [2 favorites]




Diablevert, it's worth noting that the institutions that invested the time/effort/money to try to fully replicate those landmark experiments are, in fact, pharma companies.

Okay... but what's worth noting about it? Like, is it your feeling that the incentives for pharma companies to do this stuff are sufficient that the problems discussed by the Economist are not in fact problems? Or is it that pharma companies aren't really objective arbiters of pure research and so they're unsuited for this role? Either way....53 landmark studies, 6 replicable. 'Cause after a little googling, it looks like the Amgen study came out last year? So...did these incentives to verify only kick in in 2012? Are there a bunch of other "landmark" studies which are similarly unreliable? Are 88% of all "landmark" studies irreplicable? Is anybody checking on that? Because it seems like the answer might be no, because it costs a lot to do?

Sorry for all the questions, I'm beginning to sound like a Valley girl. It's just... I saw this Economist story last week, and I was dead curious to see what some actual scientists would say in a discussion of it on the blue. I had anticipated, wrongly, as it turns out, that a discussion of the points raised by the piece might turn on some of the questions I just mentioned above. Instead... well, by a rough count, in a thread of twenty-odd comments there are about a half dozen ad homs on the Economist, another half dozen complaints that laypeople don't understand science and shouldn't comment on it, and one or two suggesting that the self-checking process is fine and working exactly as intended, nothing to see here, move along. Plus one or two that concede the article might have a point or two.

But: 53 landmark studies. That seems like a real problem. Why isn't it a real problem, if it isn't? And if it is, what's being done to fix it? Because while on a philosophical level I don't mind accepting a bit of existential angst over the ultimate unknowability of our lives and fates, I'd still prefer that the drugs I take and the social policies that my politicians and educators and insurance companies are trying out on me were based on actual facts...
posted by Diablevert at 6:22 PM on October 22, 2013 [4 favorites]


To paraphrase Bjarne Stroustrup:

Science is a human activity; forget that and all is lost.
posted by and for no one at 6:24 PM on October 22, 2013 [1 favorite]


But... isn't that the point of stuff like the Amgen study, mentioned at the top of the Economist piece? The papers they tried to replicate are described as "landmark studies of the basic science of cancer" --- my assumption would be that this means they were widely cited? And yet only 6 were reproducible. A red herring doesn't stink less because three bloodhounds follow its trail instead of one.
First off, I was trying to be extremely clear about the distinction between interesting and correct, and how either one could result in citation. It turns out that the Amgen scientists were only testing the interesting papers, as that is what they considered to be "landmark," because they changed the realm of plausible scientific hypotheses. Further, many of these "landmark" studies are landmark only in the view of the Amgen scientists, because some had citation counts that were well below the average in the journal in which they were published.

But more importantly, the Amgen editorial is not a study, as it does not identify which papers they attempted to replicate, or even precisely what they meant by "replicate," which means different things to different people. The editorial is just a general wish that things should be different than the way they panned out, and that's great fodder for an editorial. But if you're going to try to turn an editorial into "data" or even, god forbid, "truth," you have to very carefully find out what the Amgen scientists did and what they found. In particular, this sentence:
Additional models were also used in the validation, because to drive a drug-development programme it is essential that findings are sufficiently robust and applicable beyond the one narrow experimental model that may have been enough for publication.
indicates a perfect symptom of treating papers as isolated found truths. This is not an attempt at replication; it's an attempt at generalization. And there is no particular reason to believe that a cancer discovery in one model will generalize to many more cell lines beyond the ones tested, or the particular model systems. Because we've discovered in the past years that one of the major, defining aspects of cancer is that it's not one thing, it's thousands upon thousands of things. So how one model system operates is not relevant for all human cancers, but it's probably relevant for some fraction of them.

The Amgen scientists want cancer to be one thing, because then when they design a drug they can hit all cancers rather than the small percentage where that particular finding was crucial. And that's what they tested for. But that's almost certainly not what those papers were concluding, they were at best hinting and waggling their eyebrows about enticing new possibilities. Interesting possibilities. And in the time period where Amgen performed follow-on studies to these 53 studies, they can't be blamed for thinking that that's a reasonable bar for reproducibility, because uniformity of cancers was certainly a prevailing hypothesis. Do not needlessly multiply entities, and all that.

So from the pharma point of view, they want to know not only the existence of a particular cancer mechanism, they want to know prevalence. And that's a great thing to want, and most labs would want their discoveries to come prepackaged ready to turn into drugs that help people as soon as possible. However, it's far beyond the financial and human-power means of most labs, and most labs are aiming to advance discovery, not provide direct fodder for commercialization and application.

But all of this meaning is lost when translated for the press. Instead, we get "study finds 89% of cancer research is false." Incidentally, the editorial cites a Bayer report of only pursuing targets from papers 25% of the time, a higher percentage than Amgen found, and they didn't even call it 'reproducibility.' In that other 75% there are a multitude of reasons to end up in Bayer's trash bin: 0) trash science, e.g. mislabeled tubes and samples, hung-over grad students, fraud, etc.; 1) low prevalence of mechanism; 2) the strongest effects are the ones that get reported but are the least likely to be reproducible (they call it the winner's curse in GWAS); 3) subtly different conditions that we don't yet know about (humidity, critical temperatures, timing, epigenetic switches that change behavior) but will discover in some years are critical parts of the system; 4) etc. Really, only 0) is a reason for not publishing the paper, and I would doubt that it's a significant fraction of Bayer's 75% or Amgen's 89%. And many "landmark" papers that are not generally true are required in order to get the right hypotheses in focus. Of course, this is only my personal most plausible interpretation of the limited data that the Amgen scientists presented. It could be that everybody's just winging it at an 89% chance, or that it's a statistical property of the nature of science, but given what we know now, and the "one true drug" attitude of pharma, misinterpretation of cancer heterogeneity seems far more plausible. But we won't know unless Amgen spills the beans.
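
For anyone who hasn't run into the winner's curse before, here's a throwaway sketch with invented numbers (nothing to do with Amgen, Bayer, or any actual GWAS): pick the most impressive-looking results out of thousands of noisy estimates, and their independent follow-ups will, on average, shrink right back toward the modest truth.

    # Toy sketch of the winner's curse: selecting the largest observed effects
    # guarantees their follow-up estimates look disappointing. Invented numbers.
    import numpy as np

    rng = np.random.default_rng(0)
    n_candidates = 10_000
    true_effect = rng.normal(0.0, 0.5, n_candidates)   # most true effects are modest
    noise_sd = 1.0                                      # noisy single-study estimates

    discovery   = true_effect + rng.normal(0, noise_sd, n_candidates)
    replication = true_effect + rng.normal(0, noise_sd, n_candidates)  # independent repeat

    winners = np.argsort(discovery)[-100:]              # the 100 most impressive "findings"
    print("mean discovery estimate  :", round(discovery[winners].mean(), 2))    # inflated
    print("mean replication estimate:", round(replication[winners].mean(), 2))  # shrinks back
    print("mean true effect         :", round(true_effect[winners].mean(), 2))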

So to cite the Amgen editorial as evidence that there are problems seems to get it somewhat wrong. Amgen wants the field to set the bar for publication at "this is valid for a great percentage of cancers" rather than "this is what I did and it worked in my model system that I have at hand." That is a point of view that deserves serious consideration, but it's a far different viewpoint from saying that there's "Trouble at the lab" or that there's rampant misconduct.
posted by Llama-Lime at 6:26 PM on October 22, 2013 [14 favorites]


That was an awful lot of word salad to make a simple point: Amgen defines "reproducibility" as "able to drive a drug program." In contrast, cancer researchers use the simpler and lower bar that most scientists follow: if you use the same protocols, you will get the same result.

Therefore, the 53 papers editorial is more a comment on the utility of a random paper for a pharma, and a plea to make cancer research more useful for pharma, and should not be construed as a claim that 47 of 53 "landmark" papers are useless.
posted by Llama-Lime at 6:44 PM on October 22, 2013 [8 favorites]


Whoa, that's one hell of a catch to embed in a table caption!

To address the comment "laypeople don't understand science", Llama-Lime's comment does a good job of reminding me that particular conversations in the literature can be just as easily misunderstood by scientists who aren't active in that field (uh, like me) as they can be misunderstood by "laypeople". It's pretty fascinating to see how other fields operate.
posted by kiltedtaco at 6:58 PM on October 22, 2013


The Economist's eponymous and famously dismal science has quite a history of failed experiments whose replication would be unwise.
posted by islander at 7:02 PM on October 22, 2013


So to cite the Amgen editorial as evidence that there are problems seems to get it somewhat wrong. Amgen wants the field to set the bar for publication at "this is valid for a great percentage of cancers" rather than "this is what I did and it worked in my model system that I have at hand." That is a point of view that deserves serious consideration, but it's a far different viewpoint from saying that there's "Trouble at the lab" or that there's rampant misconduct.

I don't think the Economist piece asserts that the core problem is "rampant misconduct" -- they mention it as a possible unknown contributing factor, citing that one study that said a quarter of scientists know other scientists who they think fudge their data. From my reading, the core issue they point to seems to be that there are tons of incentives to come up with an unlikely hypothesis that generates a novel result and get it published, and not very many incentives --- in fact some substantial disincentives --- to go back and check those results. (If any psychologist ever does figure out how priming works, maybe they can figure out a way to prevent otherwise intelligent people from reading an entire article and ignoring all its nuances because they disagree with a glib headline.... headline editors cause more trouble in this vale of tears.)

But back to the bigger point...I mean....saying "My hypothesis wasn't wrong. My hypothesis was right in certain very narrow experimental conditions which prevail in my lab and may have nothing to do with the real world," seems, to this layperson, to be close enough to wrong for government work. I get the distinction you're making. But to me that reinforces the suggestion that this might be a systemic problem with the incentive structures in the way we currently do science...Cancer is a really, really hard nut to crack. So are a lot of the problems we're interested in at the frontier, it seems to me now. Very complicated problems with multiple causes where it's tough to pull out a single strand and point to it. So should the threshold for publication in some of the most prestigious journals in the field really be merely "an interesting result"? Shouldn't it be stronger? And if the way labs are set up and the way science is funded makes it too difficult for people to wait around until they have something stronger, isn't that a problem?

I mean, it's not that I think all scientists should really be engineers, that there isn't a need for pure research. But surely coming up with the right answer to a question ought to be more greatly valued than merely coming up with an interesting question....The point is to find a new true thing about the world.
posted by Diablevert at 7:08 PM on October 22, 2013


At first, I was outraged by the title of this week's Economist: "How Science Goes Wrong?" Not another reactionary piece against science that conflicts with someone's political beliefs. After calming down and reading the article, I realized the point they were trying to make was not about science per se, but about meta-science -- science as a social construct. How science is funded, researchers promoted, etc.

The mechanism by which science is funded and conducted has changed dramatically over the past 150 years from a group of rich men writing letters to each other to a professional class funded by one of a handful of governmental or corporate organizations and whose careers are contingent upon publication. The current cycle of publication and grants has been in place only for the past 60 years, and the Economist, as believers in the "market economy", rightly asks: do the incentives for scientists produce the optimum amounts of science?

The Economist's article points to well known problems: the tough competition for jobs in academia (ask me, I just got a PhD), the obligation to "publish or perish", and the bias that can creep in by trying to publish groundbreaking results. Their conclusions are the same ones many scientists have been trying to advance for years:
  • Publish negative results so meta-studies are not biased (e.g., combat the File Drawer effect; a rough, made-up simulation of that bias follows this list)
  • Encourage replication of results. Their point is science is only self-correcting if someone goes and checks. If the system does not reward scientists for replicating others' work, wrong knowledge will persist in the literature.
  • Encourage "normal science" -- science working within the paradigm to map out the unknown. Not all research needs to be groundbreaking. If the blind men feeling out the elephant keep publishing papers breathlessly declaring it a "snake" and then a "tree", they lose the opportunity to compare their raw findings and discover the elephant.
More radical ideas like post-publication review could be quite liberating. I was dropped into a field without a guide as a young graduate student and waded through mounds of bad, wrong, or just silly results. Having some hint as to the quality of a paper, even if occasionally wrong, would have been incredibly helpful. (And no, citation count is not a good measure of quality.)

As the Economist aims squarely at politicians and executives (look at their ads), this article could be of some use, despite some of the mischaracterizations of the scientific process Llama-Lime points out, because their conclusions point in the directions in which meta-science should change.


Incentivize that scientist!
posted by eigenman at 7:11 PM on October 22, 2013 [2 favorites]


The Nature article re Amgen's "findings" has been discussed on Metafilter before. Note that the published article now has a Clarification nailed to it that is relevant to the present discussion. Basically, it is anecdotal because the researchers could not release the data to those that asked for it. In other words, their own research failed the reproducibility test.
posted by dmayhood at 8:36 PM on October 22, 2013 [1 favorite]


The Nature article re Amgen's "findings" has been discussed on Metafilter before. Note that the published article now has a Clarification nailed to it that is relevant to the present discussion. Basically, it is anecdotal because the researchers could not release the data to those that asked for it. In other words, their own research failed the reproducibility test.

The clarification states that many of the researchers made the Amgen guys sign confidentiality agreements before letting them have a peek at the data, a problem the Economist article also talks about --- that reproducibility is tough to achieve because people often don't document experiments thoroughly and it's tough to get access to the raw original data. The Amgen guys could be wrong, of course. But even if one can't get a list of the exact same 53 studies they looked at, it seems to me like you could do a parallel study of your own "landmark" set.... has anything like this been done anywhere?
posted by Diablevert at 9:00 PM on October 22, 2013


An interesting point that would add to the arguments from the Economist and elsewhere is what psychologist Paul Meehl called the "crud factor", also known as ambient correlational noise.
http://www.tc.umn.edu/~pemeehl/144WhySummaries.pdf

While analyzing an extremely large data set (extensive demographic data of every high school student in Minnesota, or something along those lines), Meehl noticed that, in addition to the expected correlations between variables (e.g., high school GPA and household income), there were correlations among variables with no clear relation. The latter were statistically significant (this is expected and almost inevitable with a large sample size) and in many cases had non-trivial effect sizes.

Meehl provides a strong argument for recognizing, and where possible adjusting for, this "crud factor": relatively small but non-trivial correlations between variables, resulting from obscure multivariate causal chains.
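
Here's a quick synthetic sketch of that "almost inevitable with a large sample size" point (toy data, not Meehl's Minnesota records): two variables that are linked only faintly through a shared background factor typically look like nothing in a study-sized sample, but show up as overwhelmingly "significant" crud at census scale.

    # Synthetic crud: a faint shared background influence plus a huge sample.
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(42)
    n = 1_000_000                               # census-scale sample
    background = rng.normal(size=n)             # stand-in for a web of obscure shared causes
    x = 0.1 * background + rng.normal(size=n)   # two variables with "no clear relation"
    y = 0.1 * background + rng.normal(size=n)

    r_small, p_small = pearsonr(x[:500], y[:500])   # a typical study-sized sample
    r_big, p_big = pearsonr(x, y)                   # the full data set

    print(f"n=500:       r = {r_small:+.3f}, p = {p_small:.2f}")  # usually indistinguishable from zero
    print(f"n=1,000,000: r = {r_big:+.4f}, p = {p_big:.1e}")      # r ~ 0.01, p vanishingly small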
posted by patrickdbyers at 10:26 PM on October 22, 2013


The Economist's article points to well known problems: the tough competition for jobs in academia (ask me, I just got a PhD), the obligation to "publish or perish", and the bias that can creep in by trying to publish groundbreaking results. Their conclusions are the same ones many scientists have been trying to advance for years:
Publish negative results so meta-studies are not biased (e.g., combat the File Drawer effect)
Encourage replication of results. Their point is science is only self-correcting if someone goes and checks. If the system does not reward scientists for replicating others' work, wrong knowledge will persist in the literature.
Encourage "normal science" -- science working within the paradigm to map out the unknown. Not all research needs to be groundbreaking. If the blind men feeling out the elephant keep publishing papers breathlessly declaring it a "snake" and then a "tree", they lose the opportunity to compare their raw findings and discover the elephant.


Exactly, the point isn't to justify or defend the way the scientific enterprise is run now, which upon consideration would be a conservative stance to take; it's to ask in what ways scientific work could be done better, i.e., through processes and structures that better reflect science's own understanding of its principles or ideals. And that's a more open-ended and harder question to answer—many ideas, none easy to put into practice.
posted by polymodus at 10:46 PM on October 22, 2013


I can see where a lot of the confusion comes from. From elementary school up through high school, students are told about this thing called "the scientific method", where everything starts with a "hypothesis", then an experiment happens, and then you know what the right answer is.

I've long advocated putting Kuhn on the grade 6 curriculum, but for some reason I keep getting pushback.
posted by no regrets, coyote at 11:08 PM on October 22, 2013 [4 favorites]


So supposing that, say, 80% of papers are "false" in that the conclusions they draw are completely inappropriate, and will never be validated or repeated.

Supposing? This actually happened.

If 80% of the things you bought from eBay never arrived, you'd stop shopping at eBay.

There's a reproducibility crisis going on. Hating on the Economist doesn't obscure that fact.
posted by effugas at 11:15 PM on October 22, 2013


No, it didn't actually happen (for reference, please see this thread). There may be a reproducibility crisis, but the Economist doesn't provide any evidence, and neither does that Reuters article.

It's absolutely hilarious that a brief provocative editorial, summarized by newspaper articles, repeated in a game of telephone, is absorbed as truth and fact, when in fact nobody actually knows what was performed. And the actual editorial hides a pretty blatant redefinition of "reproducible" into "prepackaged commercial viability."

Please let that sink in. Instead of relying on science, you're relying on third-hand hearsay.
posted by Llama-Lime at 11:27 PM on October 22, 2013 [2 favorites]


Llama,

Did you see the recent example of the story where Oreos were found to be as addictive as cocaine? Truly dreadful science. You have to ask what pressures exist that would cause such out-and-out crap to make major news, and honestly admit those pressures are not isolated.

There's a lot of fantastic work being done. No question. But there are issues and they're going to need to be addressed head on.
posted by effugas at 12:09 AM on October 23, 2013


It's some serious goalpost shifting from a reproducibility crisis, but I'll play along. Of course I heard about Oreos and cocaine, it's all that any scientist can talk about these days! In other words, no, I haven't heard of that, don't be ridiculous. I wouldn't have heard of it because it's only tangentially related to science.

Google shows a blog post which references a university press release. It appears to be the project of two undergraduate students, and has not been published in a journal, and the press release couches them as preliminary results. So, where's the dreadful science? Was this accepted by a scientific community without proper review? Did the undergraduate students falsify their data? Did they accidentally sprinkle cocaine on the oreos? What are the science issues that need to be addressed head on?

The crisis is the same as it always has been: popular news reporting crisis. The issue is that what gets reported as science bears no relation to what science actually is.

This Economist article is a serious step up from popular news reporting on supposed "science," but IMHO it's still pretty awful. There are serious structural issues in science in terms of how to share data as data gets bigger and more complex, how to make analysis of complex datasets more understandable and better documented, how data generators can get proper attribution and credit, and how to deal with fields that have a dizzyingly large hypothesis space, but the Economist article doesn't hit them very well, and fundamentally misunderstands other issues. So I'll stick by my C assessment; head and shoulders above standard newspapers, but not that great. NYTimes, Ars Technica, and occasionally the WSJ are the only places that I trust to be able to report on science in any manner resembling reality.
posted by Llama-Lime at 12:34 AM on October 23, 2013 [9 favorites]


effugas: "Supposing? This actually happened."

No, it didn't, but let's at least do ourselves the credit of linking to the actual fucking paper.

By not publishing the results of their work, Amgen gets to use their findings to determine which drugs to develop while forcing their competitors to travel blindly, as well as to hide the bullshit that is inevitably associated with the percentages they generated, or at least to appear to do so to their investors. Amgen has a direct financial interest in producing the most dramatic results they can: it makes the money they invested in the project look well spent, and it makes them look like a smart buy to investors, regardless of anything they may or may not have found. We have no idea what they were actually able to show, since they aren't willing to share, but we do know that we have zero business taking what they say at face value.
posted by Blasdelb at 1:01 AM on October 23, 2013 [5 favorites]


But back to the bigger point...I mean....saying "My hypothesis wasn't wrong. My hypothesis was right in certain very narrow experimental conditions which prevail in my lab and may have nothing to do with the real world," seems, to this layperson, to be close enough to wrong for government work. ... So should the threshold for publication in some of the most prestigious journals in the field really be merely "an interesting result"? Shouldn't it be stronger? And if the way labs are set up and the way science is funded makes it too difficult for people to wait around until they have something stronger, isn't that a problem?
At the risk of posting way too much in one thread, I think this deserves a response. There's more to it than "my hypothesis in these narrow conditions"; the fundamental utility of a scientific paper is "my data in these conditions," and if the conditions are too narrow, your paper will be limited to lower-tier journals. Sure, some interpretation accompanies data, but careful readers will draw their own conclusions from the data themselves, only using the provided interpretations as loose guides.

Without having a specific example paper to go on, I would ask, why should a single paper be stronger? In the case of the Amgen papers it would certainly be nice for Amgen scientists, given their fundamental outlook on the world, to be able to pick up a journal and get the latest well-vetted gene targets. But to optimize the system for that case is short-sighted, because Amgen is not the only one that makes use of the data; there are others that are just looking for a better understanding of the system, namely cancer researchers. And basic research should be optimized for understanding the system rather than the narrow uses of any one single downstream consumer.

I want to know about all those strange edge cases that don't "validate" in Amgen's world view, because in my world view they are still valid and useful as long as they adhere to normal reproducibility, i.e. the protocols produce the results and the researchers were honest and careful. I, and others, can make use of all that data as I try to understand what's going on, and it's quite possible that all these narrow contexts are important. For me, I know the narrow contexts are important and I learn from them, because my research paradigm assumes that every tumor is different. If all that research were never published because it didn't "reproduce," in Amgen's limited sense, then the entire field of cancer research would have been held back tremendously, because we'd still be treating cancer as a single disease rather than the many that we know it is.

More generally, the optimization problem is trying to engineer a truth-generation system to 1) maximize the amount of truth generated in a given time frame under the conditions of 2) fixed money inputs, 3) where collecting data is time-consuming and expensive, and 4) there's a lot of flexibility in choosing what data to collect. In that setup, publishing early and often will work out much better than siloing data and releasing it in batches. In the small-paper setting, each decision is made with more information than in the large-paper setting. In Machine Learning terms, science is fundamentally Active Learning, i.e. we use what we know to attempt to maximize the utility of every experiment. Active learning is typically exponentially faster than less directed approaches; when papers are bundled up into bigger packages that are released less frequently, you decrease the base of the exponent over undirected experimentation. I think I just accidentally argued in favor of the minimum publishable unit. Perhaps my contrarian instincts have taken over...
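
To put a number on that "exponentially faster" claim, here's a deliberately silly toy (not a model of any real experiment): pinning down an unknown threshold on [0, 1) to within 0.001 takes about 10 queries when each query is chosen in light of what's already known, versus a couple of thousand undirected ones.

    # Toy comparison: directed (active) queries vs. undirected random queries.
    import numpy as np

    rng = np.random.default_rng(7)
    threshold = rng.uniform()        # the unknown fact of the matter
    tol = 1e-3                       # how precisely we want to pin it down

    def oracle(x):                   # one "experiment": is x past the threshold?
        return x >= threshold

    # Active: each query chosen using everything learned so far (binary search).
    lo, hi, active_queries = 0.0, 1.0, 0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        lo, hi = (lo, mid) if oracle(mid) else (mid, hi)
        active_queries += 1

    # Undirected: queries drawn at random, ignoring previous results.
    lo, hi, random_queries = 0.0, 1.0, 0
    while hi - lo > tol:
        x = rng.uniform()
        if oracle(x):
            hi = min(hi, x)
        else:
            lo = max(lo, x)
        random_queries += 1

    print("actively chosen queries:", active_queries)   # 10
    print("undirected queries     :", random_queries)   # typically a couple of thousand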
posted by Llama-Lime at 1:16 AM on October 23, 2013


"There's a reproducability crisis going on. Hating on the Economist doesn't obscure that fact."

There is a reproducibility crisis in the same way that there was recently a Benghazi crisis, and before that an IRS crisis, and before that a Pigford crisis, and before that a New Black Panthers crisis, and during all of this a black man in the White House crisis, and in the same way that there will always be another bullshit crisis with nothing behind it, because crises sell newspapers to people who don't know what the fuck they're talking about.

There are problems with statistical illiteracy in many fields, but if we're wasting our time on fuckers who know so little about biomedical research as to think PLOS ONE is a low-tier journal where only bullshit is published (even if they do publish some amount of bullshit), then we're never going to get a sense of what those problems really are or what they mean.
posted by Blasdelb at 1:24 AM on October 23, 2013 [7 favorites]


If 80% of the things you bought from eBay never arrived, you'd stop shopping at eBay.

What if eBay were the only place to get those things? You'd be willing to try a few times, making it likely you'd eventually get what you needed, right?

As I said -- I think the reforms people are pushing for in science publishing are important and really valuable. But nothing we do is going to change the fact that it's extremely difficult to figure out how the world works, and it's inevitably going to involve a lot of paths that initially seem promising and draw a lot of resources but then don't pan out.

So should the threshold for publication in some of the most prestigious journals in the field really be merely "an interesting result"? Shouldn't it be stronger?

On the contrary, I think most people, including the editors of PLoS, think it should be weaker. Weak enough to allow lots of replication studies to bloom. Of course, the whole notion of a "threshold for publication" may be changing, too. In math (which is of course different from medicine in tons of relevant ways) there are things you post as preprints and never submit for publication, because they don't clear the threshold of "good enough to publish" but they do clear the threshold of "something people might find useful to have available."
posted by escabeche at 5:34 AM on October 23, 2013 [2 favorites]


I understand why MeFites are lashing back against this article. I myself cringe when I think of it falling into the hands of my fundamentalist Baptist brother-in-law who teaches young-earth pseudoscience to college students in a campus ministries outreach. But when I think about how many books are published by hack journalists mining Google Scholar, and how much that drivel influences the mythologies of our age, arguments like this one encourage me to think more for myself and to have confidence when my reason tells me a popular truth is completely implausible.

But no, I'm not the target audience of The Economist. I can also see how dangerous this can be to the market value of Enlightenment principles.
posted by jwhite1979 at 5:51 AM on October 23, 2013 [1 favorite]


that reproducibility is tough to achieve because people often don't document experiments thoroughly and it's tough to get access to the raw original data.

Maybe it's important to expand on the idea of "reproducibility" a little bit, because I think it's being used in a few different ways. Let's take a model scientific paper:

"We combined ingredients A,B,C in method D to produce a measurement E. This is evidence that X causes Y."

"Reproducibility" is the requirement that A+B+C+D gets you E. Non-reproducibility basically equates with fraud, mistakes, or unknown important factors.

As a scientist, my goal is to understand the last bit, the relation between X and Y. That is the hard part, and it requires that the sum total of results be compatible with that conclusion and that any contradictory results have good explanations for why they are either not valid or irrelevant. I can imagine hundreds of reasons why any one paper that suggests X causes Y might be incorrect in that suggestion even though the underlying work ABCD=E is correct and reproducible. Any non-trivial conclusion really does require many different works backing it up, each approaching the problem differently (as different scientists naturally do) and potentially exposing different sources of systematic errors or misunderstandings in the problem. Individual papers are only pieces of the puzzle.

Almost all news reporting of scientific results boils this down to "Paper proves X causes Y", or if you're lucky they include some weasel words like "Paper probably proves X causes Y". Similarly it sounds like Amgen wants papers that decisively conclude "X fixes cancer", and any lesser statement is uninteresting to it. Now if that's your standard of reproducibility, then of course you're going to see lots of "wrong" papers, because most ABCD=E results are only chipping away at a problem which just cannot be conclusively "solved" in a single work. If we could write papers that prove X causes Y, we would! It would be great! But it just doesn't work that way, which is why we make statements like "nature doesn't owe us the truth".

This is why arguments about the "publication threshold" aren't very persuasive, because researchers in the field actually do need to see and evaluate all of these ABCD=E results before they accept X causes Y. The public's willingness to believe individual papers is a problem with how we communicate the business of science, and adapting the business of science so that the process better fits newspaper headlines would be entirely detrimental to the endeavor.
posted by kiltedtaco at 8:47 AM on October 23, 2013 [2 favorites]


Freeman Dyson has said that "Science is the sum total of a great multitude of mysteries. It is an unending argument between a great multitude of voices. It resembles Wikipedia much more than it resembles the Encyclopaedia Britannica."

Science is a human activity. Humans make mistakes and, as House often remarks, we all lie. And yet the invisible mountain of facts and sublime and nuanced theories that arise from this caring, passionate discipline is, in the collective, among humanity's signature achievements. The changes brought to the quality of human existence by science - for one example - are innumerable and, while often under-appreciated, have granted far greater lifespans and leisure time to many of us.

Andrew Dickson White, the co-founder of Cornell, aptly pointed out that "Franklin’s lightning-rod did what exorcisms, and holy water, and processions, and the Agnus Dei, and the ringing of church bells, and the rack, and the burning of witches, had failed to do: protect from frequent injuries by lightning." The advances in communication that make this board possible are entirely due to centuries of scientific effort - not unexamined belief structures.

Each of us is born a natural scientist. Einstein said: "It is almost a miracle that modern teaching methods have not yet entirely strangled the holy curiosity of inquiry; for what this delicate little plant needs more than anything, besides stimulation, is freedom." In aggregate, science has always been (since Babylon) the natural enemy of the forces that would keep us ignorant and subservient. This little plant cannot be supplanted with ignorance without destroying that which makes us most human.

So ease off.
posted by Twang at 3:02 PM on October 23, 2013 [2 favorites]


People are really defensive about there being problems in science.

This is not an environment that is conducive to fixing problems in science.

My personal feeling is that all grant sizes should be increased by 25-50% to support independent reproduction, and release of all software and data (which isn't free; that stuff needs to be packaged for external use). But even then, I've seen hand-wringing about there then being fewer grants to go around.
posted by effugas at 6:15 PM on October 23, 2013


(I should mention, my background is in computer security, one of the few fields that has deep problems in its fundamental assertions and knows it.)
posted by effugas at 7:47 PM on October 23, 2013


Ain't so easy spending laboratory resources checking others' work if all the grants that pay for said lab resources get awarded only for revolutionary work. You'll only correct bad science by making science less competitive. Ideally, we should increase funding for the sciences, but administrators and policy makers with the NSF, EPSRC, etc. could simply stop making grants so competitive too.

I've repeatedly argued here that modern economic problems are caused by machines taking over useful work, but that the Keynesian solution of inventing make-work like law enforcement, management, administration, finance, etc. is harmful, corrupt, etc., which makes shortening the work week our best long-term solution. There might, however, be another mid-term economic option: throw all that Keynesian government funding towards the sciences. Answering questions in science often enough yields more questions, a need for corroboration, etc., so there is actually something to spend all that money on.
posted by jeffburdges at 2:38 AM on October 24, 2013


effugas: "People are really defensive about there being problems in science.

This is not an environment that is conducive to fixing problems in science.

My personal feeling is that all grant sizes should be increased by 25-50% to support independent reproduction and the release of all software and data (which isn't free; that stuff needs to be packaged for external use). But even then, I've seen hand-wringing about there then being fewer grants to go around.
"
The value that reproducing data has is incredibly dependent on the kind of data that has been generated and the kinds of questions being asked of it. For the vast majority of research grants, reproducing data will only ever conceivably be able to answer questions that are trivial, and it diverts resources away from efforts that would be better spent asking deeper questions with the understanding gained, or looking at the same phenomena from a different perspective - both of which check the assumptions of previous efforts while also increasing knowledge.

Your critique suffers from the misunderstanding that scientific papers are meant to be nuggets of truth for the world to just ideally be able to blindly accept as gospel, imperiled by our imperfections, instead of letters to the scientific community meant simply to report observations of the natural world in a way that is clear about how those observations were obtained and what the author thinks they might mean. The Economist's account of John Ioannidis' various statistical evaluations of science as a whole silently uses the sweeping statistical assumptions he makes as a drunk man might use a lamp post: for support rather than for the very limited amount of illumination they are capable of. Indeed, he is only able to analyze the papers he does because they are each forthright about the statistical tests they use and the levels of significance they obtain, and neuroscience, the discipline he found genuinely troubling fault in, was already in the process of doing a hell of a lot to shape up.

We're being critical of this article because it is a hit piece that repeatedly betrays the kind of undergraduate-level misunderstandings of how science works that one might expect from someone who has never done it, or interacted with it in the kind of meaningful way necessary to have something genuinely worth saying. There are problems with science, but this article hasn't found them.
posted by Blasdelb at 3:16 AM on October 24, 2013 [2 favorites]


It's not so much about increasing grant size as simply employing more scientists and making them more autonomous. Anytime anything really interesting pops up, some other scientists will naturally start exploring it. Imho, you'd want fewer specific project grants, but more large, broad-based laboratory grants across more institutions. In principle, those broader grants allow scientists to be more autonomous and to expend less effort corresponding with the funding agency. Also, restrict institutional overhead dramatically. There is work so suspicious that nobody will waste their time delving further into it, but blogs should still discuss it as "stuff nobody actually believes".
posted by jeffburdges at 6:51 AM on October 24, 2013


"... the misunderstanding that scientific papers are meant to be nuggets of truth for the world to just ideally be able to blindly accept as gospel...

This is exactly the problem we have in our work. I'm not a scientist (I have a B.Sc.), but I work in an area where we deliver "scientific" results to management, based on a rational process with multiple data-quality checks. Overall, our results are pretty good, and they're certainly much more accurate than they need to be given the underlying variability in the data.

Our management makes it clear every year that they don't tolerate errors, and that any updates to the existing results after a report goes out will only lower the credibility of our process, even if accuracy improves.

I think the broad message of the Economist article is what Blasdelb says above, given the Economist's audience of managers and executives: "We want the results to be totally reliable and ageless, because we're unwilling to tolerate any ambiguity, and we won't take responsibility for the results of any decisions based on these results." This is a failure of management, not of science.
posted by sneebler at 6:03 PM on October 24, 2013 [2 favorites]


Blasdelb,

For the vast majority of research grants, reproducing data will only ever conceivably be able to answer questions that are trivial

If this is true, then the vast majority of research grants are not supporting science. That which cannot fail reproduction cannot be falsified. The gold standard of science is falsifiability.

You are unfortunately making the case for your opponent, arguing that work should not be validated, but merely explored in a manner that allows room for conflicting interpretations, then arguing that science is merely communication between scientists and not intended to reflect the nature of the actual world. You attack your opponent's reasoning in an ad hominem manner, then admit his reasoning found genuinely troubling fault, but claim that fault had already been discovered internally. I guess that makes Ioannidis's observations less valid, that he was not the first to be concerned.

Tell us, then. What are the problems with science?

jeff--

Yes, I'm convinced that skulking oversight of scientists, whereby more time is spent begging to do work than actually doing work, is a direct cause of the present state of affairs. Those with oversight would hate to see a null result.
posted by effugas at 6:12 PM on October 24, 2013


sneebler--

Indeed. The toxicity of bad management religion is *exactly* what we cover up, when we insist nothing is wrong.
posted by effugas at 6:13 PM on October 24, 2013


"If this is true, then the vast majority of research grants are not supporting science. That which cannot fail reproduction cannot be falsified. The gold standard of science is falsifiability."
You are still thinking of science as an enterprise concerned with generating TrueFacts to the exclusion of supposed facts that are not true. Not only is this fundamentally impossible on an ontological level in very important ways, but the pitfalls this orientation will lead you straight into will only leave you incredibly frustrated if you ever need to directly evaluate scientific evidence. The whole point is not to say this is true and that is not, but to use data in clever ways to generate or improve theoretical models that usefully explain natural phenomena.

Almost by definition, a theoretical model cannot be perfectly correct; it is a model of the truth, our best attempt to create a mirror image of it in a form that we can understand - and our understanding will never be perfect or without distortion. The map is not the territory, and ceci n'est pas une pipe. For example, there are a variety of ways to produce really awesome models of the 3D shape of biological macromolecules, but none of these strategies can produce for you the real structure. NMR spectroscopy, X-ray crystallography, and electron microscopy each have their advantages and their disadvantages, and none of them will ever give you quite exactly the biological truth, though they can each provide incredibly valuable answers to specific kinds of questions. While this problem is universal to all of science, it really ends up having very little effect on the practical application of scientific principles, though it does create the non-intuitive weirdness in the philosophy and communication of science that you seem to be running into.

Being fundamentally unable to create perfect models for understanding things, we have to be content with generating good ones. For a theory to be a good one, it must be validated by solid data from diverse sources and approaches, explain natural phenomena, and be useful for making verifiable predictions of what those phenomena will do. For example, the theory of evolution by natural selection is all of these things, while the theory of Intelligent Design is only able to explain phenomena based on subjective reasoning that can neither be repeated nor viewed and appreciated as the same from other perspectives. That does not mean that creationism as an artistic representation of who we are through a metaphorical description of where we come from is stupid, bad, or even unreasonable. However, it does mean that as an explanation of natural phenomena it is unverifiable and, more importantly, fundamentally not useful. This is the biggest reason why Intelligent Design has no place being taught next to or as a viable replacement for the extraordinarily useful theory of evolution in a science classroom. To do so would be to fail to teach science, not just as the collection of facts your teachers tried to cram into you once upon a time but as the practice of trying to understand the natural world in an intellectually honest way.

Rote reproduction is generally not necessary because when scientists are sufficiently clever, asking good questions that would lead them to develop good models, there should be better and more useful ways to attempt to falsify those models than just doing the same thing all over again. A good scientist doing most forms of basic research is always thinking about what they will do with the answers they get, such that either they or others will be able to both verify their shit and continue asking better questions of the new model proposed rather than just wasting cycles asking the same one over and over again, which is overwhelmingly likely to only produce trivial answers. This is part of being good stewards of the precious resources we get. We do have to be careful at each of these steps to check the assumptions of the models we use and interrogate, both with our own and others' data, and published scientific papers are intentionally designed to help us do this. When you get past the TrueFacts model for understanding science to a model-based one, you can see how even a paper that happens to develop a model that is 'wrong' can still be incredibly useful if the research is designed and the paper is written properly, and even a paper that makes conclusions that are good can be really harmful to understanding if designed badly.
"You attack your opponent's reasoning in an ad hominem manner, then admit his reasoning found genuinely troubling fault, but claim that fault had already been discovered internally."
Ioannidis is not my opponent. He is a guy who did some ontologically complex research somewhat poorly and stole the spotlight away from some actual statisticians with better methods by saying ridiculous shit he couldn't support. You can follow the whole drama here,
Why Most Published Research Findings Are False
There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research.

Why Most Published Research Findings Are False: Problems in the Analysis
The article published in PLoS Medicine by Ioannidis makes the dramatic claim in the title that “most published research claims are false,” and has received extensive attention as a result. The article does provide a useful reminder that the probability of hypotheses depends on much more than just the p-value, a point that has been made in the medical literature for at least four decades, and in the statistical literature for decades previous. This topic has renewed importance with the advent of the massive multiple testing often seen in genomics studies. Unfortunately, while we agree that there are more false claims than many would suspect—based both on poor study design, misinterpretation of p-values, and perhaps analytic manipulation—the mathematical argument in the PLoS Medicine paper underlying the “proof” of the title's claim has a degree of circularity. As we show in detail in a separately published paper, Dr. Ioannidis utilizes a mathematical model that severely diminishes the evidential value of studies—even meta-analyses—such that none can produce more than modest evidence against the null hypothesis, and most are far weaker. This is why, in the offered “proof,” the only study types that achieve a posterior probability of 50% or more (large RCTs [randomized controlled trials] and meta-analysis of RCTs) are those to which a prior probability of 50% or more are assigned. So the model employed cannot be considered a proof that most published claims are untrue, but is rather a claim that no study or combination of studies can ever provide convincing evidence.

Why Most Published Research Findings Are False: Author's Reply to Goodman and Greenland
I thank Goodman and Greenland for their interesting comments on my article. Our methods and results are practically identical. However, some of my arguments are misrepresented:
Here is that separate paper,
ASSESSING THE UNRELIABILITY OF THE MEDICAL LITERATURE: A RESPONSE TO "WHY MOST PUBLISHED RESEARCH FINDINGS ARE FALSE"
A recent article in this journal (Ioannidis JP (2005) Why most published research findings are false. PLoS Med 2: e124) argued that more than half of published research findings in the medical literature are false. In this commentary, we examine the structure of that argument, and show that it has three basic components:
1) An assumption that the prior probability of most hypotheses explored in medical research is below 50%.
2) Dichotomization of P-values at the 0.05 level and introduction of a “bias” factor (produced by significance-seeking), the combination of which severely weakens the evidence provided by every design.
3) Use of Bayes theorem to show that, in the face of weak evidence, hypotheses with low prior probabilities cannot have posterior probabilities over 50%.
Thus, the claim is based on a priori assumptions that most tested hypotheses are likely to be false, and then the inferential model used makes it impossible for evidence from any study to overcome this handicap. We focus largely on step (2), explaining how the combination of dichotomization and “bias” dilutes experimental evidence, and showing how this dilution leads inevitably to the stated conclusion. We also demonstrate a fallacy in another important component of the argument –that papers in “hot” fields are more likely to produce false findings.
We agree with the paper’s conclusions and recommendations that many medical research findings are less definitive than readers suspect, that P-values are widely misinterpreted, that bias of various forms is widespread, that multiple approaches are needed to prevent the literature from being systematically biased and the need for more data on the prevalence of false claims. But calculating the unreliability of the medical research literature, in whole or in part, requires more empirical evidence and different inferential models than were used. The claim that “most research findings are false for most research designs and for most fields” must be considered as yet unproven.
The arguments made do take some amount of statistical understanding to interpret, but the smack down performed was hard enough that, as you can notice perusing Google Scholar, Ioannidis has not continued to publish in this area.
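For anyone who wants to see the disputed arithmetic rather than take either side's word for it, the core of the framework described in the abstracts above boils down to a Bayesian positive-predictive-value calculation, and Goodman and Greenland's objection is essentially that the "most findings are false" conclusion is driven by the assumed prior and the assumed weakness of the evidence rather than by anything the data could overcome. Here is a minimal sketch of that calculation; it leaves out the "bias" term from the original paper, and the parameter values are illustrative assumptions of mine, not numbers taken from either paper:

def ppv(R, power, alpha):
    # Probability that a "significant" finding reflects a real relationship.
    #   R     : prior odds that a probed relationship is real (true : null)
    #   power : P(significant | real relationship), i.e. 1 - beta
    #   alpha : P(significant | null relationship), the significance cutoff
    return (power * R) / (power * R + alpha)

# Long-shot hypotheses and modest power: most "positive" findings are false.
print(ppv(R=0.1, power=0.5, alpha=0.05))   # 0.50
print(ppv(R=0.05, power=0.2, alpha=0.05))  # ~0.17

# Even-odds prior and decent power: most positive findings are true,
# which is the commentary's point about how much work the prior is doing.
print(ppv(R=1.0, power=0.8, alpha=0.05))   # ~0.94

The particular numbers are not the point; the point is that the formula hands back roughly whatever prior you feed it, which is exactly the circularity Goodman and Greenland are complaining about.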
"Tell us, then. What are the problems with science?"
Boring, complicated, and filled with petty drama.
posted by Blasdelb at 2:04 AM on October 25, 2013 [4 favorites]


At this point, I'm convinced management, and most administration, must simply be automated, sneebler. Automating most management and administrative tasks appears easier than automating many teaching tasks, like grading. Automated management decisions should prove more fair and auditable, and eventually more accountable. Also, there are simply too many managerial decisions that go way over managers' heads, like staffing decisions that require stochastic calculus.
posted by jeffburdges at 2:28 AM on October 25, 2013


For the vast majority of research grants, reproducing data will only ever conceivably be able to answer questions that are trivial

If this is true, then the vast majority of research grants are not supporting science. That which cannot fail reproduction cannot be falsified. The gold standard of science is falsifiability.
I'm not following this at all. What does any of this have to do with falsifiability? And to claim that the vast majority of research grants are not supporting science? What is your experience that lets you make this grand claim?
You are unfortunately making the case for your opponent, arguing that work should not be validated, but merely explored in a manner that allows room for conflicting interpretations, then arguing that science is merely communication between scientists and not intended to reflect the nature of the actual world. You attack your opponent's reasoning in an ad hominem manner, then admit his reasoning found genuinely troubling fault, but claim that fault had already been discovered internally. I guess that makes Ioannidis's observations less valid, that he was not the first to be concerned.
Research is validated all the time, usually within the same paper, because peer reviewers usually require evidence from multiple different lines before they'll permit a claim to be made. So reproducing these same experiments, again, is usually quite trivial, because you're going to get the same data out again. Therefore no new information. Therefore triviality. If the same resources were instead devoted to collecting new data from an expanded set of conditions or cell lines, or some other sort of generalization, we'd get a lot more information. And valuable resources would not be wasted on trivialities. How do you decide what is trivial and what is valuable? Usually we leave this up to the people that have experience with the behavior of various lab techniques, the experimental models, and all the technology.

I think the problem may be that you're still laboring under the assumption that, for example, cancer studies do not reproduce. You started by linking to that stinker of an editorial, apparently missing the fact that the popular interpretation (how it was reported in newspapers) had been completely debunked in this thread, as well as in a previous Metafilter thread. And rather than respond to getting called out about that, you mentioned some BS "science" article to try to claim that science has big problems. But now you're back to science not validating, despite not having any apparent reason to believe your strange claim. Is it just a lingering feeling from a newspaper article about an editorial? Do you have some other reason to think that? And I should add that this is a very serious accusation, one that strikes to the core of basic scientific responsibility.

If I read a crappy news article that conflated the Dual_EC_DRBG RNG with ECC, then spouted off in a thread with several cryptographers the contention that ECC has a huge NSA backdoor problem, refused to acknowledge their corrections and called them defensive, and then suggested a requirement that 25% of all cryptography grants go to rooting out NSA backdoors in ECC, what do you think the response would be? In your field of computer security, is factual incorrectness just tolerated with good nature? Because it isn't in science. And I do get very defensive about that. You should hear the yelling in our office when somebody digs in on a misremembered fact (i.e. the raw data). And though there's lots of yelling and stern words, we all happily get back to working together after we've all been corrected by the data. So if I come across as harsh, I certainly don't direct any harshness at you personally; it's just a reaction focussed on those particular statements.

The most fundamental thing to understand about science papers is that they are not textbooks. Things get into textbooks only after many validations, across many papers. Papers are a way to communicate early data, early interpretations, and to talk to each other. This does not mean that the conclusions are not falsifiable - they most definitely are - but most conclusions are dependent upon the mental framework within which the scientists operate. And since science is a way to change mental frameworks to better fit the world, the initial starting point of these mental frameworks is going to shift, and most of the conclusions in papers will eventually be found to be wrong, or not even wrong. However, the data lasts forever, and is always useful regardless of the interpretive framework. Some data carries more information than other data. In contrast, the news reports each new paper as a seminal discovery, set in stone. This is almost never how it works. And it can't work that way, otherwise it wouldn't be science.
posted by Llama-Lime at 3:49 PM on October 25, 2013 [1 favorite]



