Rashomon in the hallway
March 11, 2012 7:55 PM

In 1996, Yale psychologist John Bargh published a much-cited paper (pdf) demonstrating the "priming effect" --- in a nutshell, subjects who had to unscramble sentences mentioning the elderly walked slower when leaving the examination room than control subjects. This year, Stéphane Doyen and his co-authors attempted to replicate Bargh's experiment, but were unable to reproduce the priming effect --- instead assembling evidence that it was the experimenter's knowledge of the study topic which created the apparent "priming". Is Bargh's famous experiment flawed? Or is Doyen's paper a pile of horseshit published in a two-bit for-profit online journal, as Bargh's strident critique suggests? Or is Bargh full of it himself? And who gets to decide what counts as good science these days anyway?
posted by Diablevert (51 comments total) 35 users marked this as a favorite
 
What a great post. Thanks. I started with the "full of it himself" link and man is it rich. I can spend an hour just following up on stuff like this:

There is a wider issue here. A lack of replication is a large problem in psychology (and arguably in science, full stop). Without it, science has lost a limb. Results need to be checked, and they gain strength through repetition. On the other hand, if someone cannot repeat another person’s experiments, that raises some serious question marks.

Scientists get criticised for not carrying out enough replications – there is little glory, after all, in merely duplicating old ground rather than forging new ones. Science journals get criticised for not publishing these attempts. Science journalists get criticised for not covering them. This is partly why I covered Doyen’s study in the first place.

In light of this “file drawer problem”, you might have thought that replication attempts would be welcome. Instead, we get an aggressive and frequently ill-founded attack at everyone involved in such an attempt. Daniel Simons, another noted psychologist, says, “[Bargh’s] post is a case study of what NOT to do when someone fails to replicate one of your findings.”

posted by mediareport at 8:05 PM on March 11, 2012 [11 favorites]


Starting off your criticism of an academic article with a wholesale bashing of PLoS (a 501(c)(3) nonprofit) as a "for-profit" publisher is not good.
posted by demiurge at 8:15 PM on March 11, 2012 [8 favorites]


Bargh lost me when he devoted two paragraphs to an extremely naive criticism of PLoS ONE. Are page charges considerably less common in social science journals? Did he not bother to even go to the PLoS website to see the word "nonprofit" across the top of the screen? Does he actually think the journal in which his original article was published isn't a revenue-generating mechanism for the APA?
posted by juliapangolin at 8:19 PM on March 11, 2012 [5 favorites]


It's amusing that he spends so much time railing against PLoS as supposedly for-profit on his Psychology Today blog, where he's probably getting paid per click...
posted by a womble is an active kind of sloth at 8:28 PM on March 11, 2012 [3 favorites]


If I'd been asked to review it (oddly for an article that purported to fail to replicate one of my past studies, I wasn't)

I'd think this would be a pretty clear conflict of interest. And from the P1 website:
We also, as much as possible, try to rule out those reviewers who may have an obvious competing interest, such as those who may have been collaborators on other projects with the authors of this manuscript, those who may be direct competitors, those who may have a known history of antipathy with the author(s), or those who might profit financially from this work.
(Many journals, P1 included, also have a mechanism for suggesting 1-2 potential referees to exclude, for similar reasons, though the editor decides whether it would be appropriate.)
posted by en forme de poire at 8:40 PM on March 11, 2012 [3 favorites]


that raises some serious question marks.

Is this a thing?
posted by ODiV at 9:00 PM on March 11, 2012 [2 favorites]


Remember FIAMO, everyone.

Anyway.

What does this priming effect do in modern psychology? Is it widely used in treatments? If the concept becomes discredited, what effect will that have on the wider world?
posted by kavasa at 9:04 PM on March 11, 2012


This is huge.
posted by Meatafoecure at 9:08 PM on March 11, 2012 [1 favorite]


Is it widely used in treatments?

Experimental psychology isn't about "treatments."
posted by ethnomethodologist at 9:16 PM on March 11, 2012 [2 favorites]


Me and all my scientist friends went "oooooooooooh!" when we heard this; standing there all smug in our labcoats, fidgeting absently with graphing calculators and slide rules.
posted by dobie at 9:18 PM on March 11, 2012 [6 favorites]


What does this priming effect do in modern psychology? Is it widely used in treatments? If the concept becomes discredited, what effect will that have on the wider world?

Priming is important in social psychology (concerned with how people think, basically), not clinical psychology (which is concerned with treating psychological ailments, essentially). No one is suggesting that priming is going to be discredited--there is a wide, wide swath of studies which have demonstrated it under properly double-blind conditions (the basic safeguard that Doyen's study had and Bargh's did not).

All that Doyen's study really shows is that priming people with words related to age does not affect the walking speeds of people. Priming with age-words may still cause people to reflect on their own mortality, make them miss their grandparents, or many other things.
posted by TypographicalError at 9:22 PM on March 11, 2012 [8 favorites]


It's interesting. Here we have a social scientist who has done high-profile research on a theory that explains human behaviour in terms of hidden influences, and that same scientist reacts with hostility and denial to suggestions that he and his work might be subject to hidden biases. I have friends doing research in psychology, and I've seen this sort of thing before.
posted by smorange at 9:25 PM on March 11, 2012 [5 favorites]


This might shed some light.
posted by Shane at 9:29 PM on March 11, 2012 [3 favorites]


Not a social psychologist, but I am an economic sociologist, and I co-author with several, including experimentalists. First off, I should say all of the ones I work with are very, very concerned about methods, as are the reviewers for all of the top journals. There have been a lot of concerns raised recently about replicable results and methodology in social psych, and it is a problem that we face throughout the social sciences - you can't get articles published in top journals by reproducing results directly, and working with people creates all sorts of messiness that makes Mertonian approaches to falsifiable science harder than in the physical sciences.

But, I should say to those filled with glee, every social scientist I know puts lots of effort into trying to get this right, and journal reviewers are incredibly tough on methods. Don't paint this with too broad a brush: experimental social psych (and sociology, and economics, etc.) hates bad research; it is just harder to figure out what is bad than when testing a physical process. Seriously, hard scientists, how would you like it if your reactions behaved radically differently depending on the time of day, their mood, their social networks, and every other possible confounding condition? This stuff can be hard.

That being said, priming is a big deal, so this study is interesting, though priming itself has been replicated in other contexts. Lots can go wrong in an experimental replication, and, while Bargh seems to go a bit nuts in his reply, he does raise some valid points, including about PLoS, which, while a noble effort, is not considered a top journal in the field (though he obviously doesn't understand open access). This should provoke some additional interesting work on the boundary conditions around priming. I'll ask my social psych folks about it tomorrow, and post more if people are interested.
posted by blahblahblah at 10:09 PM on March 11, 2012 [5 favorites]


It's totally normal for an author of a paper to be asked to review the follow-up. Who better to be acquainted with the material? Yes, the author is likely to take a dim view of a work that's basically a big ol' "NUH-UH" to his paper, but that's what second and third reviewers are for.

Just recently, my boss was asked to review a paper by a big name in the field (also experimental psychology, as it happens) that argued that one of my boss's highly influential, commonly-used measures of a certain kind of dynamic process was an artifact of another process. My boss's review was basically, "The measure is called 'Conditional [Blankity Blank]'. You didn't conditionalize correctly." He was able to write what basically amounted to an exam question and answer that he gives yearly to his undergrads. No other reviewer would be as clear in explaining exactly what was wrong with the paper.

This example is a textbook case of how the process can go wrong. It just happened to be aired publicly instead of in idle gossip over email and at conferences. PLoS One admittedly doesn't have the best reputation, but that undercuts this guy's reasoning for having his pants in a knot. He should have written a calm, collected smack down of a response rather than whining.
posted by supercres at 10:21 PM on March 11, 2012


Man, I could really go some prunes right abou...ooh, Matlock is on!
posted by obiwanwasabi at 10:23 PM on March 11, 2012 [1 favorite]


It's totally normal for an author of a paper to be asked to review the follow-up.

It's also totally normal to offer that author a chance to write a response article if the follow-up gets published. Unfortunately, PLoS ONE does not publish responses because, they argue, responses should go in the comments section. Which is in one sense reasonable, but also does not fall in line with the traditional scientific publishing / knowledge framework, so leads instead to poorly written blog posts.
posted by one_bean at 10:48 PM on March 11, 2012 [1 favorite]


I'll ask my social psych folks about it tomorrow, and post more if people are interested.

Please! Also, if you wouldn't mind, before asking the question, prime half of them to think about jerks, and the other half to think about the Steve Martin movie The Jerk.
posted by No-sword at 11:22 PM on March 11, 2012 [3 favorites]


And who gets to decide what counts as good science these days anyway?

Statisticians. Cranky, pedantic statisticians.
posted by Panjandrum at 11:24 PM on March 11, 2012 [10 favorites]


Back when I was doing my PhD, I was trying to decide on a priming methodology to model (I'm a statistical modeler) and I tried replicating a famous subliminal affective priming result from social psychology. Famous, as in, it's talked about in intro textbooks. I tried to get the materials from the original authors. No response. So I created my own materials. Results? nada. No priming effect (none that could be attributed to subliminal priming, anyway). The original study showed a massive effect, so it wasn't that I lacked power. I moved on to a different paradigm, and that failure to replicate will probably never be published.
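
(To put a rough number on the power point: a minimal simulation sketch, using an invented effect size and sample size rather than any actual study's parameters, of how often a modest replication should come up significant if the original effect were real and that large.)

```python
# Rough power check: if the published effect were real and roughly as large as
# reported, how often would a replication of this size come out significant?
# The effect size and sample size below are illustrative, not from any actual study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d = 1.0          # assumed true standardized effect ("massive", per the original report)
n = 30           # participants per group in the hypothetical replication
n_sims = 10_000

hits = 0
for _ in range(n_sims):
    control = rng.normal(0.0, 1.0, n)
    primed = rng.normal(d, 1.0, n)
    if stats.ttest_ind(primed, control).pvalue < 0.05:
        hits += 1

print(f"Estimated power: {hits / n_sims:.2f}")   # ~0.97 for d = 1.0, n = 30 per group
```

If that number is high and the replication still finds nothing, sample size alone is an unlikely explanation.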

I was talking with a prominent priming researcher about this, and as it turns out, he's tried replicating it too. And failed. So there are at least two failures to replicate out there (probably more) and no one has published them. But even if we did try to publish, the authors we are trying to replicate could complain about little issues of methodology (after all, we didn't have their materials!) being different. In the end, for me, it would not be worth the time to try to publish. I can always move on to another technique to study, since I'm a methodologist. But this is a problem, and I'm sure glad my PhD work didn't depend on replicating that result. With the increasing use of extremely high-cost methodologies like fMRI, it seems like replications will be an even bigger problem because it simply costs so much money and time to do them.

Meanwhile, I've had at least two other researchers successfully replicate some high-profile cognitive experimental work I did. So I'm happy about that. Things don't appear to be as bad in cognitive psychology as they are in social psychology.
posted by Philosopher Dirtbike at 12:22 AM on March 12, 2012 [7 favorites]


If priming exists, this link will hobble you.
posted by benzenedream at 12:44 AM on March 12, 2012


Statisticians. Cranky, pedantic statisticians.

I have a couple staying at my house this week. I'll get back to you all about the good science thing once I get them to agree on beer.
posted by brennen at 2:36 AM on March 12, 2012 [1 favorite]


Seriously, hard scientists, how would you like it if your reactions acted radically differently depending on the time of day, their mood, their social networks, and every other possible confounding condition?

Reminds me of the chemical interaction that normally takes 80 minutes but would take an hour and 20 minutes when the experimenter wore a red tie.
posted by Obscure Reference at 4:37 AM on March 12, 2012 [3 favorites]


Philosopher Dirtbike, do you mind saying what the result was? Or memail me if it's a secret...
posted by myeviltwin at 5:50 AM on March 12, 2012


I should point out that priming experiments constitute a very broad category of experimental paradigms in psychology, and have been around for decades. This particular scientific debate is not about the "priming effect" in general.

Unlike molecular biology and chemistry, psychology makes use of human participants who introduce much larger variability into statistical inference. Moreover, an experimental setup might allow for demand characteristics to influence the behavior of participants. In particular, subjects are prone to respond to subtle cues from the experimenter which indicate the desired outcome. Controlling for demand characteristics is of particular importance.

In this case, it seems the experimenters wished to find evidence that semantic priming could produce a behavioral outcome. The classic confound would be that if the experimenter knows which sentences the participants were unscrambling, they themselves might adopt the desired hunched posture and slow movements. A participant might simply, then, be mimicking the experimenter.

A further complicating issue is that many experimental psychologists interpret low p-values to imply replicability and correctness, when this is not necessarily the case. Add to this that a standard confidence level of 95% is typically chosen for each result (and that papers tend to report multiple results without correcting for multiple tests), and you have inaccurate inferences that would allow for Type I errors well over 5% of the time.
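
(To make that last point concrete, a minimal sketch with invented numbers: with k independent tests each run at a 5% threshold and no true effects anywhere, the chance of at least one false positive is 1 - 0.95^k, which climbs quickly.)

```python
# Familywise Type I error rate when a paper reports several uncorrected tests.
# Purely illustrative: assumes independent tests and no true effects anywhere.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n_tests, n_sims, n = 0.05, 5, 5_000, 30

papers_with_a_false_positive = 0
for _ in range(n_sims):
    p_values = []
    for _ in range(n_tests):
        a = rng.normal(0, 1, n)      # both groups drawn from the same distribution,
        b = rng.normal(0, 1, n)      # so any "significant" difference is a false positive
        p_values.append(stats.ttest_ind(a, b).pvalue)
    if min(p_values) < alpha:
        papers_with_a_false_positive += 1

print(f"Analytic:  {1 - (1 - alpha) ** n_tests:.2f}")               # ~0.23 for 5 tests
print(f"Simulated: {papers_with_a_false_positive / n_sims:.2f}")
```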

This is why it is so important to take attempted replications of your studies at face value, and not get in these tiffs. The only good resolution to this would be a third attempt at replication from yet another lab group, taking into account the results produced by either of the first two.

But this isn't how science is "broken"! This is exactly how it works.
posted by phenylphenol at 5:54 AM on March 12, 2012 [9 favorites]


Someone should start a journal that does nothing but publish attempts to reproduce results published in other journals -- no matter what kind of science. I'd think a lot of scientists would be interested in that, no?
posted by empath at 5:57 AM on March 12, 2012 [3 favorites]


Yes! That would be extremely interesting. And true for so many scientific fields.
posted by a womble is an active kind of sloth at 6:06 AM on March 12, 2012


> A lack of replication is a large problem in psychology (and arguably in science, full stop).

There isn't any decline effect in experimental physics. Lack of replication just means the original result was wrong (FTL neutrinos). That gives you a good clean tool for mapping out the sciences and marking them off from the "sciences." Got a decline effect? Your field is a "science."
posted by jfuller at 6:17 AM on March 12, 2012


Seriously, hard scientists, how would you like it if your reactions acted radically differently depending on the time of day, their mood, their social networks, and every other possible confounding condition?

This is why I stopped doing experimental molecular biology and switched to computational work where things usually seem more deterministic.
posted by grouse at 6:36 AM on March 12, 2012


Someone should start a journal that does nothing but publish attempts to reproduce results published in other journals

The Journal of Reproducible Results ?
posted by mikelieman at 6:57 AM on March 12, 2012 [2 favorites]


Marginal Revolution blogger Alex Tabarrok has a good discussion of some of the subtleties of the replication involving the stopwatch, for those interested.

http://marginalrevolution.com/marginalrevolution/2012/03/walking-fast-and-slow.html

Apparently, they replicated the findings only when using the stopwatch and priming the people operating it: telling them that the infrared technology wasn't very reliable and that they expected one group to walk faster than the other.

Did they have a control group on the stopwatch that wasn't primed, I wonder? If you found the same result regardless of priming the watchers ("who [primes] the watchmen?"), then it would suggest it wasn't subconscious researcher bias (a kind of Hawthorne effect) but rather just bad measurement.
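
(As a toy sketch of the measurement issue, with invented numbers rather than Doyen's or Bargh's actual data: give every walker the same true speed, then let the stopwatch-holder's expectation add a little time for the "primed" group.)

```python
# Toy model: no real priming effect, but the person holding the stopwatch expects
# the "elderly-primed" group to walk slower and unconsciously clicks a bit late for
# them. All numbers are invented; this is not any study's actual data or protocol.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 25                                        # walkers per group
true_time = rng.normal(7.5, 1.0, 2 * n)       # seconds to walk the hallway, both groups alike

bias = np.concatenate([np.zeros(n),                    # "neutral" group: timed accurately
                       rng.normal(1.0, 0.3, n)])       # "primed" group: timer adds ~1 s

infrared = true_time              # automated measurement: just the true times
stopwatch = true_time + bias      # manual measurement: true time plus expectation bias

for name, measured in [("infrared", infrared), ("stopwatch", stopwatch)]:
    diff = measured[n:].mean() - measured[:n].mean()
    p = stats.ttest_ind(measured[n:], measured[:n]).pvalue
    print(f"{name:9s}  primed minus neutral = {diff:+.2f} s,  p = {p:.3f}")
```

In this toy setup the automated measurement shows nothing while the hand-timed one produces a tidy "effect."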

So better methods means more accurate instruments for collecting the data, and double-blinding the study.
posted by scunning at 7:13 AM on March 12, 2012


(oh and bigger samples.)
posted by scunning at 7:13 AM on March 12, 2012


Even a fantastic, reproducible experimental psych study can be pure hokum. The observations are often done in a completely contrived environment, mostly on 18- and 19-year-old college freshmen. So next time you hear something like, say, "people have less willpower after working on a hard task", think "18-year-olds are more likely to eat a cookie after working on a puzzle than after watching a video." Suddenly the results seem a lot less relevant.

Many of these experiments test cognitive skills and coping strategies in people whose brains and personalities are still developing rapidly. 18-year-old students are not a great proxy for the adult population at large when it comes to, say, medical research, but they're an even worse proxy when it comes to psychological research.
posted by Ausamor at 7:42 AM on March 12, 2012


There isn't any decline effect in experimental physics.

Surely you're joking.
posted by en forme de poire at 8:00 AM on March 12, 2012 [7 favorites]


en forme de poire, moral of the story: accuracy won out specifically because the inaccurate result wasn't objectively repeatable. The "social" in social sciences serves the same purpose as the "special" in Special Olympics.
posted by karmiolz at 8:50 AM on March 12, 2012


Karmiolz, the point is not (just) that the original measurement was inaccurate. The point is that subsequent published measurements slowly converged, monotonically, to the real value, rather than jumping to or being distributed around the real value. The explanation Feynman gives is that nobody wanted to publish a value that was too far off from the "accepted" result:
When they got a number that was too high above Millikan's, they thought something must be wrong - and they would look for and find a reason why something might be wrong. When they got a number close to Millikan's value they didn't look so hard. And so they eliminated the numbers that were too far off, and did other things like that...
So even a totally objective, empirical measurement can still be influenced by something like the file-drawer effect, or subtle biases on the part of the experimentalist.
posted by en forme de poire at 9:32 AM on March 12, 2012 [3 favorites]


Philosopher Dirtbike, do you mind saying what the result was? Or memail me if it's a secret...

I don't know if it would be ethical to spread un-peer-reviewed pilot results on a nonscientific public forum like this. But I'll memail you.

en forme de poire: So even a totally objective, empirical measurement can still be influenced by something like the file-drawer effect, or subtle biases on the part of the experimentalist.

This. And also, aspects of the *culture* of a field that don't necessarily mean that everyone in the whole field is nonscientific.

jfuller: There isn't any decline effect in experimental physics. Lack of replication just means the original result was wrong (FTL neutrinos). That gives you a good clean tool for mapping out the sciences and marking them off from the "sciences." Got a decline effect? Your field is a "science."

Hey, stop the presses! jfuller just solved the demarcation problem! /sarcasm
posted by Philosopher Dirtbike at 9:53 AM on March 12, 2012 [1 favorite]


I think a larger issue for social psychology isn't whether the results can be replicated but whether the results, true or not, apply to a substantial question about nature. Too much work in social psych amounts to "college students ("people") do something surprising" and Bargh's priming work is no exception. Unfortunately, surprising != substantial.
posted by serif at 9:55 AM on March 12, 2012 [1 favorite]


Back when I was doing my PhD, I was trying to decide on a priming methodology to model (I'm a statistical modeler) and I tried replicating a famous subliminal affective priming result from social psychology. Famous, as in, it's talked about in intro textbooks. I tried to get the materials from the original authors. No response. So I created my own materials. Results? nada. No priming effect (none that could be attributed to subliminal priming, anyway). The original study showed a massive effect, so it wasn't that I lacked power. I moved on to a different paradigm, and that failure to replicate will probably never be published.

I've failed to replicate the pratfall effect, which is a pretty famous early social psych finding that really only shows up in the literature through a few studies. Even if it had existed at the time it was shown, chances are it would be different now due to cultural changes. I've also failed to find Navon effects of face recognition training in numerous studies, which counters a few highly cited papers by a small set of individuals. There really isn't a great outlet for failures to replicate, though it would be ideal to have one.
posted by bizzyb at 11:49 AM on March 12, 2012


en forme de poire, repeat experimentation allowed the actual value to be discerned. It's not as if the oil-drop setup was not a valid experimental model for discerning the elementary charge. Science corrected itself in the face of evidence, whereas the social sciences can reach opposite conclusions on the same "evidence." The experiments they run are glorified case studies. They are interesting curiosities, not actual informative science.
posted by karmiolz at 12:32 PM on March 12, 2012


The experiments they run are glorified case studies. They are interesting curiosities, not actual informative science.

Do you think it can be dispensed with entirely, then? That subjects like anthropology or psychology are not worthy of study?
posted by Diablevert at 12:41 PM on March 12, 2012


Reading the Bargh experimental description, he's trying to quantify a hypothetical psychological phenomenon across a heterogeneous group of people with a sample size of... n=34... HAHHAHAHAHAHAHA... (falls over)
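
(A back-of-the-envelope check, assuming a plain two-group comparison rather than the actual design: with roughly 17 people per cell, only a very large standardized effect is reliably detectable.)

```python
# Back-of-the-envelope: with ~17 people per cell, what is the smallest standardized
# effect a simple two-group t-test would detect 80% of the time? (Assumes a plain
# two-group design, which is a simplification of the actual experiment.)
from statsmodels.stats.power import TTestIndPower

d = TTestIndPower().solve_power(nobs1=17, alpha=0.05, power=0.80,
                                ratio=1.0, alternative='two-sided')
print(f"Smallest detectable effect at 80% power: d = {d:.2f}")   # roughly d ~ 1.0
```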
posted by storybored at 2:10 PM on March 12, 2012


Diablevert, not at all; let's just not call it science when, as storybored points out with n=34, you are dealing with nearly innumerable uncontrolled variables. These fields are worthy of study; we should just be a little more reserved in the conclusions we draw. An anthropologist spending years studying a culture, describing it, drawing parallels with others and positing a consistent human condition is far different from spending an afternoon marking the various reactions of a handful of people and claiming an overarching psychological effect.
posted by karmiolz at 2:25 PM on March 12, 2012


This is a totally different argument, though, from saying that experimental physics is immune from the factors that underlie the "decline effect." You're also insisting that there's a qualitative difference between experimental physics and social psychology even though your points of distinction are quantitative (number of measurements/subjects, degree of experimental control, etc.).
posted by en forme de poire at 2:40 PM on March 12, 2012


Can't tell if it operated in the incident in the FPP, but one disturbing effect in the sciences is that senior "scientists" will sometimes use their positions as editors or reviewers to try to suppress results that conflict with theirs. This anti-scientific behavior is not recognized by them as such, since they sincerely believe they are protecting their colleagues from bad science, but it is they who are practicing bad science. It is the inevitable outcome of identifying with one's theories too much and not loving the science itself enough.
posted by Mental Wimp at 3:58 PM on March 12, 2012


The decline effect is essentially the "saving the phenomena" stage of scientific understanding. I fully agree that scientific knowledge changes; I'm a complete Thomas Kuhn fan. The fact is that in the hard sciences and engineering, you have very controlled settings and very limited statements. The social sciences are not half as rigorous, yet their conclusions are far more sweeping. Science requires the theory or result to have predictive ability; it is tossed out if it doesn't. The social sciences have this nasty habit of making predictions that are completely contradicted, then confirmed, then amended, contradicted again, re-confirmed; it is all over the place with very little consistency.
posted by karmiolz at 4:09 PM on March 12, 2012


The decline effect is essentially the "saving the phenomena" stage of scientific understanding.

It's not clear to me how saving the phenomena (essentially, making predictive models from data, right?) has really anything to do with the decline effect, which has to do with 1) regression of effect sizes to the mean and 2) selection bias in the reporting of experimental results.
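
(A minimal sketch, with arbitrary numbers, of how those two ingredients combine: if only significant results get published, the published effect sizes start out inflated, and later or unselected replications then look like a decline.)

```python
# Sketch of the "decline effect" machinery: a small true effect, small studies, and a
# filter that only publishes significant results. The published literature then shows
# inflated effects that later, unselected replications fail to match. Numbers arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
true_d, n, n_studies = 0.2, 20, 5_000     # modest real effect, 20 subjects per group

published = []
for _ in range(n_studies):
    a = rng.normal(0, 1, n)
    b = rng.normal(true_d, 1, n)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    observed_d = (b.mean() - a.mean()) / pooled_sd
    if stats.ttest_ind(b, a).pvalue < 0.05:      # only "significant" studies get written up
        published.append(observed_d)

print(f"True effect size:           d = {true_d:.2f}")
print(f"Mean published effect:      d = {np.mean(published):.2f}")    # substantially inflated
print(f"Share of studies published: {len(published) / n_studies:.2f}")
```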
posted by en forme de poire at 4:35 PM on March 12, 2012


I referenced Kuhn because his use of that term, saving the phenomena, is for when a model isn't perfect. Deviations from expected values often come with caveats that attempt to save the model as a whole. Jonah Lehrer's use of the term "decline effect" involves regression toward the mean, with the selection bias showing up in the founding experiments. Again, in the hard sciences, you are making specific predictions that are testable and repeatable. In the social sciences you are not nearly as detached and objective. I am not saying scientists are infallible; I'm saying that if you make a claim, it can be tested, and shouting matches of "yes it is!" "no it isn't!" don't amount to much because the data speaks for itself.
posted by karmiolz at 5:43 PM on March 12, 2012


Selection bias doesn't show up in the founding experiments, necessarily: you see it in the experiments that come later, because the pressure is to more or less agree with previous results. My point was that these problems also extend to the hard sciences. (I still don't think this really relates to "saving the phenomenon," by the way. That refers to adjusting a model to fit the measurements, rather than the other way around.)

As for making testable claims, psychology is also an experimental science so I don't think this is actually a point of departure -- claims in psychology can be tested and ultimately supported, refined, or refuted by experimental results. Undoubtedly it's harder to control all of the relevant variables in psych vs. (say) chemistry, but that alone doesn't make the basic underlying process unscientific. Lack of replication and the substitution of conceptual replication strike me as much bigger problems, but I think they aren't intrinsic to the field as much as unfortunate current practice, which people like Doyen &c appear to be working against.
posted by en forme de poire at 7:16 PM on March 12, 2012


Someone should start a journal that does nothing but publish attempts to reproduce results published in other journals

They could work with the Journal of the Null Hypothesis
posted by hattifattener at 12:08 AM on March 13, 2012


I fully agree that scientific knowledge changes, complete Thomas Kuhn fan.

What? How does Thomas Kuhn fit into it? All he did was point out that sometimes the changes are major ("paradigm shifts") vs. slow modification. But scientists have always been about changing knowledge. In fact, it's always been the purpose of science.
posted by Mental Wimp at 9:58 AM on March 13, 2012

