Science meets professional subjects
February 16, 2015 1:26 PM

Amazon's Mechanical Turk has become an important tool for social science research, but a fascinating piece by PBS Newshour discusses why this might be a problem, with a great profile of professional survey takers, who average hundreds, even thousands, of social science surveys each. This is not just idle speculation: recent research [PDF] shows that experienced Turkers no longer have typical "gut reactions" to social experiments, leaving researchers struggling with how to deal with non-naivete [PDF]. Take a look at the questions that professional Turkers are asked the most, and be sure to take the survey in the middle of the first article!

By way of personal disclosure, I am an academic who has used MTurk for particular, narrow types of research, and agree that these issues are real - in an experiment I completed yesterday, a full 25% of highly-rated Turkers messed up a basic attention check question (along the lines of "This survey is about cheese"). I should also note that a growing number of academics are aware of the problem, and it is part of a continual balancing act between cost, validity, and avoiding other forms of bias (like only doing research on WEIRD subjects).
posted by blahblahblah (46 comments total) 42 users marked this as a favorite
 
I... had no idea.

This seems like such terrible science. When preparing an experiment involving sampling, avoiding biased selection from your population should be your top priority - it's like filming with mud on your lens: you really can't fix it in post.
posted by lupus_yonderboy at 1:32 PM on February 16, 2015 [2 favorites]


To answer my own question - if you don't pay people on the internet to take your survey, exactly how will you gather data in the first place?

I don't really have a good solution for this, to be honest. Perhaps a study needs to be done... ;-)
posted by lupus_yonderboy at 1:39 PM on February 16, 2015


Really interesting stuff. I used MTurk for a research project once, but the workers were not filling out a survey; instead, I was asking them to code data. I used multiple turkers for each piece of information so that I could verify the accuracy of each piece of data that was coded. It was wonderful. I had paid undergraduates to help me do this kind of work before, but I will never do that again.
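
Roughly the kind of cross-check I mean, as a toy Python sketch (not my actual pipeline; the labels and names are invented):

from collections import Counter

def aggregate_codings(codings):
    # Majority vote over the redundant codings of a single item,
    # plus an agreement rate so low-agreement items can be sent
    # for hand review instead of being trusted blindly.
    counts = Counter(codings)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(codings)

# e.g., three Turkers coded the same open-ended response
label, agreement = aggregate_codings(["positive", "positive", "neutral"])
if agreement < 1.0:
    print("flag for review:", label, agreement)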

A colleague of mine is doing research looking at another possible source of bias. He has found that just having the same scale (e.g., 1-5 or 1-7) across items may artificially inflate how related responses appear. If that is true, it would have major implications for most social psychology research.
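
Here's a toy simulation of that worry, assuming the mechanism is a per-respondent response style that loads on every same-format item (all numbers invented):

import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Two traits that are truly uncorrelated across respondents.
trait_a = rng.normal(size=n)
trait_b = rng.normal(size=n)

# A per-respondent response style shared by every same-format item.
style = rng.normal(size=n)

def to_likert(latent):
    # Crude mapping of a latent score onto a 1-5 scale.
    return np.clip(np.round(3 + latent), 1, 5)

item_a = to_likert(trait_a + 0.8 * style)
item_b = to_likert(trait_b + 0.8 * style)

# The traits are unrelated, but the observed items are not.
print(np.corrcoef(trait_a, trait_b)[0, 1])  # ~0.0
print(np.corrcoef(item_a, item_b)[0, 1])    # noticeably positive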
posted by bove at 1:55 PM on February 16, 2015 [4 favorites]


a full 25% of highly-rated Turkers messed up a basic attention check question

Do you, blahblahblah, give out negative ratings to people who answer check questions incorrectly? Do most other researchers?
posted by ryanrs at 1:56 PM on February 16, 2015


Bove: just having the same scale (e.g., 1-5 or 1-7) across items may artificially inflate how related responses appear

If I'm understanding correctly, I think this phenomenon has already been quantified by scholars like Schwarz (PDF). Or is your friend studying MTurk specifically?
posted by redct at 2:03 PM on February 16, 2015 [1 favorite]


Do you, blahblahblah, give out negative ratings to people who answer check questions incorrectly?

I have struggled with this a lot, and I no longer do it. As far as I know, the only way to give a negative rating is to reject the job and not pay for it. This is problematic in several ways:

1) It feels dishonest. People have done the work; it seems unethical not to pay them unless I know for sure that it was not an honest mistake.

2) It is painful. The first time I did an MTurk survey, I tried not paying several people who I felt were being exploitative (getting questions wrong AND doing the survey sloppily, by doing things like answering all questions the same way). Almost all wrote me elaborate and angry emails explaining how they never messed up before and how I was being a jerk. It absolutely wore me out, and I had to argue with them that they didn't take the survey seriously, resulting in hour-long fights over $0.50!

3) It kills your ability to do research. Turkers review you on places like Turkopticon. Literally everyone who I didn't pay for blowing off the work gave me terrible reviews, lowering the chance that people would do my survey.

4) It is potentially harmful to the subject. Apparently, getting negative reviews lowers the chance that Turkers will get to do future tasks. I don't want to take away people's future work because they may have made a mistake.

The situation may be different for people who use Turk to do "real" work, like translations or image recognition, but I feel that, as a researcher, I have to treat the respondents with respect and give them the benefit of the doubt. This, of course, creates a problem for others, but I don't have a good solution to it, which is part of why I only use MTurk for very narrow sorts of supplementary research, never for main findings.
posted by blahblahblah at 2:04 PM on February 16, 2015 [7 favorites]


Money quote from one of the Mechanical Turks: Hamilton summed it up this way: “I’d always considered myself a conservative, but the more of these surveys I’ve taken, I’ve realized I’m more of a moderate liberal. I’d always considered myself a Christian, but now I consider myself more of an agnostic. You just can’t fill out 40,000 surveys and not get a better sense of what you do and don’t think.”

There was a Planet Money episode about Mechanical Turk workers that covered a little more of the financial side. It seems like it's hard to make more than a few dollars an hour doing this. The lady in the video from the PBS link is remarkably decent, taking time to alert the survey authors when she finds a repeat question or other flaw.

The whole system is tremendously amazing but despicable at the same time. Universities and whoever else get their tasks done, Amazon gets a cut and the independent contractor-status Turks make $2 or $3 an hour.
posted by skewed at 2:06 PM on February 16, 2015 [6 favorites]


This seems like such terrible science.

No, no, no. This is Disruption!
It'll all be fine.

Really!

Trust the geeks.
posted by Thorzdad at 2:09 PM on February 16, 2015


You know, if it's not paying a reasonable wage (but is also not volunteer), you can make a pretty good argument it should be disallowed under ethical guidelines. Not to mention that it's obviously non-representative on the surface of it.
posted by Mitrovarr at 2:10 PM on February 16, 2015 [13 favorites]


I started doing mturk last week. I have mostly done transcription and image categorization batches, but I have done maybe a total of ten academic or marketing surveys. Yes, I do them as quickly as I can, and probably make attention-check mistakes. The recommended wage on mturk is 10 cents per minute, but many of these surveys are put out at a lower rate, like twenty-five cents for twenty minutes of my time. I only do the surveys that offer a rate close to or above 10 cents/minute, and I do them as quickly as possible, because usually the estimated time to complete is an underestimate. If you are only paying me twenty-five cents, I'm going to do all that I can to not spend more than three minutes on your survey.

And honestly, I'm a little horrified by the sloppy survey design apparent in many of these. The other day I did an academic survey and the first question was "do you think a non-English speaker would have an easy or difficult time understanding the following paragraph"... which was in English.
posted by wrabbit at 2:21 PM on February 16, 2015 [1 favorite]


Been outsourcing my MeFi comments to Mechanical Turk for a few years now, no one's noticed.
posted by sammyo at 2:21 PM on February 16, 2015 [9 favorites]


You know, if it's not paying a reasonable wage (but is also not volunteer), you can make a pretty good argument it should be disallowed under ethical guidelines. Not to mention that it's obviously non-representative on the surface of it.

Both of these are good points, but they are also problematic.

On the wage side, I agree about the danger of exploitation, but I don't think I am trying to pay people's wages when I ask them to do a survey. There is a market. Survey-taking is voluntary, and, if the money you offer is low, people won't do your survey. I don't feel like I am exploiting people in this situation, because I don't believe that my MTurk use is undercutting someone's wage. Perhaps a few fewer undergraduates get lab money, but otherwise MTurk is generally about being able to do more experiments, faster, not about creating substitutes. Besides, I think most researchers pay at or above a reasonable wage, but a five-minute survey at reasonable wages is still less than $1.

On representativeness, the issue is what you are trying to represent. A lot of the social science work being done on MTurk is trying to get at basic human behavior - how individuals react to certain stimuli, how they value particular things, how they process particular problems, etc. To that extent, representativeness may not be as big a concern. In fact, there is a strong argument that Turkers represent a wider swath of the population than the undergraduate students who have long been doing surveys.

Similarly, experiments, where one group gets a particular treatment and another does not, do not always need representative groups, because you are comparing treatment to control. As long as those two populations are the same, it doesn't really matter as much whether the groups represent society overall, at least in some cases.
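
A toy illustration, with invented numbers - randomize within whatever sample you have, and the difference in means recovers the treatment effect for that sample, representative or not:

import numpy as np

rng = np.random.default_rng(1)
n = 2000

# A deliberately unrepresentative sample (say, heavy-Turker types).
baseline = rng.normal(loc=2.0, size=n)

# Random assignment: half treatment, half control.
treated = rng.permutation(n) < n // 2
true_effect = 0.5
outcome = baseline + true_effect * treated + rng.normal(size=n)

# Difference in means estimates the effect for THIS sample;
# generalizing beyond it is a separate claim.
estimate = outcome[treated].mean() - outcome[~treated].mean()
print(round(estimate, 2))  # ~0.5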

As long as you are clear about your sampling strategy, this is not really an issue. The real problem is professional test takers. There, MTurk actually creates a stricter standard, since so many people have done so many tests. That makes experiments work less well, resulting in false negatives, which is problematic, but maybe not as much as false positives. But it also creates a strange group of respondents who may respond in weird ways.

That is why the issue is challenging. MTurk is great for some things, terrible for others. Educated scientists and good peer review are the usual ways to stop issues from occurring, and I think they work here too.
posted by blahblahblah at 2:24 PM on February 16, 2015 [3 favorites]


One potential problem here is that peer reviewers in the social sciences find it much easier to criticise papers based on the information that authors provide about their subjects than based on information that is not provided because it is unknown. A study which involves 20 psychology students in a lab environment - where 15 are female - will draw all kinds of comments about gender bias. But recruit 100 turkers of unknown background and test environment - and everything seems just fine!
posted by rongorongo at 2:28 PM on February 16, 2015 [2 favorites]


It seems like it's hard to make more than a few dollars an hour doing this.

Yeah, I tried to do this at a previous job, since the job involved a lot of down time and it seemed like an easy way to make some beer money, but I eventually decided it was just more trouble than it was worth for the paltry sums available. Even with a subreddit telling me which jobs paid the best, I wasn't getting enough to overcome my natural (or possibly preternatural) laziness.
posted by Steely-eyed Missile Man at 2:51 PM on February 16, 2015 [1 favorite]


Oh look! A bunch of people who can do science better than scientists! If only they read internet comments they would know what to do!

When preparing an experiment involving sampling, avoiding biased selection from your population should be your top priority

Bias isn't an inherent property of a sample. If a sample is biased, that only matters for inference, and conditioning on a number of covariates can solve the problem, depending on what you are trying to achieve.
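
For example, a minimal sketch of one standard fix, post-stratification weighting on a single covariate (the strata, shares, and data are all invented):

import pandas as pd

# Hypothetical survey responses with one covariate (age bracket).
sample = pd.DataFrame({
    "age": ["18-29", "18-29", "30-49", "30-49", "50+"],
    "y":   [4, 5, 3, 3, 2],
})

# Known population shares for the same strata (e.g., from census data).
population_share = {"18-29": 0.2, "30-49": 0.4, "50+": 0.4}

# Weight each respondent by population share / sample share, then
# take the weighted mean instead of the raw mean. Here the weights
# sum to n, so a plain mean of y * weight is the weighted mean.
sample_share = sample["age"].value_counts(normalize=True)
weights = sample["age"].map(lambda a: population_share[a] / sample_share[a])
print((sample["y"] * weights).mean())  # reweighted estimate
print(sample["y"].mean())              # raw mean, skewed toward the young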
posted by MisantropicPainforest at 2:59 PM on February 16, 2015 [1 favorite]


I find all this very exciting, I wonder if we will eventually see people so thoroughly acculturated that they mimic predicted outcomes to avoid being kicked from studies.

I'd echo the thought above that having turkers do skilled (or at least rewarding) labor in terms of coding and analysis is a wonderful thing. There are many examples, but one of my favorite tools is vatic.

That said, I believe you can get interesting results using mturk for experimental tasks. I think the issues raised above have more to do with a poor understanding of the population of possible participants and use of methods which are particularly unsuitable.
posted by ethansr at 3:04 PM on February 16, 2015


I don't feel like I am exploiting people in this situation, because I don't believe that my MTurk use is undercutting someone's wage.

And yet, it is, and this is why mTurk is not a good place for academic research. If I'm doing your survey, I'm not doing HITs that pay a fair wage and ensure I feed my children or pay my mortgage. mTurk is a labour platform, not a subject pool, and the people there are called workers by Amazon, not participants.

They are not observed, so you have no idea if they're being honest about demographic questions or if they're even paying any attention. In fact, the pay makes them more likely to lie just to be sure that they don't waste time doing the demographic questions only to be disqualified. They satisfice and have scripts that allow them to pick out attention check questions based on their wording (I used novel ACs and had 50% get them wrong on one study, 33% on the other.) They sit on surveys to make it appear they took longer than they did, to ensure you don't drop the pay any more than you have, so timing them gives false results. They have the CRT memorized. They work hard to grab HITs before anyone else, ensuring your subject pool is the same as my subject pool and Bob's subject pool and Myrna's, too (10% of Turkers do 80% of all HITs.)

These are the dirty little secrets that have been ignored because researchers don't get input from the workers themselves, and my worry is that once these secrets become public knowledge it will lead to more and more data being second-guessed because it was gathered using mTurk with outdated methods that just don't work there.
posted by spamgirl at 3:14 PM on February 16, 2015 [36 favorites]


I used to do Mechanical Turk when watching TV in the evening and I really liked doing the image categorization ones. I did a whole bunch for what became "whatbird," where you can choose what colors the bird you saw was, and it pops up a list of birds that color? I probably looked at 100s of pictures of birds and said what color its wings or head or whatever was, it was a great "during the commercials" task and quite pleasant. Paid me half a cent per picture, I think, and I felt like I was doing something useful. (And every time I use whatbird I get excited again that I had the teeniest part in identifying all those birds.) The surveys were UNBEARABLY tedious, even when they paid well. It just wasn't worth it.

Now I do "zooniverse" tasks when I'm looking for something like that ... they have plenty of pleasant image ones where I feel like I'm doing something useful, and frankly the couple of dollars I made wasn't worth the hassle.
posted by Eyebrows McGee at 3:23 PM on February 16, 2015 [5 favorites]


Spamgirl - welcome to MetaFilter, and your comment was terrific. I suspect, for those reasons, that we will see MTurk get treated with more and more suspicion. A research-centered alternative would be a good idea...
posted by blahblahblah at 3:37 PM on February 16, 2015


Another academic here, one who has never used MTurk but has considered it. The presentations I have seen where MTurk data was reported seemed to tout, and be particularly proud of, the sample size - and I had taken this as a good thing as well. If you have 28,000 people giving you ratings on property X, that seems like a sample with enough systematic variability that the error variability would wash out. Now I wonder whether even the large sample size is a good thing, because those 28,000 people may be representative of nothing but the people who do work for MTurk. This strikes me as no better than having a large sample of freshmen taking Intro to Psych. This was a great fpp. Thanks!
posted by bluesky43 at 4:14 PM on February 16, 2015 [1 favorite]


And yet, it is, and this is why mTurk is not a good place for academic research. If I'm doing your survey, I'm not doing HITs that pay a fair wage and ensure I feed my children or pay my mortgage.

I run quite a lot of studies on Turk, and I'm sort of curious whether you think this is universally true. I'd definitely like to see the typical Turk pay rate shift upwards, and a lot of academic requesters seem to do a very poor job of recognising that Turkers are a labour force. On the other hand, I think that the guidelines posted at we are dynamo are pretty good ones, and I think a lot of my ethical reservations would start to dissipate if they were more widely adhered to.

I think the issues raised above have more to do with a poor understanding of the population of possible participants and use of methods which are particularly unsuitable.

I strongly agree with this. For instance, I would never consider putting the "bat and ball" problem in a Turk study. It's too widely-used and it's highly susceptible to practice effects. You cannot give this to non-naive participants, and you cannot safely treat Turkers as naive participants anymore. I'm sort of appalled that anyone thinks otherwise at this point in time.

Similarly, I'd never consider Turk as a solution to problems of representativeness. Although Turkers are actually a lot more diverse than our undergrad population, they're still a pretty narrow slice through the general populace. If that matters for your research question (it doesn't always) then you need to attach a lot of caveats to anything you do with Turk.

But not everything runs afoul of these limitations. I find that I have quite a lot of studies that involve short tasks that are rarely (if ever) used in the literature, and don't generally require highly representative samples because the qualitatively important characteristics of the data are more or less invariant across people. For those things, Turk works really well.
posted by mixing at 4:35 PM on February 16, 2015 [2 favorites]


The recommended wage on mturk is 10 cents per minute

That's only $6 an hour, and more than a dollar lower than the US minimum wage. Is it worth it?
posted by cosmic.osmo at 4:43 PM on February 16, 2015 [1 favorite]


That's only $6 an hour, and more than a dollar lower than the US minimum wage. Is it worth it?

Not to me, but I'm unemployed and right now it feels nice to make a few dollars each day even if it's not sustainable.
posted by wrabbit at 4:51 PM on February 16, 2015


Spamgirl, mixing, etc. - this is great stuff; as a relatively lightweight MTurk user, I find your comments helpful. Besides we are dynamo, any suggestions as to what to read for further tips?
posted by blahblahblah at 4:54 PM on February 16, 2015


blahblahblah you're threadsitting in a kind of weird way
posted by ryanrs at 5:08 PM on February 16, 2015


oops. You are right. I think the weirdness is that (a) I have a professional interest and (b) there are lots of people here with more experience than me and I want to learn from them.

Very sorry, I should know better, thanks for pointing it out. Quiet now.
posted by blahblahblah at 5:34 PM on February 16, 2015


avoiding biased selection from your population should be your top priority

I think part of the problem is that social science academics see the usual alternative as running experiments and surveys on populations of undergraduate students from their own universities, so they grasp at Mechanical Turk as a less biased distribution route.

Also the question of pay skewing results is not specific to MT. We pay undergraduates usually around $10 to do a survey or experiment, or at least give them a coffee voucher. Or in some universities they get course credit for it. Either way, it leads (potentially at least) to people just showing up to get the money or the credit, without caring about giving honest responses or paying attention to the instructions.

I don't think either method is right, but I'm not convinced that all the problems with Mechanical Turk are new ones.

What does surprise me, though, is that ethics committees approve the use of MT. I'm pretty sure ours wouldn't. The low pay rate would worry them, as would the fact that you have no idea who is actually on the other end of the computer.
posted by lollusc at 6:14 PM on February 16, 2015 [1 favorite]


I also do experiments on Mechanical Turk, although I'm in cognitive science rather than social science. I think that makes a big difference, because our experiments (a) tend to be things like basic categorisation, learning new patterns, or similarity judgments, which should be similar among all humans; and (b) usually use new experimental paradigms that don't have analogues in the published literature and haven't been done before, so you don't get the same kind of distorting effects from "professional survey takers."

We also pay far above the average Turk standards. We do this partly because I wouldn't be able to look myself in the mirror otherwise, partly because we don't want to run afoul of ethics committees, partly because that is a good way to get good subjects: if you get a reputation as a poor requester then the good participants won't bother.

We also have attention check questions - on average 20% of people fail these. We usually do still pay them, for the reasons blahblahblah enumerates, but don't include those people in the final dataset: that removes a huge amount of bias or noise from the data. Even with that high of a failure rate, it's still worth it to run MTurk studies. First of all, unwilling and unengaged undergrads in my experience don't provide any higher quality of data (in fact it is often poorer) and they are much narrower demographically (in terms of age, education, and ethnic background at least). The main factor, though, is that you can get much larger sample sizes with MTurk and much more quickly, so you can ask far more interesting questions and be far more confident that whatever result you have is at least statistically robust.
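
The filtering step itself is nothing fancy - a sketch, with invented column names:

import pandas as pd

df = pd.read_csv("results.csv")  # hypothetical export of responses

# Everyone gets paid, but only attention-check passers are analysed.
passed = df["attention_check"] == "correct"
print(f"excluding {(~passed).mean():.0%} of respondents")

analysis_sample = df[passed]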

It's not a cure-all, and I think there are many ways to do MTurk wrong (many of them detailed in the FPP) but by no means is it a no-brainer to think that MTurk is always worse than the alternative either.
posted by forza at 6:52 PM on February 16, 2015 [2 favorites]


As an undergrad, years ago, we had a required "donation" of a minimum of 2 hours of time per credit-hour to the psych experiments (per enrollment in the psych class). Guaranteed participation, but I sure went through it gritting my teeth with a wtf, whatever-gets-me-out-of-here-faster attitude.

I recall this every time I read some new "groundbreaking" psych research, and it doesn't speak well for the future of the science. Deservedly so. (Even the 101 prof notes "we don't go back and review if previous results are legit" - which I believe was covered on the blue, where some group was trying to go back and replicate studies/papers, and having a really crappy success rate...)
posted by k5.user at 6:55 PM on February 16, 2015


"Is it worth it?"

My mturk account information tells me that I did 142 tasks, with only 1 rejected, for a whole $10.31. Paid for a couple ebooks.
posted by Eyebrows McGee at 7:21 PM on February 16, 2015


Something about the whole thing sounds very Snow-Crash-BladeRunner. Just the name alone has always made me double-take inside my brain-box.

How many automated mechanical turkers are out there? How many "benefit of the doubt," "non-naive," "not paying attention" survey takers are having how much influence when they aren't weeded out?

We already know that poisoning survey data is a very slick way of subtly ruining a thing, like a company or a study. Now, add weird cyberpunk seasoning.
posted by aydeejones at 7:36 PM on February 16, 2015


Anyone remember All Advantage, BTW? I scammed it for 5 bucks a month or something before I got busted. This is much more meaningful work with a wider breadth, but it gives me a similar vibe. I like the idea of people making this money in countries where it really makes a difference, but am mostly ambivalent.

I got a free computer through a similar outfit to "AllAdvantage" (we pay you to surf the web!) called "FreePC" that was later bought out by e-Machines before my year of using-a-computer-with-ads-everywhere ran out. Naturally it was the only computer I ever owned (a Compaq) that actually failed miserably when I needed it most (well, my future wife, for a research paper she ended up doing in my dad's basement, that was weird).

I got a similar vibe when MT debuted, and then was like HOW CAN I GAME THIS and realized I would have to be a cyber-punk and have bad faith and all that jazz.

Also reminds me of the story of the guy who outsourced his programming job to Asia and sent the guy his RSA encryption key fob.
posted by aydeejones at 7:40 PM on February 16, 2015


Thorzdad: "No, no, no. This is Disruption!
It'll all be fine.

Really!

Trust the geeks."

To be fair, Turk was not designed for the social sciences. It was primarily designed to be an automation marketplace. The documentation on Turk I read made sure to indicate that you should be doing some base validity checks and handing out the same HIT to multiple people for verification. The tasks I saw were things like "find the property tax paid" or "is this a photo of 2144 52nd street" - things with an objective ground truth that you can use to rate participants' honesty and avoid collusion. If you're upset that people are writing software to solve your HITs more efficiently, well, it came with an API for a reason.
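
In Python, that redundancy knob looks roughly like this - a sketch against the boto3-style MTurk client, with the endpoint, URL, and task all invented/illustrative:

import boto3

# Sandbox endpoint, so nothing real is posted while testing.
mturk = boto3.client(
    "mturk",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

question_xml = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/verify-address</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
"""

hit = mturk.create_hit(
    Title="Is this a photo of 2144 52nd Street?",
    Description="Look at the photo and answer yes or no.",
    Reward="0.05",
    MaxAssignments=3,  # the same HIT goes to 3 workers for cross-checking
    AssignmentDurationInSeconds=600,
    LifetimeInSeconds=86400,
    Question=question_xml,
)
print(hit["HIT"]["HITId"])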

That said, I really liked the Turk study on passwords, which purported to be a study on learning but was really a memory test of your password.
posted by pwnguin at 7:48 PM on February 16, 2015 [1 favorite]


forza: "First of all, unwilling and unengaged undergrads in my experience don't provide any higher quality of data (in fact it is often poorer) and they are much narrower demographically (in terms of age, education, and ethnic background at least)."

Been wondering lately, what about the unemployment office? I realize there's some complex interactions with the UI and some demographics skew, but the demographics aren't so bad when compared with ugrad freshmen, and perhaps some form of UI exemption could be made In The Name Of Science for surveys less than 1 hour and paying less than X dollars administered at the unemployment office or something.
posted by pwnguin at 7:53 PM on February 16, 2015


The name comes from a supposedly chess-playing automaton in the 18th century, where the playing was really done by someone hidden inside the built-in table, using special magnets. I'm not sure what it says that your test subjects are coming from a source named after a famous scam that fooled Napoleon, among others.
posted by carolr at 8:25 PM on February 16, 2015


And yet, it is, and this is why mTurk is not a good place for academic research. If I'm doing your survey, I'm not doing HITs that pay a fair wage and ensure I feed my children or pay my mortgage.

It is going to select for those who don't have something better paying at that moment. Unless you're forced to take them, you should probably choose something more profitable for yourself. (Or maybe you can get Koch funding to promote an agenda!)
posted by five fresh fish at 9:38 PM on February 16, 2015


How long until one is able to train computer programs using MTurkers to have human gut reactions?
posted by waninggibbon at 9:46 PM on February 16, 2015


Been wondering lately, what about the unemployment office? I realize there's some complex interactions with the UI and some demographics skew, but the demographics aren't so bad when compared with ugrad freshmen, and perhaps some form of UI exemption could be made In The Name Of Science for surveys less than 1 hour and paying less than X dollars administered at the unemployment office or something.

As a PI nudging my social science research (which should be really fucking obviously exempt) through IRB this week, the very idea sends cringes of fear through my body. I have made it my life's mission never to do research involving prisoners, the elderly, or children because it's not worth it. When you take the required compliance training, it's basically a shocking reminder how much exploitation has been done in the name of research over the years (also the Belmont Report and some very boring scenario videos).

So therefore, I can't imagine my IRB--yours may be more rational--giving anything other than the hairiest eyeball to the unemployed as research subjects.
posted by librarylis at 9:51 PM on February 16, 2015


Definitely check out the Planet Money piece on Mechanical Turk; they put out a task with a secret message and interviewed about thirty Mechanical Turk workers. They seemed to include a heavy helping of people who aren't able to hold other jobs, or who are from outside the US, which is probably both obvious and interesting. Upping the wages sounds like a great idea; the per-minute rate shouldn't be allowed to fall below the minimum per-hour wage...

I can't imagine relying on it for social science surveys, but it seems fantastic for producing labels for machine learning systems.

I saw a pretty fascinating talk from a Berkeley student last week about optimizing MTurk usage. It fell into two parts: one on explicitly estimating error bounds on the responses received, and the other on doing demographic analysis to find subsets of workers who give better results. The latter has me envisioning a dystopic future where one needs a university degree to label images...

My favorite part of the talk, though, was the note that if you can identify malicious workers, you can still gain information from their wrong answers. (Especially for binary classification problems!)
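
Concretely, for binary labels: weight each worker's vote by the log-odds of their estimated accuracy, and a reliably wrong worker gets a negative weight, so their answer is effectively flipped. A sketch, assuming accuracies were estimated from gold-standard questions:

import math

def weighted_vote(votes):
    # votes: list of (label, accuracy) pairs, with label in {0, 1}.
    # A worker at 90% accuracy pushes strongly toward their answer;
    # a malicious worker at 10% pushes just as strongly away from
    # theirs, so their wrongness still carries information.
    score = 0.0
    for label, acc in votes:
        weight = math.log(acc / (1 - acc))
        score += weight if label == 1 else -weight
    return 1 if score > 0 else 0

# Two mediocre honest workers vs. one reliably wrong worker:
print(weighted_vote([(1, 0.6), (1, 0.6), (1, 0.1)]))  # -> 0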
posted by kaibutsu at 10:10 PM on February 16, 2015


It occurs to me that when driverless cars come along, FedEx drivers will become MTs who also handle parcels at times scattered throughout the day.
posted by five fresh fish at 11:20 PM on February 16, 2015


I don't feel like I am exploiting people in this situation

This genuinely surprises me, and I wonder if it's a cultural difference between academia and non-academia, or comes from my being in a country (Australia) with a minimum wage that until recently was around 19 US dollars, or what.

But for me, paying someone below the minimum wage in my country makes me feel pretty iffy (I acknowledge the hypocrisy of this when it comes to buying clothing and other consumer goods).

I think it's bullshit when people do it here with fruit picking paid by the basket, or commission-based stuff like signing people up at the gym, or paying people by the amount of junk mail delivered, etc., that effectively pays >90% of workers below the minimum wage.

I find the defense that people are doing it for funsies or whatever pretty iffy, too. That may well be the case, but they are being paid as - being treated as - employees. If it's for funsies, make it free, or give them vouchers that are clearly not a wage substitute.

But then, you get into the whole murky issue of subcontracting etc which is a whole other problematic area.

I would be uncomfortable knowing that I was directly perpetuating a system structured to pay my fellow citizens less than the minimum wage. I don't want to get soppy, but people fought and died for the minimum wage; I'd hate to think I was contributing to its erosion, both as a buyer and as someone who spent many years on the minimum wage.
posted by smoke at 2:19 AM on February 17, 2015 [7 favorites]


As a PI nudging my social science research (which should be really fucking obviously exempt) through IRB this week, the very idea sends cringes of fear through my body. I have made it my life's mission never to do research involving prisoners, the elderly, or children because it's not worth it.

I honestly don't understand how using undergrad psychology students gets through IRB. I realise there's usually some option on offer that enables you to opt out of participating and still get the points for the class, but it's inevitably sufficiently dull-sounding, if not sufficiently arduous, that being a research subject sounds more appealing, which feels rather coercive.
posted by hoyland at 4:29 AM on February 17, 2015 [1 favorite]


I serve on or observe on a couple of university IRBs, one of which pretty regularly grapples with mTurk issues. There's probably not that much I can say about that publicly but I think it's safe enough to just note that this is very much an active area of discussion about research ethics, the policies on mTurk and similar sites are evolving with each new example the board sees, and everyone's trying to figure out what the right answers are to protect research participants without stifling scientific investigation.

An interesting flip side of this issue that we're grappling with is that it's great that mTurkers have built some structure outside of Amazon to share information about good and bad researchers - but this can be really problematic when they are not very rigorous about who they're sharing information about. We had a pretty bad situation a while back when someone at another university, sharing a name with one of our researchers who also does mTurk work, was treating his participants badly. Someone on one of these information-sharing sites Googled badly and decided it was our researcher who was doing the bad stuff, and whipped up a frenzy that stretched into real life, not just angry e-mails. We eventually proved to the person in question that they'd pointed the finger at the wrong researcher, and they very kindly retracted their complaint publicly, but that didn't fully mitigate the damage done.

So I'm observing the IRB working really hard to figure out how to protect mTurkers from researchers, and also researchers from mTurkers, without much in the way of guidance or best practices yet from any of the authorities in this area. It's an interesting and somewhat nervewracking area.

On the psych undergrad front, lo these many years ago, I was much happier to do some research than the alternatives, for sure. I think some of what makes it acceptable for at least some IRBs is that the students don't truly have to *do* the research. At least here, all the student has to do is show up for credit. If they for any reason do not want to participate in the actual research, even if that reason is just "I'm only in it for credit", they can just decline and walk away and still get credit.
posted by Stacey at 5:33 AM on February 17, 2015 [1 favorite]


The first time I did an MTurk survey, I tried not paying several people who I felt were being exploitative (getting questions wrong AND doing the survey sloppily, by doing things like answering all questions the same way). Almost all wrote me elaborate and angry emails explaining how they never messed up before and how I was being a jerk. It absolutely wore me out, and I had to argue with them that they didn't take the survey seriously, resulting in hour-long fights over $0.50!

Fighting an hour over fifty cents? Doesn't Bangladesh pay better wage rates than that?
posted by jonp72 at 7:07 AM on February 17, 2015


I vowed not to thread-sit, but the last couple of posts that quoted me above about exploitation made me sound like a Koch brother. To be clear, for all the reasons I stated, I pay at least as much per hour as I would for lab subjects, though it can be tough to estimate timing - survey takers are faster than you expect, and Turkers usually end up getting paid relatively more. I'll leave the discussion of the (interesting) larger issue of exploitation and the long-term effect on wages for a future thread where I have not already chimed in too much.
posted by blahblahblah at 8:30 AM on February 17, 2015 [2 favorites]


librarylis: "I have made it my life's mission never to do research involving prisoners, the elderly, or children because it's not worth it."

To be clear, I wasn't asking for psych research to be a mandatory part of UI, merely a supplemental source of income that wouldn't interfere with UI benefits, one made as easy as possible to participate in. I realize that money is coercive, and that the unemployed respond better than average to money incentives. And seeking out the unemployed has some potential adverse political interactions.

It seems Mechanical Turk is giving researchers - or at least the IRB - the option of pretending not to know that my suggestion is, in essence, what they're already doing.
posted by pwnguin at 9:06 AM on February 17, 2015




This thread has been archived and is closed to new comments