A paper titled “Designing Incentives for Inexpert Human Raters” describes an experiment that analyzed 14 different incentive structures for encouraging high-quality work in Mechanical Turk-style systems. The researchers found that workers perform most accurately when the task design credibly links payoffs to a worker’s ability to reason about the answers their peers are likely to provide.
Roughly 2,000 individuals participated in the study, yielding over 100 subjects in each experimental condition. Workers answered five questions drawn from an earlier study for which validated data existed.
The two most successful incentive methods were the “Bayesian Truth Serum” (BTS) and “Punishment – disagreement” conditions, each of which improved average worker performance by nearly half a correct answer over the control group’s average of 2.08 correct answers.
The BTS incentive method asks workers not only to perform the task themselves but also to predict the distribution of other workers’ responses; they are told they will receive a bonus if their predictions are accurate. The other successful method penalizes workers whose answers disagree with those of their peers.
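To make the BTS mechanism concrete, here is a minimal sketch of how such a score can be computed, following Prelec’s published Bayesian Truth Serum formula (an information score for giving “surprisingly common” answers plus a prediction score for forecasting the group’s answer distribution). The function name, argument layout, and the `alpha` weight are illustrative assumptions, not details from the paper itself:

```python
from math import log

def bts_scores(answers, predictions, alpha=1.0, eps=1e-9):
    """Sketch of Prelec-style Bayesian Truth Serum scoring.

    answers:     one chosen option index per respondent.
    predictions: one predicted answer distribution per respondent
                 (a list of probabilities over the options).
    alpha:       weight on the prediction-accuracy term (assumed).
    """
    n = len(answers)
    k = len(predictions[0])
    # x_bar[j]: actual fraction of respondents who endorsed option j.
    x_bar = [max(sum(1 for a in answers if a == j) / n, eps) for j in range(k)]
    # log_y_bar[j]: log of the geometric mean of predicted fractions for j.
    log_y_bar = [sum(log(max(p[j], eps)) for p in predictions) / n
                 for j in range(k)]
    scores = []
    for a, p in zip(answers, predictions):
        # Information score: reward answers more common than predicted.
        info = log(x_bar[a]) - log_y_bar[a]
        # Prediction score: reward predictions close to the actual frequencies.
        pred = alpha * sum(x_bar[j] * (log(max(p[j], eps)) - log(x_bar[j]))
                           for j in range(k))
        scores.append(info + pred)
    return scores
```

In a deployment, a requester would pay bonuses proportional to these scores; truthfully reporting one’s own answer and honest predictions is the equilibrium strategy under the mechanism’s assumptions.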
The authors acknowledged that because requesters on MTurk operate with little oversight, workers are more likely to respond to financial incentives than to stated promises. In this sense, the marketplace has structured the interaction between workers and requesters in a way that may limit the opportunities to harness motivations that are not linked to money in some explicit way.
The study is an interesting parallel to another recent study, published in the Proceedings of the National Academy of Sciences, in which research conducted with Swiss students found that the crowd produced accurate collective estimates only when participants were not kept informed of others’ opinions. When participants were told other people’s opinions during the experiment, the wisdom of the crowd fell short.