One question my colleagues and I have discussed several times is whether to use a screening survey. I highly encourage using one, and I explain why below.
Fairness to Workers
In one of our studies, we wanted to recruit people who were at least 18 years old, currently employed (but not self-employed), and working at least 38 hours per week. We used a screening survey that paid 5 cents and took about 30 seconds to answer. Approximately 1,600 people completed it, and about 38% of them qualified to take our second survey, which contained the items for our actual research study. Without the screening survey, about 990 individuals who did not meet our demographic requirements would have been able to complete the full survey.
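To make the yield concrete, here is a quick back-of-the-envelope calculation using the figures above (the 1,600 and 38% values come from the study; the variable names are just for illustration):

```python
# Back-of-the-envelope yield from the screening survey.
# Figures taken from the study described above.
screened = 1600          # people who completed the 5-cent screener
qualify_rate = 0.38      # fraction meeting all eligibility criteria

qualified = round(screened * qualify_rate)   # eligible for the full survey
unqualified = screened - qualified           # screened out

print(qualified, unqualified)  # roughly 608 qualified, ~990 screened out
```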
If we had not used a screening survey, we would have had to set the parameters of the survey to disqualify people based on their responses. The problem with this approach is that disqualified people are not paid for their responses: even if they answered only a few items before being disqualified, they still participated but were not compensated. We could also have asked people to self-select out of the study, but this provides little control; participants may continue even if they do not meet the requirements.
The screening survey also allowed me to maintain a positive reputation in the Mechanical Turk community by avoiding one common Requester mistake: expecting to pick and choose which Workers are paid based on their responses to the surveys. Workers expect to be paid for their time, regardless of whether their demographics fit those required by a research study. That is, researchers cannot survey 1,600 people and pay only the 384 individuals who meet the requirements. It is appropriate to reject work based on quality issues, but not because an individual, for instance, does not work 38 hours per week. This point also has an implication for the use of logic in survey design. Some survey tools automatically disqualify respondents based on their answers, and some may argue that this technique circumvents the need for a screening survey. However, people's participation up until the point of disqualification should still be rewarded. Thus, researchers using this technique should consider the psychological consequences of disqualifying participants as they progress through the survey (and after they have invested at least some time in the task).
The screening survey also benefited the data analysis process because less data had to be examined. In terms of collecting high-quality data, the screening survey eliminated some of the participants who did not meet three of the decision criteria (i.e., working more than 38 hours per week, having more than two months' tenure with one's current employer, and not being self-employed). Interestingly, however, it could not eliminate all unqualified participants, as a few of them provided inconsistent information on the two surveys. This suggests that some Workers may not respond accurately (for a variety of reasons) and that the screening survey helps to identify these individuals. Overall, instead of paying $1,600 to collect an appropriate sample, the present study paid about $555 (1,600 participants x $.05 + 475 participants x $1.00).
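The cost savings can be verified with simple arithmetic (the $0.05 screener rate, the $1.00 full-survey rate, and the 1,600 and 475 participant counts are taken from the study above):

```python
# Cost comparison: two-stage screening vs. paying everyone the full rate.
# Figures from the study: 1,600 screeners at $0.05; 475 full surveys at $1.00.
screener_cost = 1600 * 0.05                       # $80.00 for screening
full_survey_cost = 475 * 1.00                     # $475.00 for the main study
with_screener = screener_cost + full_survey_cost  # $555.00 total

# Counterfactual: all 1,600 respondents take (and are paid for) the full survey.
without_screener = 1600 * 1.00                    # $1,600.00

print(f"${with_screener:.2f} with screening vs ${without_screener:.2f} without")
```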
The second interesting observation is that the speed at which data are collected on MTurk can be affected by a variety of factors, including pay rate, the number of available HITs, and the pool of qualified Workers. During the pilot study, for instance, it became obvious that the initial pay rate of $.01 for the screening survey would not be enough to attract participants. It was also obvious that the initial criteria for Workers to view the HIT were too strict, as only those who had completed 1,000 HITs were granted access. Relatively few people could meet these criteria, and data collection was estimated to take longer than expected. Thus, the criteria were relaxed for the actual study. Admittedly, doing so may have lessened the quality of the data. However, it is reasonable to expect that people who work at least 38 hours per week may not have the time to complete 1,000 HITs before becoming eligible for the questionnaires.
Another factor potentially influencing data collection speed is the timing of data collection. Data for our study were collected from November 2012 to January 2013, spanning several popular U.S. holidays (e.g., Thanksgiving, Christmas, Hanukkah, New Year's). Approximately half of the data was collected during the seven days after Christmas (December 26 to January 1). During these holidays, employees are known to take paid and unpaid vacation from work, leaving extra time to complete HITs on MTurk. In addition, most participants responded during evening hours and on weekends (in U.S. time zones). These findings are not surprising given that participants were required to be working full-time; the majority likely worked during the day, Monday through Friday. Thus, researchers who use MTurk or other online platforms may want to consider holidays and work schedules when planning the data collection process.