This repository has been archived by the owner on May 24, 2019. It is now read-only.

Circumventing Batch Pricing

Thomas J. Leeper edited this page Jun 6, 2016 · 10 revisions

In July 2015, Amazon increased the price of MTurk HITs that include 10 or more assignments. In short, MTurk charges an additional 20% commission on top of the 20% base commission when a HIT contains 10 or more assignments. This disproportionately affects academic requesters doing large-n survey-experimental research. This tutorial explains how to post a sequence of HITs, each with 9 or fewer assignments, in order to collect 10 or more completed assignments in total without incurring this extra charge.
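To make the cost difference concrete, here is a minimal sketch of the fee arithmetic in base R. It assumes only the rates described above (20% commission for HITs with fewer than 10 assignments, 40% otherwise) and ignores any other charges. `hit_cost` is a hypothetical helper, not part of MTurkR.

```r
# Hypothetical helper: total cost of one HIT (rewards plus commission),
# assuming 20% commission below 10 assignments and 40% at 10 or more.
hit_cost <- function(reward, assignments) {
  rate <- if (assignments >= 10) 0.40 else 0.20
  reward * assignments * (1 + rate)
}

reward <- 0.20
total  <- 1000

# one HIT containing all 1000 assignments
single <- hit_cost(reward, total)

# the same work as 111 HITs of 9 assignments plus one HIT of 1 assignment
batched <- 111 * hit_cost(reward, 9) + hit_cost(reward, 1)

c(single = single, batched = batched, saved = single - batched)
```

Under these assumptions, 1000 assignments at a $0.20 reward cost $280 as a single HIT and $240 when split into HITs of 9 or fewer.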

The basic strategy is to create a series of HITs, each of which (except for the first) has a QualificationRequirement that blocks workers who have already completed one of the previous HITs from working on it. The example below implements this as a repeat loop in R that is fully self-contained. You may need to modify it for your purposes.
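The sizing arithmetic behind this strategy can be sketched separately from the full MTurkR example that follows. This is only one way to size the batches. The helper name `plan_batches` is hypothetical, and it makes no API calls.

```r
# Hypothetical helper: split a target number of assignments into
# HIT sizes that never exceed `per_batch` (9 keeps each HIT under 10).
plan_batches <- function(total, per_batch = 9) {
  stopifnot(total > 0, per_batch >= 1, per_batch <= 9)
  full      <- total %/% per_batch   # number of full-size HITs
  remainder <- total %% per_batch    # size of the final, smaller HIT
  sizes <- rep(per_batch, full)
  if (remainder > 0) sizes <- c(sizes, remainder)
  sizes
}

sizes <- plan_batches(1000)
length(sizes)   # number of HITs to post
tail(sizes, 1)  # size of the last HIT
sum(sizes)      # total assignments requested
```

For a target of 1000 this gives 112 HITs: 111 of size 9 and a final HIT of size 1.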

Note, in particular, that this example uses an "ExternalQuestion" setup, which requires that your survey tool be able to post back to the MTurk ExternalSubmit URL correctly. If you're not sure if your tool can do that or are unable to configure it correctly, use the "Link and Code" method for connecting MTurk to your off-site survey tool.

# load MTurkR (assumes your AWS credentials are already configured for it)
library("MTurkR")

# set total number of desired assignments
total <- 1000

# create QualificationType
qual <- CreateQualificationType(name="Already completed HIT",
          description="Already completed identical HIT before.",
          status = "Active")
# generate "DoesNotExist" QualificationRequirement structure
qreq <- GenerateQualificationRequirement(qual$QualificationTypeId, "DoesNotExist", "")

# create HITType w/o qualification requirement
hittype1 <- RegisterHITType(title = "10 Question Survey",
                description = "Something something something",
                reward = ".20", 
                duration = seconds(hours = 1), 
                auto.approval.delay = seconds(days = 1),
                keywords = "survey, questionnaire")

# create HITType w/ qualification requirement
hittype2 <- RegisterHITType(title = "10 Question Survey",
                description = "Something something something",
                reward = ".20", 
                duration = seconds(hours = 1), 
                auto.approval.delay = seconds(days = 1),
                keywords = "survey, questionnaire",
                qual.req = qreq) # this blocks past workers

# create first HIT
eq <- GenerateExternalQuestion("https://www.example.com/","400")
hit <- CreateHIT(hit.type = hittype1$HITTypeId,
                 assignments = 9, # IMPORTANT THAT THIS IS <= 9
                 expiration = seconds(days = 4),
                 question = eq$string)

# variable to index number of completed assignments
completed <- 0

# list to store assignments into
allassigns <- list()

# Number of assignments per HIT in each iteration
# Must be <= 9: a HIT with 10 or more assignments is charged 40% commission instead of 20%!
assignmentsPerBatch <- 9

# start the loop
repeat {
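  # look up how many assignments are still unclaimed (Available)
  # or accepted but not yet submitted (Pending)
  g <- GetHIT(hit$HITId, response.group = "HITAssignmentSummary", 
              verbose = FALSE)$HITs

  # check if all assignments in the current HIT have been submitted
  # (Pending alone is 0 on a brand-new HIT that nobody has accepted yet)
  if (as.numeric(g$NumberOfAssignmentsAvailable) == 0 &&
      as.numeric(g$NumberOfAssignmentsPending) == 0) {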
    # if yes, retrieve submitted assignments
    w <- length(allassigns) + 1
    allassigns[[w]] <- GetAssignments(hit = hit$HITId)
    
    # assign blocking qualification to workers who completed previous HIT
    AssignQualification(qual$QualificationTypeId, allassigns[[w]]$WorkerId, verbose = FALSE)

    # increment number of completed assignments by the number actually retrieved
    completed <- completed + nrow(allassigns[[w]])

    # optionally display total assignments completed thus far
    if (isTRUE(getOption("MTurkR.verbose"))) {
      message("Total assignments completed: ", completed)
    }

    # check if enough assignments have been completed
    if (completed < total) {
      # if not, create another HIT, capped so the last batch does not overshoot
      hit <- CreateHIT(hit.type = hittype2$HITTypeId,
                       assignments = min(assignmentsPerBatch, total - completed),
                       expiration = seconds(days = 4),
                       question = eq$string)

      # wait some time and check again
      Sys.sleep(180)
    } else {
      # if total met, exit loop:
      break
    }
  } else {
    # wait some time and check again
    Sys.sleep(30) # TIME (IN SECONDS) TO WAIT BETWEEN CHECKING FOR ASSIGNMENTS
  }
}

# get all assignments for all HITs as a data.frame
m <- do.call("rbind", allassigns)

This will run by itself until the total number of assignments is reached.
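Once the loop finishes, it may be worth confirming that the blocking qualification worked, i.e. that no worker appears in more than one batch. The sketch below uses a small made-up data frame in place of `m`. The only column it relies on is `WorkerId`, which the loop above also uses. `repeat_workers` is a hypothetical helper, not part of MTurkR.

```r
# Hypothetical helper: return WorkerIds that appear more than once
repeat_workers <- function(assignments) {
  ids <- as.character(assignments$WorkerId)
  unique(ids[duplicated(ids)])
}

# made-up stand-in for the combined assignments data.frame `m`
m_example <- data.frame(
  AssignmentId = c("a1", "a2", "a3", "a4"),
  WorkerId     = c("W1", "W2", "W3", "W2"),
  stringsAsFactors = FALSE
)

# workers who submitted more than one assignment
repeat_workers(m_example)
```

With real data, calling `repeat_workers(m)` and getting an empty result would suggest that each worker contributed at most one assignment.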
