Friday, May 04, 2012

save money on crowd labor? build brand affinity!

As many of you know, my off-hours project over the past few months has been a deeper analysis of the connection between worker engagement and brand affinity in a task-based labor market. Based on my hands-on experience with crowdsourcing solutions and a review of the literature, I have arrived at the hypothesis that, in a task-based market, brands with stronger opinion/affinity ratings will have a labor cost advantage over lower-performing brands. The hypothesis rests on what I perceive as a functional equivalence between brand preference in purchasing models (where preferred brands enjoy higher pricing power and/or rates of repurchase) and task selection in the rapidly evolving task-based labor market. My expectation is that brand preference will manifest as a positive task-selection bias, confirmed by lower costs for equivalent work.

The population for this directional research was a community of highly-rated workers on Amazon Mechanical Turk. A total of 720 US-based “Turkers” (as they refer to themselves), each with a completed-task approval rate greater than 90% and more than 500 tasks completed, responded to the survey tool. Additional details on the survey methodology are covered in a prior post. Note that Amazon does not make available worker population counts for Mechanical Turk, so meaningful evaluations of statistical confidence are challenging.

The survey tool (screen capture available here) was constructed to require a participant to select between two theoretical tasks on Amazon Mechanical Turk. The participant was asked to make their choice based on the assumption that the tasks were effectively identical, and that the only difference between the two tasks was the identity of the task requestor. For the requestor, participants were asked to choose from two (2) randomly selected, well-known consumer brands. Participants were also asked how much the requestor of the non-selected task would have to pay in order to have their task chosen over the preferred brand (based on the assumption that their preferred brand was offering $1.00 to complete the task). Finally, participants were asked to provide their opinion of each brand on a Likert scale, ranging from “1 – Poor” to “5 – Excellent”.
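To make the survey's output concrete, each response can be thought of as a small record from which the analysis variables are derived. A minimal Python sketch, with field names of my own invention (they are illustrative, not the survey tool's actual labels):

```python
from dataclasses import dataclass

PREFERRED_REWARD = 1.00  # the preferred brand's offer, fixed by the survey design


@dataclass
class Response:
    chosen_brand: str        # the requestor the Turker selected
    other_brand: str         # the non-selected requestor
    required_reward: float   # dollars demanded to choose the other brand instead
    opinion_chosen: int      # Likert score, 1 ("Poor") to 5 ("Excellent")
    opinion_other: int

    @property
    def cost_differential(self) -> float:
        """How much more the non-preferred requestor would have to pay."""
        return self.required_reward - PREFERRED_REWARD


r = Response("Brand A", "Brand B", required_reward=2.50,
             opinion_chosen=5, opinion_other=3)
print(r.cost_differential)  # 1.5
```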

The survey model was designed to provide a cost differential between the two (2) brands. This variable provides a measure of how much more the non-preferred requestor would have to pay in order to have their task completed instead of the preferred brand's. The following histogram classifies the responses based on the cost differential reported by the respondent.

Cost Variance Histogram

As you can see, Turkers expect significantly more to complete tasks for a non-preferred requestor. Over 77% of respondents expect more than double (i.e., the categories including, and to the right of, “1.00 to 1.99 Additional”) to complete a task for a non-preferred requestor. While this level of difference may not hold over time as task-based markets become more mature, the current environment clearly requires a significant cost premium for those brands lacking in affinity.
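For those reproducing the histogram from the dataset, the binning takes only a few lines. A sketch, with bin edges taken from my reading of the chart categories and sample differentials that are invented for illustration (they are not the survey data):

```python
def differential_bin(diff: float) -> str:
    """Classify a cost differential into a histogram category.

    Bin edges are assumed from the chart labels, not documented in the dataset.
    """
    if diff < 1.00:
        return "Under 1.00 Additional"
    if diff < 2.00:
        return "1.00 to 1.99 Additional"
    if diff < 3.00:
        return "2.00 to 2.99 Additional"
    return "3.00+ Additional"


# Share of respondents demanding more than double the preferred reward,
# i.e., a differential of $1.00 or more (sample values, not the survey's):
diffs = [0.50, 1.25, 1.91, 2.40, 3.10]
over_double = sum(1 for d in diffs if d >= 1.00) / len(diffs)
print(over_double)  # 0.8
```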

Observation: Preferred brands pay a lot less, even when opinions are equivalent

My first exploration involved looking at just those situations where the participant gave both brands the same opinion score (e.g., they rated both “5 – Excellent”). Out of the 720 responses, 215 fell into this category. However, since the participant was required to choose one brand over the other and then provide a cost differential, it was possible to identify what a basic “preference” would cost a company. Based on the instruction that their preferred brand was offering $1.00 to complete the task, participants reported that they would expect an average of $2.91 from the non-preferred brand to complete the task (a cost differential of $1.91). This is nearly triple the cost to the preferred task requestor.

This is a striking result; when a Turker has to choose between two functionally equivalent tasks, requestor preference becomes a significant factor. And, even when the Turker holds similar opinions of both requestors, the preferred requestor has the potential for their work to be completed at much lower cost.
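The equal-opinion cut described above is straightforward to reproduce against the dataset. A sketch in Python, using invented sample values rather than the actual survey responses:

```python
# (opinion_of_preferred, opinion_of_nonpreferred, required_reward_in_dollars)
# Sample values are invented for illustration; they are not the survey data.
responses = [
    (5, 5, 2.75),  # equal opinions -- kept in this cut
    (4, 4, 3.25),  # equal opinions -- kept in this cut
    (5, 3, 3.60),  # unequal opinions -- excluded from this cut
]

# Keep only responses where both brands received the same opinion score,
# then average the reward demanded of the non-preferred requestor.
equal = [reward for o_pref, o_non, reward in responses if o_pref == o_non]
avg_required = sum(equal) / len(equal)
print(avg_required)  # 3.0
```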

Observation: Brand opinion has a role to play, but only to an extent

Extending the initial analysis, I established the concept of “Opinion Distance”, a variable defined as the absolute value of the difference in opinion scores between the two (2) brands presented to the respondent. Based on this definition, Opinion Distance can range from “4” (i.e., Brand A = “5 – Excellent” and Brand B = “1 – Poor”) to “0” (i.e., both brands have the same opinion score). Responses were then categorized and averaged based on their Opinion Distance value.
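The Opinion Distance definition above reduces to a one-line function:

```python
def opinion_distance(score_a: int, score_b: int) -> int:
    """Absolute difference between the two brands' Likert opinion scores (0-4)."""
    return abs(score_a - score_b)


print(opinion_distance(5, 1))  # 4 -- "Excellent" vs. "Poor"
print(opinion_distance(3, 3))  # 0 -- identical opinions, basic preference only
```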

As the following chart indicates, any difference in brand opinion (Opinion Distance greater than zero) results in a significant cost increase over and above the basic preference differential identified in Result 1. At an Opinion Distance of one (1), a Turker expects to receive an average of $3.58 to complete a task for which the preferred brand would pay $1.00, more than tripling the cost to the requestor.

Cost Variance vs. Count of Opinion Distance

Of particular interest is that there appears to be an implicit maximum premium that Turkers are willing to charge for a given task, regardless of brand opinion. While the average reward expectation differentials for Opinion Distance of two (2) and above are greater than basic preference (Opinion Distance of zero), they are actually less than responses with an Opinion Distance of one (1). Additional analysis needs to be performed, but it would appear that worker value models establish a reasonable maximum reward for a given task. The survey included a text response area that I am in the process of reviewing—it is hoped that this unstructured data will provide additional insight and direction for further research.
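The categorize-and-average step behind the chart can be sketched as follows. The sample tuples are invented for illustration (they are not drawn from the survey data), but the grouping logic is the same one applied to the dataset:

```python
from collections import defaultdict

# (opinion_of_preferred, opinion_of_nonpreferred, cost_differential_in_dollars)
# Sample values are invented for illustration only.
responses = [
    (5, 5, 1.75), (4, 4, 2.25),   # Opinion Distance 0: basic preference
    (5, 4, 2.25), (4, 3, 2.75),   # Opinion Distance 1
    (5, 3, 2.25),                 # Opinion Distance 2
]

# Group cost differentials by Opinion Distance, then average each group.
by_distance = defaultdict(list)
for o_pref, o_non, diff in responses:
    by_distance[abs(o_pref - o_non)].append(diff)

for dist in sorted(by_distance):
    diffs = by_distance[dist]
    print(dist, sum(diffs) / len(diffs))
```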

Conclusion and future research directions

Based on this initial research, there would appear to be reasonable support for the established hypothesis. Additional analysis of the initial dataset should provide confirmation, and further results will be shared as the analysis is completed. I am also making the dataset available for download to anyone who would be interested in conducting their own analysis and contributing insight. You can download the Excel-formatted file here.

As mentioned, the results have provided multiple avenues for further investigation. Several that are of immediate interest are:
  1. Investigating the difference between preference and opinion as it relates to brand affinity in a task-based labor market, and identifying the significant components of each;
  2. Better understanding the mental models by which task workers value their effort and determine which tasks are “worth” completing;
  3. Identifying if there are significant differences in response patterns when crowd labor is a primary / exclusive source of income (vs. a contributing / optional source).
Finally, it should be noted that this research is not intended to be authoritative, but to establish initial direction for additional, more rigorous studies. In order to extend the validity of these findings, crowd workers from other platforms would need to be included, as well as less-qualified workers (e.g., those with lower task approval rates). However, the research does provide important insights to businesses seeking to leverage the most qualified crowd worker pool on today’s largest crowd worker platform.