Most surveys employ quotas, sometimes referred to as strata or stratification, to ensure that the data is representative on key variables.
A study of the energy usage of 2,000 households used the following quotas. The logic behind the quotas in this example is that the structure of a household is a key determinant of how much energy it uses and of its usage patterns. Ensuring that the sample is consistent with these quotas therefore provides some assurance that the survey will accurately capture the true variation in energy usage between households.
|Household structure||Quota||% of sample|
|Couple - youngest child aged 11 or under||466||23%|
|Couple - youngest child aged 12 to 17||186||9%|
|Couple - children at home aged 18 or more||128||6%|
|One parent family - youngest child aged 11 or under||120||6%|
|One parent family - youngest child aged 12 to 17||56||3%|
|One parent family - children at home aged 18 or more||46||2%|
|No children - 18 to 24||140||7%|
|No children - 25 to 34||238||12%|
|No children - 35 to 44||158||8%|
|No children - 45 to 54||208||10%|
|No children - 55 or more||254||13%|
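As a quick arithmetic check, the quota counts can be converted back into shares of the 2,000-household sample. This is only an illustrative sketch; the dictionary keys are shortened labels introduced here, not names used by the study.

```python
# Quota counts copied from the table above; the keys are shortened labels.
quotas = {
    "Couple, youngest child 11 or under": 466,
    "Couple, youngest child 12 to 17": 186,
    "Couple, children at home 18 or more": 128,
    "One parent, youngest child 11 or under": 120,
    "One parent, youngest child 12 to 17": 56,
    "One parent, children at home 18 or more": 46,
    "No children, 18 to 24": 140,
    "No children, 25 to 34": 238,
    "No children, 35 to 44": 158,
    "No children, 45 to 54": 208,
    "No children, 55 or more": 254,
}

total = sum(quotas.values())

# Each quota as a whole-number percentage of the total sample.
shares = {group: round(100 * count / total) for group, count in quotas.items()}

print("Total sample:", total)
for group, pct in shares.items():
    print(f"{group}: {pct}%")
```

Because each share is rounded separately, the rounded percentages need not add up to exactly 100.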
Interlocking and non-interlocking quotas
The quotas shown above involve multiple variables: household structure and age. Such quotas are referred to as interlocking quotas. An alternative approach is to set a separate quota for each variable. These are known as non-interlocking quotas, for example:
|Mainly use PC||100|
|Mainly use Macintosh||100|
|Aged under 30||66|
|Aged 30 to 50||67|
|Aged more than 50||67|
Although popular, non-interlocking quotas, which are also known as overlapping quotas, are usually a bad idea.[note 1] Most market research is conducted using online panels (databases of people who participate in market research studies in return for payment or entry into prize draws). These panels are generally massively skewed, over-representing the young, white collar workers and the inert. Quotas are used to correct for these skews. However, the use of non-interlocking quotas actually creates further problems with the data.
If using these quotas and randomly selecting respondents from an online panel, there is a good chance that by the time 150 interviews have been conducted, the PC quota will be filled (because many more people in the population use PCs so the quota will fill faster) and the two younger quotas will be filled (because most people in online panels are younger). Thus, these quotas may result in the last 50 respondents being Mac users aged more than 50. The resulting data will seem to reveal a strong correlation between age and Mac usage, whereas the reality is that the correlation has been created by the use of overlapping quotas.
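The effect described above can be reproduced with a small simulation. The sketch below is a minimal illustration, not a model of any real panel: the panel composition (85% PC users, and an age mix of 45%, 40% and 15%) is an assumption made up for this example, and age and computer type are drawn independently, so there is no true correlation between them.

```python
import random

# Non-interlocking quotas from the example table above.
QUOTAS = {"PC": 100, "Mac": 100, "under 30": 66, "30 to 50": 67, "over 50": 67}
AGES = ["under 30", "30 to 50", "over 50"]


def fill_quotas(seed=0, pc_share=0.85, age_shares=(0.45, 0.40, 0.15)):
    """Draw panelists at random until every quota is full.

    pc_share and age_shares describe a hypothetical, skewed panel.
    """
    rng = random.Random(seed)
    counts = {name: 0 for name in QUOTAS}
    sample = []
    while len(sample) < 200:
        # Age and computer are independent in the simulated panel.
        age = rng.choices(AGES, weights=age_shares)[0]
        computer = "PC" if rng.random() < pc_share else "Mac"
        # Accept a respondent only if both of their quotas still have room.
        if counts[age] < QUOTAS[age] and counts[computer] < QUOTAS[computer]:
            counts[age] += 1
            counts[computer] += 1
            sample.append((computer, age))
    return sample


def mac_share(sample, age):
    """Proportion of Mac users among sampled respondents in one age group."""
    group = [computer for computer, a in sample if a == age]
    return sum(c == "Mac" for c in group) / len(group)


sample = fill_quotas()
for age in AGES:
    print(age, round(mac_share(sample, age), 2))
```

Under these assumptions the final sample shows a much higher proportion of Mac users in the oldest group than in the youngest, even though the simulated panel has the same Mac share at every age.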
A natural conclusion to reach from this example is that we should have large tables of interlocking quotas, for example hundreds of quota groups defined by age, gender, household structure, geography, and any other key variables. Unfortunately, this is impractical. We generally do not have the data needed to work out, say, how many customers we have aged 18 to 24 living in NSW in a group household with no children, which makes it impossible to create good quotas. Further, even if we could create the quotas, large numbers of quotas are very expensive, as there always end up being a few quota groups that are extraordinarily hard to fill.
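To get a feel for how quickly interlocking quotas multiply, note that the number of quota groups is the product of the number of categories in each variable. The category counts in this sketch are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical number of categories per quota variable (illustrative only).
categories = {
    "age band": 6,
    "gender": 2,
    "household structure": 5,
    "geography": 8,
}
sample_size = 2000

# Fully interlocking quotas need one group per combination of categories.
groups = prod(categories.values())
average_per_group = sample_size / groups

print("Quota groups:", groups)
print("Average respondents per group:", round(average_per_group, 1))
```

With these assumed counts there are 480 quota groups, so a sample of 2,000 averages only about four respondents per group, and the rarest groups would need far fewer than that.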
Proportional versus non-proportional quotas
Most quotas are proportional, which is to say that the number of respondents required in each group is determined by the number of people in the population believed to be in each group (e.g., based on ABS published statistics). Sometimes the quotas are non-proportional, to ensure that enough people in a key segment are in the study (e.g., if doing a study for Apple, you may want 50% of the sample to be Apple customers). When non-proportional quotas are employed, the data needs to be weighted, and this requires that you know the correct population proportions.
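A common way to weight data collected with non-proportional quotas is to give each group a weight equal to its population share divided by its sample share. The sketch below applies this to the Apple example; the 20% population share is an assumption invented for illustration, and `quota_weights` is a hypothetical helper, not part of any library.

```python
def quota_weights(population_share, sample_counts):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_share[group] / (count / total)
        for group, count in sample_counts.items()
    }


# Assumed population shares (illustrative, not real market data).
population_share = {"Apple customer": 0.20, "Not an Apple customer": 0.80}

# Non-proportional quotas: half of a 1,000-person sample are Apple customers.
sample_counts = {"Apple customer": 500, "Not an Apple customer": 500}

weights = quota_weights(population_share, sample_counts)

# The weighted group sizes should match the population shares.
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}

print(weights)
print(weighted_counts)
```

Under these assumptions the over-sampled Apple customers are weighted down and everyone else is weighted up, while the weighted sample size stays at 1,000.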
Online interviewing and send quotas
Developing quotas for online interviewing requires an additional step. In addition to creating quotas for how many people complete the survey (which is what the descriptions above relate to), there is also a need to create quotas for the number of people invited to participate. These send quotas should always be interlocking, and it is generally advisable to create a large number of them (e.g., quota categories of 5 year age bands by gender by geography).
Some online panel companies refer to such quotas as weighting.
A good quality online study requires quotas on both the sends and the number of completes. Where quotas are only placed on the number of completes, the results can be bizarre. For example, in one study which used the age bands of Under 30, 30 to 50 and 51 or more, almost all of the respondents in the 51 or more category were 51 years old, meaning that the survey had no representation of older consumers. When quotas are used on the sends but not on the completes, the result tends to be an over-representation of older consumers, because they are more likely to respond to an invitation to participate.
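One way to see how send quotas and complete quotas relate is to work backwards from the completes wanted in each group to the number of invitations required. The sketch below assumes a response rate for each group; the rates are invented for illustration, `sends_needed` is a hypothetical helper, and a real study would use many more, interlocking, groups.

```python
import math


def sends_needed(target_completes, completes_per_100_invites):
    """Invitations needed per group to expect the target number of completes."""
    return {
        group: math.ceil(target * 100 / completes_per_100_invites[group])
        for group, target in target_completes.items()
    }


# Target completes per age band (same split as the earlier age quotas).
target_completes = {"Under 30": 66, "30 to 50": 67, "51 or more": 67}

# Assumed completes per 100 invitations (illustrative only); older panelists
# are given a higher rate because they are more likely to respond.
completes_per_100_invites = {"Under 30": 10, "30 to 50": 15, "51 or more": 25}

sends = sends_needed(target_completes, completes_per_100_invites)
print(sends)
```

Under these assumed rates the youngest group needs far more invitations than the oldest to produce a similar number of completes, which is why sending the same number of invitations to every group tends to over-represent older consumers.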
- [note 1] Prior to the advent of online interviewing, non-interlocking quotas were not too problematic, because the basic process of sampling was more in line with probability theory. With older approaches to data collection, such as phone and mail, they can be appropriate.