Tuesday, December 16, 2014

Response Rates in the 2011 National Household Survey and the 2006 Census

Question

I see the global non-response rate (GNR) from the NHS home page for Canada <http://www12.statcan.gc.ca/nhs-enm/2011/dp-pd/prof/details/page.cfm?Lang=E&Geo1=PR&Code1=01&Data=Count&SearchText=canada&SearchType=Begins&SearchPR=01&A1=All&B1=All&Custom=&TABID=1> is 26.1%. However, I also see frequent reference to a response rate of 68% (e.g., http://www.chamber.ca/download.aspx?t=0&pid=f9d85161-2e65-e411-a071-000c29c04ade, page 67).

Why and how do these two numbers differ? What are they measuring? What were the comparable numbers for the 2006 long-form census?


Answer

Your client is referring to two different rates: the survey response rate and the global non-response rate (GNR). The explanations below should clear up the confusion between the two concepts, and they include the 2006 Census response rates for the 2A (short) and 2B (long) forms:

From the 2011 NHS Guide <http://www12.statcan.gc.ca/nhs-enm/2011/ref/nhs-enm_guide/guide_2-eng.cfm>:

Survey response rate

The response rate, which is the ratio of the number of questionnaires completed to the total number of occupied private dwellings in the sample, is 68.6% for Canada, all collection methods combined. This is similar to the response rate for other voluntary surveys conducted by Statistics Canada.

Since the NHS sampling design includes a subsample for non-response follow-up, a weighted response rate that takes this subsample into account is needed to get a better idea of the quality of the NHS data. In the calculation of the weighted response rate, the households in the subsample that responded to the NHS represent not only themselves but also the non-respondent households that are not in the subsample.

Note: The response rates are based on the NHS's final sampling weights. The initial sampling weight of the dwellings that responded to the NHS before a specific date during the collection period is equal to the sampling fraction in their area. The dwellings that were in the non-response follow-up subsample and responded were assigned a larger weight to compensate for non-response. The weighted response rates are calculated as follows: the weighted number of sampled private dwellings that returned a questionnaire divided by the weighted number of sampled private dwellings classified as occupied.
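As an illustration only, the weighted calculation described in the note can be sketched in Python. The Dwelling record, the function name and all of the weights below are hypothetical and are not Statistics Canada code or data. The sketch simply divides the weighted number of occupied sampled dwellings that returned a questionnaire by the weighted number of sampled dwellings classified as occupied.

```python
from dataclasses import dataclass


@dataclass
class Dwelling:
    weight: float    # final sampling weight (made-up value for illustration)
    occupied: bool   # classified as an occupied private dwelling
    responded: bool  # returned a questionnaire


def response_rate(dwellings, weighted=True):
    """Responding occupied dwellings divided by all occupied dwellings.

    With weighted=False every dwelling counts as 1, which gives the
    simple ratio of completed questionnaires to occupied dwellings.
    """
    occupied = [d for d in dwellings if d.occupied]
    numerator = sum((d.weight if weighted else 1.0)
                    for d in occupied if d.responded)
    denominator = sum((d.weight if weighted else 1.0) for d in occupied)
    if denominator == 0:
        raise ValueError("no occupied dwellings in the sample")
    return numerator / denominator


# Hypothetical sample: three early respondents, one respondent and one
# non-respondent from the follow-up subsample (larger weights), and one
# unoccupied dwelling that is excluded from the calculation.
sample = [
    Dwelling(3.0, True, True),
    Dwelling(3.0, True, True),
    Dwelling(3.0, True, True),
    Dwelling(6.0, True, True),
    Dwelling(6.0, True, False),
    Dwelling(3.0, False, False),
]

print(round(response_rate(sample, weighted=False), 3))  # 4 / 5
print(round(response_rate(sample, weighted=True), 3))   # 15 / 21
```

In this made-up sample the unweighted and weighted rates differ because the follow-up dwellings carry larger weights than the early respondents.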

The overall response rate for the 2006 Census was 96.5%. The rate for the 2A, or short form, was 97.2% and for the 2B, or long form, it was 93.7%.

And this explains the Global Non-response Rate (GNR):

The global non-response rate is an important measure of the quality of NHS estimates, and it was calculated to determine whether the data for a geographic area are of sufficient quality to be released. It combines household and item non-response and, as such, reflects the risk of non-response bias. This measure was also used to decide when to disseminate counts for a given geographic area for the 2011 Census, just as it was used in the 2006 Census for the dissemination of short form counts and long form sample estimates.

In the specific case of the NHS, the global non-response rate is weighted to account for the initial sampling and the sub-sampling prior to non-response follow-up. The global non-response rate was calculated as the ratio of two weighted estimates for a given geographic area. The numerator of the ratio is an estimate of the total number of questions for which no response was obtained over all households (i.e. respondents and non-respondents) in the given geographic area. The denominator of the ratio is an estimate of the total number of questions for which responses were expected over all households (i.e. respondents and non-respondents) in the given geographic area.

For the 2011 Census and the 2006 Census (short and long forms), the global non-response rate was unweighted. The global non-response rate was calculated as the ratio of two counts for a given geographic area. The numerator of the ratio is a count of the total number of questions for which no response was obtained over all households (i.e. respondents and non-respondents) in the given geographic area. The denominator of the ratio is a count of the total number of questions for which responses were expected over all households (i.e. respondents and non-respondents) in the given geographic area.
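To make the ratio concrete, here is a minimal Python sketch of the global non-response rate in both its weighted (NHS) and unweighted (Census) forms. The function name, the tuple layout and the numbers are assumptions made for illustration. In particular, the sketch assumes that a household that did not respond at all contributes every expected question to the numerator, which is one reading of "combines household and item non-response".

```python
def global_non_response_rate(households, weighted=True):
    """Questions with no response divided by questions expected.

    households: iterable of (weight, expected_questions, answered_questions).
    With weighted=False each household counts once, as described for the
    2006 and 2011 Census; with weighted=True the two totals are weighted
    estimates, as described for the NHS.
    """
    missing = 0.0
    expected = 0.0
    for weight, n_expected, n_answered in households:
        w = weight if weighted else 1.0
        missing += w * (n_expected - n_answered)
        expected += w * n_expected
    if expected == 0:
        raise ValueError("no responses were expected in this area")
    return missing / expected


# Hypothetical area with three households and 10 expected questions each:
# a full respondent, a respondent who skipped 2 questions, and a
# non-respondent household (0 answers) that carries a larger weight.
area = [
    (2.0, 10, 10),
    (2.0, 10, 8),
    (4.0, 10, 0),
]

print(global_non_response_rate(area, weighted=False))  # 12 / 30
print(global_non_response_rate(area, weighted=True))   # 44 / 80
```

The sketch also shows why the GNR is not simply 100% minus the response rate: it counts unanswered questions rather than unreturned questionnaires, so item non-response among responding households raises it as well.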

For the 2006 Census, the GNR was not published as a rate. Instead, data quality indicators (commonly referred to as data quality flags) were attached to each standard geographic area disseminated. In the 2006 Census database environments, the data quality indicators consist of a five-digit numeric field. On the database and in electronic products browsed via Beyond 20/20, these flags are displayed as a five-digit numeric code (example: 0 2 1 3 1). The following link shows the breakdown of this five-digit code: Data quality and confidentiality standards and guidelines (public): Data quality practices <http://www12.statcan.gc.ca/census-recensement/2006/ref/notes/DQ-QD/Appendix_B-Annexe_B-eng.cfm>.

Areas are flagged on the database according to their global non-response rate. Geographic areas with a non-response rate higher than or equal to 25% are suppressed from tabulations. Geographic areas with a global non-response rate higher than or equal to 5% and lower than 25% are divided into two categories, flagged according to the following ranges: between 5% and 10%, and between 10% and 25%. These geographic areas are identified in tabulations, but not suppressed.
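The 2006 Census thresholds described above can be written as a small Python sketch. The function name and the category labels are hypothetical, and the actual five-digit flag codes are not reproduced here. The text does not say which band a rate of exactly 10% falls into, so the sketch assumes it belongs to the upper band.

```python
def gnr_category_2006(rate_percent):
    """Classify a 2006 Census area by its global non-response rate (in %).

    Thresholds follow the description above; placing exactly 10% in the
    upper band is an assumption, since the source does not specify it.
    """
    if not 0 <= rate_percent <= 100:
        raise ValueError("rate must be between 0 and 100")
    if rate_percent >= 25:
        return "suppressed"
    if rate_percent >= 10:
        return "flagged (10% to under 25%)"
    if rate_percent >= 5:
        return "flagged (5% to under 10%)"
    return "not flagged"


# Made-up rates for illustration only.
for rate in (3.2, 7.5, 18.0, 31.4):
    print(rate, gnr_category_2006(rate))
```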