I was asked a question about the differences between the 2011 census and previous censuses with respect to questions on language. The major difference between the 2011 and 2006 censuses on language was that the language of work question was relegated to the National Household Survey (NHS).
1. How were the responses to questions that appeared in both the 2011 Census and the NHS handled?
a) Were the NHS responses ignored for common variables?
b) Why were they asked twice in any case, since the census was still compulsory?
c) If they were asked twice, what happened if the responses were different between the two surveys?
d) For NHS datasets that include variables that appeared in both surveys (e.g., age groups), were the NHS responses used, or the census responses?
2. Are NHS response rates for individual questions available, as opposed to the global response rates for particular geographies?
Answer
1. How were the responses to questions that appeared in both the 2011 Census and the NHS handled?
(a) Were the NHS responses ignored for common variables?
When the responses from both questionnaires were processed, a harmonization procedure was applied.
When the household was the same for both the Census and the NHS, rules were applied to compare the number of fields answered for the common variables (i.e., demography and language).
- First step: the demography variables were compared. If the number of fields answered on the Census was better than on the NHS, the Census responses were kept for both the Census and the NHS (i.e., the Census demography responses were copied to the NHS, so the original NHS responses were not retained). Otherwise, the original responses were kept for both questionnaires.
- Second step: if the demography responses were copied to the NHS, the number of fields answered for the language variables on the Census and on the NHS was then compared. If the number was higher on the Census than on the NHS, the Census language responses were kept for both the Census and the NHS (i.e., the Census language responses were copied to the NHS, so the original NHS responses were not retained). Otherwise, the original language responses were kept for both questionnaires. If the original demography responses were kept on both questionnaires, the original language responses were also kept.
Responses were copied from the Census to the NHS for very few questionnaires. About 5.9% of persons on the NHS database had their demography responses modified and 1.2% had their language responses modified.
When the households were not the same for both the Census and NHS, the original responses were kept for both questionnaires.
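The rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this answer, not the actual processing code: the `Responses` class, the field names and the `harmonize` function are all hypothetical, and "better" in the first step is assumed to mean "more fields answered".

```python
from dataclasses import dataclass


@dataclass
class Responses:
    # Hypothetical container: field name -> response, None if unanswered.
    demography: dict
    language: dict


def count_answered(fields: dict) -> int:
    """Number of fields that received a response."""
    return sum(1 for value in fields.values() if value is not None)


def harmonize(census: Responses, nhs: Responses, same_household: bool):
    """Return the (demography, language) responses retained on the NHS side.

    The Census side always keeps its own responses under these rules.
    """
    # Different households: original responses are kept on both questionnaires.
    if not same_household:
        return nhs.demography, nhs.language

    # First step: compare demography (assumption: "better" = more fields answered).
    if count_answered(census.demography) <= count_answered(nhs.demography):
        # Demography kept as-is, so language is kept as-is too.
        return nhs.demography, nhs.language
    demography = dict(census.demography)  # Census copied to NHS

    # Second step: only reached when demography was copied.
    if count_answered(census.language) > count_answered(nhs.language):
        language = dict(census.language)  # Census copied to NHS
    else:
        language = nhs.language
    return demography, language
```

For example, if the Census questionnaire has more demography fields answered but fewer language fields answered than the NHS questionnaire, this sketch copies the Census demography responses to the NHS and leaves the NHS language responses untouched.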
(b) Why were they asked twice in any case, since the census was still compulsory?
The decision was made to conduct the voluntary NHS during the same period as the census in order to take advantage of census resources and infrastructure, such as collection management systems and employees. The questions planned for the long form, including the language questions, remained unchanged for logistical and methodological reasons, given the late decisions to change the 2011 program.
Persons selected for the NHS who responded to the census on the Internet were given the opportunity to complete the NHS online immediately after finishing the census questionnaire. When they did continue on to the NHS, their responses to the common variables were automatically copied over and were therefore the same for both surveys. When they did not continue on to the NHS right away, they were offered another opportunity to respond at a later time; if they then agreed to respond to the NHS, a second set of responses was obtained for the common variables. Other persons selected in the NHS sample chose to respond to the census on paper. They were contacted again later and offered the opportunity to complete the NHS; if they agreed, a second set of responses was obtained for the common variables.
(c) If they were asked twice, what happened if the responses were different between the two surveys?
Please see answer to question (a).
(d) For NHS datasets that include variables that appeared in both surveys (e.g., age groups), were the NHS responses used, or the census responses?
Please see answer to question (a).
Regarding response rates:
Response rates for individual questions are NOT available.
Imputation rates for individual questions are available in the various Census / NHS reference guides at the national, provincial and territorial levels. They are not available for lower levels of geography.
The imputation rate is the proportion of respondents who did not answer a given question or whose response is deemed invalid and for which a value was imputed. Imputation improves data quality by reducing the gaps caused by non-response.
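As a rough illustration of that definition, the rate for one question could be computed as below. This is a minimal sketch: the status labels (`"valid"`, `"missing"`, `"invalid"`) and the function name are hypothetical, and it assumes every missing or invalid response was imputed. The published rates in the reference guides should be used rather than anything recomputed this way.

```python
from collections import Counter

# Hypothetical status labels for the responses that end up being imputed.
IMPUTED_STATUSES = {"missing", "invalid"}


def imputation_rate(statuses: list) -> float:
    """Share of respondents whose value for one question was imputed."""
    if not statuses:
        raise ValueError("no respondents for this question")
    counts = Counter(statuses)
    imputed = sum(counts[status] for status in IMPUTED_STATUSES)
    return imputed / len(statuses)
```

With made-up input of eight valid responses, one missing and one invalid, the function returns 2 / 10.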
2011 NHS Reference guide for Language questions
http://www12.statcan.gc.ca/nhs-enm/2011/ref/guides/99-010-x/99-010-x2011007-eng.cfm#a5