A researcher has sent me a few more questions concerning the 2001 Census Individuals PUMF methodology. I have appended the entire text of the message below. The document referred to is the 2001 Census Public Use Microdata File: Individuals File: User Documentation, 2nd revision. The pages referred to are pages 173-176 of 292 in the PDF (pages 168-171 in the document). I am quite confident that your census staff can answer these questions much better than I can. Thanks in advance for any information or advice that might be sent back.
"The canvasser area stratum is mentioned in the first phase of sampling in Chapter III Sampling Method and Data Quality (page 173 of the pdf file attached).
In part (b) Second Phase of Sampling, the strata used for selection of the individuals to be part of the public use microdata file (pumf) on individuals Canada Census 2001 are defined. One limitation of this stratification is mentioned that uses canvasser area stratum. However, canvasser area stratum is not defined as one of the strata available for stratification in phase two. This contradicts in my opinion the procedure used for defining each stratum in phase two of sampling (on page 176 of the pdf).
Was the canvasser area stratum used for stratification in phase two of the selection of pumf on individuals in Canada Census 2001? Is canvasser area stratum identifiable by any variable in the pumf on individuals in Canada Census 2001? Where does canvasser area stratum rank in importance compared to the rest of the variables used for stratification in phase two of sampling?"
The methodologists consulted by our census contacts provided the following response to the researcher's questions:
"The file was sorted first by canvasser stratum and then by geography, and the other variables. By doing so, we were ensuring that these people (people living in a canvasser area) were well represented in the sample.
We limit the number of strata in these areas because the number of people within a given geography was already small (stratifying more would not do anything in these circumstances). It was not possible to identify them in the file for confidentiality reasons (canvasser areas have low population counts). However, we felt that, since those people have different characteristics than the other people, it was important to ensure their representation. In that sense, this stratum is very important."
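For readers unfamiliar with why sorting a file before selection protects a small group's representation, here is a minimal sketch. It assumes that selection from the sorted file is systematic (every k-th record from a random start), which the reply above does not state. The toy population, the field names `canvasser` and `geo`, and the function `systematic_sample` are all hypothetical and are not taken from the PUMF or its documentation.

```python
import random


def systematic_sample(records, sort_key, interval, seed=0):
    """Sort the records, then take every `interval`-th one from a random start."""
    ordered = sorted(records, key=sort_key)
    start = random.Random(seed).randrange(interval)  # random start in [0, interval)
    return ordered[start::interval]


# Hypothetical toy population: 1,000 people, 40 of them in canvasser areas.
population = (
    [{"canvasser": True, "geo": g % 4} for g in range(40)]
    + [{"canvasser": False, "geo": g % 4} for g in range(960)]
)

# Canvasser-area records sort first (False < True), then by geography.
sample = systematic_sample(
    population,
    sort_key=lambda r: (not r["canvasser"], r["geo"]),
    interval=20,
)

canvasser_in_sample = sum(r["canvasser"] for r in sample)
print(len(sample), canvasser_in_sample)
```

Because the 40 canvasser-area records sit together at the top of the sorted file, a 1-in-20 systematic pass always picks exactly two of them in this toy setup, whatever the random start. A simple random sample of the same size could, by chance, pick none.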
I hope that this information is helpful. Please do not hesitate to let us know if you have any other questions.