A researcher is doing analysis at the FSA level. She is using the PCCF and wants to create a dummy variable coded 0 = rural and 1 = urban.
Would these methods be accurate, and if so, which would be more accurate? If not, would you have any suggestions for her to recode this dummy variable?
(A) recode SACtype (6-8 = 0) and (1-5 = 1)
(B) recode POP_CNTR_RA_type (0=0) and (1-4, 6 = 1)
From the PCCF reference guide:
SACtype
1 Census subdivision within census metropolitan area
2 Census subdivision within census agglomeration with at least one census tract
3 Census subdivision within census agglomeration having no census tracts
4 Census subdivision outside of census metropolitan area and census agglomeration having strong metropolitan influence
5 Census subdivision outside of census metropolitan area and census agglomeration having moderate metropolitan influence
6 Census subdivision outside of census metropolitan area and census agglomeration having weak metropolitan influence
7 Census subdivision outside of census metropolitan area and census agglomeration having no metropolitan influence
8 Census subdivision within the territories, outside of census agglomeration
POP_CNTR_RA_type
0 Rural area
1 Core
2 Fringe
4 Population centre outside CMAs and CAs
6 Secondary core
Answers
If you want to use the Canada Post definition:
For customers who have a rural address [e.g., Postal Code with a “0” (zero)] as the second character […] from <https://www.canadapost.ca/tools/pg/customerguides/Advance_CGbrm-e.pdf>
Read each character of the FSA as a single-byte string and flag the FSA as rural when the second character is a zero (e.g., Data list FSA1 to FSA3 1-3 (A) FSA 1-3 (A) … Value labels rururb 0 "Rural" 1 "Urban". Compute rururb=1. If (FSA2 = "0") rururb=0.)
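The same second-character rule can be sketched outside SPSS. The Python below is a minimal illustration only: the function name `rururb_from_fsa`, the label dictionary, and the two sample FSAs are assumptions made for this example and are not part of the PCCF or of Canada Post's documentation.

```python
def rururb_from_fsa(fsa: str) -> int:
    """Return 0 (rural) if the second character of the FSA is "0", else 1 (urban)."""
    fsa = fsa.strip().upper()
    if len(fsa) < 3:
        raise ValueError(f"expected a 3-character FSA, got {fsa!r}")
    return 0 if fsa[1] == "0" else 1


# Same value labels as in the SPSS fragment above.
RURURB_LABELS = {0: "Rural", 1: "Urban"}

# Illustrative FSAs only.
for fsa in ["K0A", "M5V"]:
    print(fsa, RURURB_LABELS[rururb_from_fsa(fsa)])
# K0A Rural
# M5V Urban
```

Because the rule reads only the FSA itself, it needs no PCCF variables, but it reflects Canada Post's delivery-based notion of rural rather than a Statistics Canada definition.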
The subject matter expert has also provided the following feedback:
It is somewhat challenging to provide meaningful feedback (deciding on A or B) without knowing and understanding the context and details for creating a binary urban-rural classification by re-classifying the SACType or POP_CNTR_RA_TYPE variables. However, I can provide some feedback in the form of a small set of cursory assumptions (or possible cautionary notes) that may help the DLI client further her choice/research/application(s).
Based on the logic provided, the DLI client is re-classifying some of the 2011 Census geography variables. The following points are my assumptions.
- The latest vintage of the PCCF is the source of the SACType and POP_CNTR_RA_type variables/data.
- Feedback has been (or will be) provided on the nature and limitations of the PCCF with regard to postal geographies (FSA and FSA-LDU). This would include details regarding the Canada Post functional urban/rural indicator expressed in the FSA itself.
For Option A (the POP_CNTR_RA_type recode, listed as (B) in the question):
- The client is not interested in distinguishing between POP_CNTRs inside CMAs and CAs and POP_CNTRs outside CMAs and CAs (where POP_CNTR_RA_type is recoded (1-4, 6 = 1)). The residual area, both inside and outside CMAs and CAs, is defined as “rural”.
For Option B (the SACtype recode, listed as (A) in the question):
- Areas outside CMAs and CAs that are classified as strong and moderate metropolitan influence (MIZ) are aggregated/grouped with CMAs and CAs to form a single “urban” class (where SACType is recoded (1-5 = 1)).
Note: this assumes there is no concern that rural areas (as defined by StatCan) exist in most CMAs, CAs, and areas classified as strong and moderate MIZ.
- Areas classified within the Territories and weak and no metropolitan influence (MIZ) are aggregated/grouped to form a single “rural” class (where SACType is recoded (6-8 = 0)).
Note: this assumes there is no concern that StatCan-defined POP_CNTRs exist in most areas classified as weak and no MIZ, and in the Territories.
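To make the two cautionary notes concrete, the sketch below applies both recodes to a few records and lists the ones where they disagree. It is an illustration under stated assumptions: the function names and the sample (SACtype, POP_CNTR_RA_type) pairs are hypothetical and are not drawn from the PCCF. The POP_CNTR_RA_type recode follows the researcher's range (1-4, 6), even though code 3 does not appear in the list quoted from the reference guide.

```python
from typing import Optional


def recode_sactype(sactype: int) -> Optional[int]:
    """SACtype recode: 1-5 -> 1 (urban), 6-8 -> 0 (rural)."""
    if 1 <= sactype <= 5:
        return 1
    if 6 <= sactype <= 8:
        return 0
    return None  # code not covered by the recode


def recode_pop_cntr_ra_type(code: int) -> Optional[int]:
    """POP_CNTR_RA_type recode: 0 -> 0 (rural); 1-4 and 6 -> 1 (urban)."""
    if code == 0:
        return 0
    if 1 <= code <= 4 or code == 6:
        return 1
    return None  # code not covered by the recode


# Hypothetical (SACtype, POP_CNTR_RA_type) pairs, for illustration only.
sample_records = [(1, 1), (1, 0), (7, 4), (8, 0)]

disagreements = [
    (sac, pop)
    for sac, pop in sample_records
    if recode_sactype(sac) != recode_pop_cntr_ra_type(pop)
]
print(disagreements)
# [(1, 0), (7, 4)]
```

The pair (1, 0) stands for a rural area inside a CMA, which the SACtype recode calls urban; the pair (7, 4) stands for a population centre in a no-MIZ area, which the SACtype recode calls rural. Tabulating such disagreements on the actual file would show how much the choice of variable matters for a given study area.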
I sense that option A as described above (the POP_CNTR_RA_type recode) may be more appropriate, although without some context it is challenging to provide meaningful advice.
Please let me know if more information is needed or if there are any details regarding the context of the research/application that have not been shared with me.