Tuesday, April 29, 2014

2013 PCCF Records Missing DAUID

A user has reported, and I have verified, that quite a few records in the current PCCF (pccfNat_JUN13_fccpNat.txt) have no DAUID assigned to them. To be specific, 6,463 postal codes on the file are missing a DAUID; this number falls to 5,120 if the Single Link Indicator (best record match) is applied.
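A count of this kind can be reproduced with a short script once the file has been parsed into records. The following is only a minimal sketch under stated assumptions: the records are already parsed into dictionaries with the hypothetical keys `postal_code`, `dauid` and `sli`, a missing DAUID is assumed to appear as blanks or all zeros, and an SLI of "1" is taken to flag the best record. The actual column positions and codes should be taken from the record layout in the PCCF reference guide.

```python
def is_missing(dauid):
    """Return True when the DAUID is blank or all zeros (assumed 'missing' encoding)."""
    value = dauid.strip()
    return value == "" or set(value) == {"0"}


def count_missing_dauid(records, best_match_only=False):
    """Count distinct postal codes with at least one record lacking a DAUID.

    When best_match_only is True, only records with SLI == "1" are considered.
    """
    missing = set()
    for rec in records:
        if best_match_only and rec["sli"] != "1":
            continue
        if is_missing(rec["dauid"]):
            missing.add(rec["postal_code"])
    return len(missing)


# Made-up sample records, for illustration only (not real PCCF data).
sample = [
    {"postal_code": "A1A1A1", "dauid": "10010123", "sli": "1"},
    {"postal_code": "B2B2B2", "dauid": "00000000", "sli": "1"},
    {"postal_code": "C3C3C3", "dauid": "        ", "sli": "0"},
    {"postal_code": "C3C3C3", "dauid": "10010456", "sli": "1"},
]

print(count_missing_dauid(sample))                        # all records
print(count_missing_dauid(sample, best_match_only=True))  # best match only
```

Restricting the count to best-match records lowers it whenever a postal code has a missing DAUID only on its secondary records, which is consistent with the drop from 6,463 to 5,120 noted above.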

Are these the records that are referred to in the documentation (92-154-g2013001-eng.pdf, What’s new) with the following note? “A small number of new postal codes(OM) are linked to a census subdivision only. These new postal codes(OM) do not yet link to Statistics Canada's geographic frame. Linkage below the census subdivision will appear on these records when the street and address information becomes available on the geographic frame. This new linkage will appear on subsequent releases of the PCCF.”

I note as well, having looked at these records, that other fields are also assigned a code of 0 as a consequence of being identified at the CSD level only; for example, FED03UID is not assigned if the CSD contains multiple Federal Electoral Districts. I could quibble a bit with the characterization of this missing information as a “small number”: 5,120 is a fairly large number, although in context it is just over 0.6% of all of the best match records in the file. I had thought of overlaying latitude/longitude onto DA boundary files, but it appears that all such records have the centroid of the CSD assigned to these fields.

The user is concerned about the accuracy of his analysis as a consequence of this missing information; I don’t know how many of his observations fall into these unidentified postal codes. When will the next version of the PCCF (with the updated geographic frame) be released?
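To gauge how much an analysis is exposed, the postal codes in the user's observations could be compared against the set of postal codes that link to a CSD only. The sketch below is hypothetical: the function name, the input shapes and the normalization (upper case, spaces removed) are assumptions for illustration, not part of any PCCF tooling.

```python
def share_affected(observation_postal_codes, csd_only_postal_codes):
    """Return (count, proportion) of observations whose postal code is CSD-only.

    observation_postal_codes: iterable of postal code strings, one per observation.
    csd_only_postal_codes: set of normalized postal codes lacking a DAUID.
    """
    # Normalize to upper case with no spaces so "a1a 1a1" matches "A1A1A1".
    normalized = [pc.replace(" ", "").upper() for pc in observation_postal_codes]
    if not normalized:
        return 0, 0.0
    affected = sum(1 for pc in normalized if pc in csd_only_postal_codes)
    return affected, affected / len(normalized)


# Made-up example values, for illustration only.
observations = ["A1A 1A1", "b2b2b2", "B2B 2B2", "D4D4D4"]
csd_only = {"B2B2B2"}

count, proportion = share_affected(observations, csd_only)
print(count, proportion)
```

If the resulting proportion is small, the effect on DA-level results may be limited, but that judgment depends on the analysis in question.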


I confirmed with subject matter that they will not know the date of the next PCCF release until the end of summer, at which time I will follow up for more details.

With regard to the updated geographic frame: although street and address information can come from Canada Post, we cannot do anything with this information unless we have matching information on our geographic frame to link to. Our geographic frame is based on the Spatial Data Infrastructure (SDI), which in turn is based on the National Geographic Database (NGD).

A) The National Geographic Database

The National Geographic Database (NGD) is a shared database between Statistics Canada and Elections Canada. The database contains roads, road names and address ranges. It also includes separate reference layers containing physical and cultural features, such as hydrography and hydrographic names, railroads and power transmission lines. Priorities for road network file maintenance are determined by Statistics Canada and Elections Canada, enabling the NGD to meet the joint operational needs of both agencies in support of census and electoral activities.

The main sources for the NGD include:

· Statistics Canada's street network files
· Elections Canada's road network file
· National Topographic Database (NTDB) digital coverage at 1:50,000 and 1:250,000 from Natural Resources Canada, and Digital Chart of the World (DCW) coverage at 1:1,000,000
· provincially-sourced data sets
· other information from field operation activities, municipal maps and private sector licenced holdings.

More information on the NGD can be found in the Census Dictionary, <http://www12.statcan.gc.ca/census-recensement/2011/ref/dict/geo015-eng.cfm>.

B) Spatial Data Infrastructure (SDI)

The Spatial Data Infrastructure (SDI) is an internal maintenance database that is not disseminated outside of Statistics Canada. It contains roads, road names and address ranges from the National Geographic Database (NGD), as well as boundary arcs of standard geographic areas that do not follow roads, all in one integrated line layer. The database also includes a related polygon layer consisting of basic blocks (BB), boundary layers of standard geographic areas, and derived attribute tables, as well as reference layers containing physical and cultural features (such as hydrography, railroads and power transmission lines) from the NGD.

The SDI supports a wide range of census operations, such as the maintenance and delineation of the boundaries of standard geographic areas (including the automated delineation of dissemination blocks and population centres) and geocoding. The SDI is also the source for generating many geography products for the 2011 Census, such as cartographic boundary files and road network files.

More information on the SDI can be found in the Census Dictionary, <http://www12.statcan.gc.ca/census-recensement/2011/ref/dict/geo020-eng.cfm>.

Detailed information on the process for linking to 2011 Census geographic areas can be found in the PCCF reference guide, Data Quality section, <http://www.statcan.gc.ca/pub/92-154-g/2013001/qual-eng.htm>.