Wednesday, November 29, 2006

March 2006 Postal Code Conversion File

Questions

I downloaded the March 2006 PCCF from the DLI directory, and have some questions.

In the directory structure, there is a "corrected-postal-codes" subdirectory, with two files:
1) pc_corr_cp.xls - which is a list of 550 postal codes --- Is this the list of postal code corrections mentioned in the second bullet on page 5 of the codebook? Are we supposed to do something with this file, and if so, what? Is there a list of the corrections made that could advise our users, or do we simply tell them "these postal codes got fixed somehow - we don't know what was wrong, or what was corrected"?

2) problemsdpl_problemesld.txt - which has 45 lines of data, each line having a "PCODE", "DPL", and "Incorrect_DPL"
--- Is this the file of "unnecessary Designated Place - postal code linkages" mentioned in section 4.4 of the codebook?

When I look at the records in the data file, it appears that if we make the changes recorded in problemsdpl_problemesld.txt, we will have identical duplicate records for the postal codes - so I assume we are supposed to delete the records with the incorrect DPL?
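If deletion is indeed the intended fix, it could be sketched as below. This is a minimal illustration, not an official procedure. It assumes both files have already been parsed into dictionaries keyed by the column names in problemsdpl_problemesld.txt. The function name `drop_incorrect_dpl` and the sample rows are made up for the example.

```python
def drop_incorrect_dpl(records, problems):
    """Return the records whose (PCODE, DPL) pair is not listed as incorrect."""
    # Each problem row names a postal code and the DPL it should NOT be linked to.
    bad_pairs = {(p["PCODE"], p["Incorrect_DPL"]) for p in problems}
    return [r for r in records if (r["PCODE"], r["DPL"]) not in bad_pairs]


# Made-up rows for illustration only; these are not real PCCF values.
records = [
    {"PCODE": "A1A1A1", "DPL": "0001"},
    {"PCODE": "A1A1A1", "DPL": "0002"},
    {"PCODE": "B2B2B2", "DPL": "0003"},
]
problems = [{"PCODE": "A1A1A1", "DPL": "0001", "Incorrect_DPL": "0002"}]

cleaned = drop_incorrect_dpl(records, problems)
```

Deleting the flagged record, rather than rewriting its DPL, avoids creating the identical duplicate described above.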

I ran the duplicate checking procedures I created against the non-retired SLI=1 postal codes (theoretically the "best match" file), and came up with 48 records that were duplicated. My list of 48 postal codes includes all 45 of the ones listed in the file problemsdpl_problemesld.txt, plus postal codes V9B0A4 (DPLs 0010 and 9959), V9B6X4 (DPLs 0010 and 9959), and V9B6X5 (DPLs 0010 and 9959).
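For anyone wanting to repeat this kind of check, a rough sketch follows. It is not the procedure I actually ran. It assumes the PCCF records have already been read into dictionaries, and the field names `SLI` and `retired` and the function `find_duplicates` are hypothetical. Only the V9B0A4 rows reflect values mentioned above; the other row is made up.

```python
from collections import defaultdict


def find_duplicates(records):
    """Map each active SLI=1 postal code that occurs more than once to its DPLs."""
    dpls_by_pcode = defaultdict(list)
    for r in records:
        # Keep only non-retired records flagged as the single best link.
        if r["SLI"] == "1" and not r["retired"]:
            dpls_by_pcode[r["PCODE"]].append(r["DPL"])
    return {pc: sorted(dpls) for pc, dpls in dpls_by_pcode.items() if len(dpls) > 1}


sample = [
    {"PCODE": "V9B0A4", "DPL": "0010", "SLI": "1", "retired": False},
    {"PCODE": "V9B0A4", "DPL": "9959", "SLI": "1", "retired": False},
    # Made-up row: a postal code with a single active SLI=1 record.
    {"PCODE": "A1A1A1", "DPL": "0001", "SLI": "1", "retired": False},
]

duplicates = find_duplicates(sample)
```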

What about these 3 postal codes that seem to be duplicated - which is the "wrong" DPL (and, if I'm right in assuming, a candidate for deletion)?

In the short term, a "readme" file in the corrected-postal-codes directory would be quite useful, I think, to instruct us in the use of these two files.

3) Why isn't it possible to get this file corrected by Statistics Canada, either at the source division or at DLI, so that each DLI institution doesn't (or shouldn't) have to make the same corrections?

Answers

1) As you mentioned, this file is explained on page 5: it is a list of postal codes that were linked to incorrect geographic units but have now been corrected. You do not need to do anything with this file; the corrections are in the Postal Code Conversion File (PCCF).

2) Yes, some duplicate DPL linkages were created during the automated geocoding process, and we have discovered a few more, which include the postal codes you mentioned.

3) - EAC) The issue with the DPLs results from the automated geocoding process. The Postal Code Project Team is redesigning the geocoding system to reduce errors and improve the data quality of the Postal Code Conversion File (PCCF). The January 30, 2007 PCCF release will contain the same DPL duplicates, because we are not switching over to the new geocoding system until after that release, which is the last one based on 2001 geographic units. Once we release the first PCCF based on 2006 geographic units, the DPL duplicates should no longer exist, because we will be using the new geocoding system.
