Friday, August 7, 2015

CANSIM Population Estimate Tables


A graduate student researcher wants to merge two CANSIM population estimate tables, 109-5335 and 109-5325. His initial question was a little confusing but, after speaking with him, this is his clearer rephrasing of what he sees as a problem with the tables:

My thoughts in terms of accuracy are that the more recent 109-5335 table is more accurate than the 109-5325 in the estimates after 2006 (especially those further from that year) given that census data should have been available to benchmark the 2011 year in the former table, where the population estimates for those years would be extrapolated (with help from admin data) in the latter table. In comparing the tables, it would seem that it is the post-censal estimates which show the greatest levels of disparity between the tables. Although intercensal populations are also estimates, I anticipate these will be more accurate than projections, given solid benchmarks as to where the true population values were and where they should lead to. Therefore, I see this as an accuracy concern and not a methodological concern.

If you could raise the question to Stats Canada, there may be issues of methodologies that provide a better understanding of the tables. I am also wondering why multiple census benchmarks have (seemingly) not been used in making the tables. I think these should have included 1996, 2001, 2006, and 2011. By the description of intercensal periods, it looks like 109-5325 uses 1996 estimates (skipping 2001) and 109-5335 uses 2001 estimates (skipping 2006) but I might be wrong.

I do not think that merging these two tables is a good idea, and that is what I told him. But if anyone has any additional information on the methodologies used for these two separate population estimate calculations, or on the census benchmarks used in the two tables, beyond what is provided on the CANSIM webpage, it would be much appreciated.


The subject matter division provided the following response:

First of all, the best recommendation would probably be to use CANSIM table 109-5345, which replaces CANSIM table 109-5335, as indicated in Footnote #1. This table has the most up-to-date and accurate population estimates for Health Regions over the 2001 to 2014 period.

Concerning the user’s concerns on accuracy, you can tell him that intercensal population estimates are generally more accurate (i.e. closer to reality) than postcensal estimates. This is because intercensal estimates are based on the two censuses preceding and following the year in question, whereas postcensal estimates are based on the most recent available census (to which is added the estimated demographic growth, based on various administrative data sources). Contrary to the user’s assumptions, we always use all census counts (adjusted for census net undercoverage and incompletely enumerated Indian reserves) that are available at the time the data are produced. I’m not sure what exactly led him to his assumptions; if he still has questions regarding methodology for producing population estimates, you could provide him my contact info and I’ll gladly help him.
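
To make the distinction in the response above concrete, here is a minimal Python sketch of the two kinds of estimate. It is an illustration only and not Statistics Canada's actual method: the function names, the population figures, and the growth figures are all hypothetical, and plain linear interpolation stands in for whatever adjustment the agency really applies between two censuses.

```python
def postcensal_estimate(base_census, annual_growth, years_after):
    """Carry the most recent census count forward.

    annual_growth is a list of estimated yearly changes (hypothetical
    stand-in for growth derived from administrative data).
    """
    return base_census + sum(annual_growth[:years_after])


def intercensal_estimate(prev_census, next_census, years_between, years_after):
    """Estimate a year that lies between two census counts.

    Simplifying assumption: straight-line interpolation between the
    two benchmarks.
    """
    return prev_census + (next_census - prev_census) * years_after / years_between


# Hypothetical figures for a single health region.
census_2006 = 100_000
census_2011 = 106_000
estimated_growth = [1_000] * 5  # estimated change for each year after 2006

for offset in range(6):
    year = 2006 + offset
    post = postcensal_estimate(census_2006, estimated_growth, offset)
    inter = intercensal_estimate(census_2006, census_2011, 5, offset)
    print(year, post, round(inter), round(inter - post))
```

In this made-up case the postcensal series drifts away from the intercensal one as the year moves further from the 2006 base, because the growth estimates slightly understate the change that the next census later reveals. That is the pattern the researcher noticed when comparing the two tables, and it is why a table produced after a new census is available is preferable to merging an older table with a newer one.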