Friday, March 24, 2017

2006 census data issues for DAs in Quebec

Question
Is there someone in the Census division who could help with an anomaly in the 2006 Census? We are finding 2006 DAs where the counts for certain variables are higher than the population total for that DA. We’ve confirmed that this occurs both in the StatCan Beyond 20/20 files and in the data ingested into CHASS Census Analyzer at UofT.

This is STC Beyond 20/20:

25=Total Population 25-64 years by highest certificate, diploma or degree

30=Certificate, diploma or degree



Same thing for this DA in Census Analyzer:




Thanks for any answers to this puzzle.

Answer
I'm interested in seeing an answer from Stats Can, but I'm wondering if this might be a quirk of the random rounding that happens with the census, made more apparent by the small population numbers? This page says that "totals and sub-totals are independently rounded". And the random rounding frequency on this page gives us the odds of a number being rounded up or down.

So I guess that, technically, there could be 29 people aged 25-64 in that DA, and 1 time out of 5 that total gets rounded down to a count ending in 5, giving us 25. And then, let's say that 26 of them have a "Certificate, diploma or degree": 1 time out of 5 that gets randomly rounded up to 30. Unless I'm misunderstanding how random rounding is applied, which is very possible!
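To put a number on that guess, here is a minimal Python sketch. The true counts of 29 and 26 are the hypothetical values from this answer, not real data, and the function name `rounding_odds` is made up for illustration. It assumes unbiased base-5 random rounding and that the total and sub-total are rounded independently.

```python
from fractions import Fraction

def rounding_odds(count: int) -> dict[int, Fraction]:
    """Map each possible published value to its probability under base-5 random rounding."""
    remainder = count % 5
    if remainder == 0:
        # Counts already ending in 0 or 5 are left unchanged.
        return {count: Fraction(1)}
    lower = count - remainder
    # The closer multiple of 5 is the more likely outcome.
    return {lower: Fraction(5 - remainder, 5), lower + 5: Fraction(remainder, 5)}

total_odds = rounding_odds(29)     # hypothetical true total
subtotal_odds = rounding_odds(26)  # hypothetical true sub-total

# Anomaly: the total is published as 25 while the sub-total is published as 30.
p_anomaly = total_odds[25] * subtotal_odds[30]
print(p_anomaly)  # 1/25
```

So under these assumptions this particular pair of counts would show the anomaly about 4% of the time, which is rare for one DA but not surprising across many small DAs.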

Answer(2)
Here is the response we’ve received from subject matter:

“This anomaly can be chalked up to random rounding. As per the following:
Random rounding

All counts in census tabulations are subjected to random rounding. Random rounding transforms all raw counts to random rounded counts. This reduces the possibility of identifying individuals within the tabulations.

All counts are rounded to a base of 5, meaning they will end in either 0 or 5. The random rounding algorithm employed controls the results and rounds the unit value of the count according to a predetermined frequency. Table below shows those frequencies. Note that counts ending in 0 or 5 are not changed and remain as 0 or 5.

Random rounding frequency

Unit value of count   Will round to count ending in 0   Will round to count ending in 5
1                     4 times out of 5                  1 time out of 5
2                     3 times out of 5                  2 times out of 5
3                     2 times out of 5                  3 times out of 5
4                     1 time out of 5                   4 times out of 5
5                     Never                             Always
6                     1 time out of 5                   4 times out of 5
7                     2 times out of 5                  3 times out of 5
8                     3 times out of 5                  2 times out of 5
9                     4 times out of 5                  1 time out of 5
0                     Always                            Never



The random rounding algorithm uses a random seed value to initiate the rounding pattern for tables. In these routines, the method used to seed the pattern can result in the same count in the same table being rounded up in one execution and rounded down in the next.”
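For anyone who wants to experiment with this, below is a rough Python sketch of the rounding rule described in the frequency table. It is not Statistics Canada's actual routine: the function name `random_round`, the use of Python's `random` module, and the seeding scheme are all assumptions made for illustration. It only tries to reproduce the published frequencies and show how the seed can change the result for the same count.

```python
import random

# Chances out of 5 that a count with this unit digit is published ending in 0,
# transcribed from the frequency table; otherwise it is published ending in 5.
ENDS_IN_ZERO_OUT_OF_5 = {0: 5, 1: 4, 2: 3, 3: 2, 4: 1, 5: 0, 6: 1, 7: 2, 8: 3, 9: 4}

def random_round(count: int, rng: random.Random) -> int:
    """Randomly round a non-negative count to a multiple of 5 (illustrative only)."""
    unit = count % 10
    tens = count - unit
    if rng.randrange(5) < ENDS_IN_ZERO_OUT_OF_5[unit]:
        # Units 0-4 fall back to the lower ten; units 6-9 go up to the next ten.
        return tens if unit < 5 else tens + 10
    return tens + 5

# The same raw count rounded under different (arbitrary) seeds:
# some runs round 29 down to 25, others round it up to 30.
outcomes = {random_round(29, random.Random(seed)) for seed in range(200)}
print(sorted(outcomes))
```

Because each cell is rounded on its own, a sub-total can land on the multiple of 5 above it while its total lands on the multiple of 5 below, which is the pattern reported in the question.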