Thursday, February 21, 2019

2006 Census PUMF

I was going through the user guide for the Census and was unclear whether I could calculate 95% confidence intervals (e.g. using a Poisson approximation or a gamma distribution) on the age-standardized weighted proportions (of visible minority groups) simply through my direct age-standardization algorithm, or whether there are other considerations (e.g. the variables WT1 through WT8).

We received the following response from subject matter:

Poisson, gamma or other distributional assumptions are typically used for model-based inference from data that are not sourced from surveys. Such an approach assumes that the event of interest is a random process following the hypothesized distribution and that your data are one realization of this random process. This assumption gives you a tool to derive a model-based variance and confidence intervals. In the context of complex survey design data, it is preferable to use a design-based method, in which the characteristic (event) is treated as fixed for all units in the population of interest and the error between the sample estimate and the true population parameter is due to random sampling.

For standardized estimates from complex surveys, you need to account for the survey design information in the estimation of the variance. For the 2006 PUMF data file, the steps to do so are described in Chapter 3, section C.2 "Estimation of the sampling variability".

Specifically for the 2006 Census PUMF, you would compute 8 age-standardized estimates, one using each weight variable (step 1), and then continue through to step 6.
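
As a rough illustration of step 1 only, the sketch below computes one directly age-standardized weighted proportion per weight variable. The record layout (the keys "age_group" and "in_group"), the function names and the standard population dictionary are hypothetical and chosen for illustration; only the weight names WT1 through WT8 come from the question above. Steps 2 to 6 are not reproduced here and should be taken from the user guide.

```python
from collections import defaultdict

# Weight variable names as mentioned in the question (WT1 through WT8).
WEIGHT_VARS = [f"WT{i}" for i in range(1, 9)]


def age_standardized_proportion(records, weight_key, std_pop):
    """Directly age-standardized weighted proportion.

    records    : iterable of dicts with hypothetical keys "age_group" and
                 "in_group" (bool), plus one entry per weight variable.
    weight_key : name of the weight variable to use, e.g. "WT1".
    std_pop    : dict mapping age group -> standard population count.
    """
    numerator = defaultdict(float)    # weighted count in the group, by age
    denominator = defaultdict(float)  # weighted total, by age
    for rec in records:
        w = rec[weight_key]
        denominator[rec["age_group"]] += w
        if rec["in_group"]:
            numerator[rec["age_group"]] += w

    std_total = sum(std_pop.values())
    estimate = 0.0
    for age, std_count in std_pop.items():
        if denominator[age] <= 0:
            raise ValueError(f"no weighted observations in age group {age!r}")
        age_specific = numerator[age] / denominator[age]
        # Weight each age-specific proportion by its standard population share.
        estimate += (std_count / std_total) * age_specific
    return estimate


def replicate_estimates(records, std_pop):
    """One age-standardized estimate per weight variable (step 1 only)."""
    return [age_standardized_proportion(records, w, std_pop) for w in WEIGHT_VARS]
```

The resulting list of 8 estimates is the input to the remaining steps described in the user guide.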

The method described there provides you with a 95% Wald confidence interval. It is the method most commonly used, for convenience, but it rests upon some assumptions, namely that the distribution of the estimator is normal, which tends to be inappropriate for very small proportions in smaller domains of estimation. I assume this is less likely to affect you, since standardization generally calls for larger estimation domains.
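
To make the caveat about small proportions concrete, here is a minimal sketch of a Wald interval computed from an estimate and its standard error. It assumes the standard error has already been obtained through the steps in the user guide, and it assumes the usual normal multiplier of 1.96 for a 95% interval; the guide may prescribe a different multiplier. The function name and the numbers in the example are made up for illustration.

```python
def wald_interval(estimate, standard_error, z=1.96):
    """Wald confidence interval: estimate +/- z * standard error.

    z = 1.96 is the normal quantile for a 95% interval (an assumption here;
    use whatever multiplier the user guide specifies).
    """
    half_width = z * standard_error
    return estimate - half_width, estimate + half_width


# Illustrative numbers only, not taken from any Census file.
lower, upper = wald_interval(0.35, 0.02)

# For a very small proportion the lower bound can fall below zero,
# which is one symptom of the normal approximation breaking down.
small_lower, small_upper = wald_interval(0.01, 0.008)
```

In the second call the lower bound is negative, which is not a possible value for a proportion.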