Tuesday, December 19, 2006

Participation and Activity Limitation Survey Question Re: Children

Question

A UBC hospital researcher is planning to use PALS, but her interest is in the children surveyed, not the adult population. This presents a problem because the PUMF in the DLI collection does not include children 5-14. To quote the User's Guide:

For purposes of the PUMF, analysts based themselves on the 28,908 persons aged 15 and over who responded to the survey.

The researcher has not been able to contact the person identified in the release in The Daily or that person's successor.

She would really like to know whether there will be a PUMF for the children's file. She has been in contact with the local Research Data Centre and will work there if necessary, but wants an answer as to the eventual availability of a children's PUMF.

Answer

There is no PUMF for the PALS children's file.

Some results are available in "A profile of disability in Canada, 2001", catalogue number 89-577-XIE. The Profile of disability among children is available at the following link:
http://www.statcan.ca/english/freepub/89-577-XIE/children.htm

More than likely, the researcher will need additional information not covered in that profile. The researcher's options are:

1) Access the PALS file via the RDCs
2) Order custom tabulations

Index to Senate Debates

Question

Does the Parliamentary website have an index to the Senate debates? If so, where is it tucked away?

Answer

Here is the link to it from the DSP:
http://dsp-psd.communication.gc.ca/Collection-R/Senate/index-e.html

Amount of unpaid work at retirement

Question

We have a student looking for the number of years of unpaid work people have accumulated by the time they retire, and the reasons for that unpaid work.

Answer

Two possibilities:

1) We measure the amount of unpaid work through the Census.
2) The GSS Time Use cycle has some goodies in terms of unpaid work.

These are cross-sectional data, though, and are not cumulative over a lifetime. I am not familiar with any source of data that provides the number of unpaid work years people have when they retire.

Thursday, December 14, 2006

Attitudes toward homosexuality

Question

A couple of students here have worked with U.S. General Social Survey data to look at attitudes toward homosexuality. There are a few variables in that survey that get at respondents' attitudes:

HOMOCHNG HOMOSEXUALITY: INHERENT OR CHOICE?
HOMOSEX HOMOSEXUAL SEX RELATIONS
HOMOSEX1 IS HOMOSEXUAL SEX WRONG?

They are hoping to continue this work using Canadian data, but I haven't been able to find any like variables in our GSS cycles or other surveys available in the DLI -- there are surveys dealing with sex and with whether sexual orientation is a factor in victimization, but I can't find any that measure the attitudes of the respondents themselves. Does anyone know if Statistics Canada collects, or has collected, this kind of information? The students want to relate attitudes to religion and ethnicity.

Answers

1) The World Values Survey has questions that get at this, and are comparable across a number of countries. The 2000 wave includes questions:

v76 People the respondent would not like to have as neighbours: homosexuals
v208 Behaviour ever justifiable: homosexuality

The data is available from http://www.worldvaluessurvey.org/, as well as ICPSR.

2) There's also the Charter of Rights and Freedoms data collected by CRIC. These deal with gay marriage, equality rights and discrimination. In addition, there are some Globe and Mail polls that might be helpful, also from CRIC.

Queen's has a huge number of public opinion polls in their CORA archive. Go to:
http://www.queensu.ca/cora/

3) Access to the first three waves of the World Values Survey (1981-1995) is available unrestricted via an SDA interface at:
http://nds.umdl.umich.edu/cgi/s/sda/hsda?harcWEVS+wevs

Limited online analysis of all 4 waves is available at the WVS web site:
http://www.worldvaluessurvey.org/services/index.html

It looks as if you can obtain all the data there as well.

Wednesday, December 13, 2006

Updated Products - PSES / CIES / AWP

Public Service Employee Opinion Survey - 2005

Data is available on the opinions of employees in their work environment, job satisfaction, career movement, equipment needs and special needs. Tabulations are available at the department level or at the Public Service level.

FTP: /dli/pses/2005
WEB: http://www.statcan.ca/english/Dli/Data/Ftp/pses/pses2005.htm

****
Changes in Employment Survey - Cohorts 1 to 10 (CIES)

The primary objective of the Changes in Employment Survey (CIES) is to evaluate the impact of Bill C-12 on the Employment Insurance legislation and the degree to which program objectives have been achieved. Bill C-12 was introduced into legislation in part in July 1996, with the remainder coming into effect in January 1997.

The legislation was designed to better reward work effort, to ensure adequate benefits by targeting those most in need, to encourage job creation, and to improve the perception of fairness. Specific aspects of these objectives were addressed in the survey. In addition, the survey attempts to get a measure of the aggregate impact of the legislation.

Secondary objectives of the survey include the continuation of the information collected in the 1993 and 1995 Canadian Out-of-Employment Panel Surveys. This includes collection of background demographics on the individual and the household, as well as information on job search activities and outcomes, assets and debts, expenditures, and utilization of Employment Insurance and Social Assistance.

Microdata files for the Changes in Employment Survey (CIES) are now available for five reference periods. They are: Cohorts 1 and 2: January 1995 to September 1996; Cohorts 3 and 4: July 1995 to February 1997; Cohorts 5 and 6: January 1996 to September 1997; Cohorts 7 and 8: July 1996 to February 1998; and Cohorts 9 and 10: January 1997 to September 1998.

FTP: /dli/cies
WEB: http://www.statcan.ca/english/Dli/Data/Ftp/cies/cies-cohorts1-10.htm

****
Survey of Annual Work Patterns 1978 - 1985

FTP: /dli/awp
WEB: http://www.statcan.ca/english/Dli/Data/Ftp/awp.htm

Monday, December 11, 2006

Income Variables in CCHS 2.1

Question

There's a curious difference between two income variables in CCHS 2.1; according to the codebook, for the total household income variable

INCCGHH=1 represents 'NO INCOME AND LESS THAN $15,000',

whereas for the personal income variable

INCCGPER=1 represents 'NO INCOME' and
INCCGPER=2 represents 'LESS THAN $15,000'

A student currently using the PUMF was wondering why the two categories are collapsed into one for one variable but not for the other.

Answer

The division confirmed that the categories were collapsed for confidentiality reasons.
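
For a student who needs the two variables on a comparable footing, one option is to collapse the personal income codes in the same way. The following is only a sketch: the function name and the combined label are hypothetical, and only codes 1 and 2 of INCCGPER are taken from the codebook excerpt quoted in the question.

```python
# Sketch only: collapse INCCGPER codes 1 and 2 so they line up with INCCGHH=1.
# The combined label below is an assumption, not codebook wording.
COMBINED_LABEL = "NO INCOME OR LESS THAN $15,000"


def collapse_personal_income(code):
    """Return the combined label for INCCGPER codes 1 and 2.

    Any other code is returned unchanged, because the question does not
    list the remaining categories.
    """
    if code in (1, 2):
        return COMBINED_LABEL
    return code
```

The remaining categories should be checked against the codebook before any comparison is made between the two variables.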

Friday, December 8, 2006

Seafood Consumption by Ethnic Group

Question

Can someone help me to find data on fish and/or seafood consumption (in Canada) by ethnic groups?

Answer

The Family food expenditure surveys 1992 and 1996 - summary files don't have much, but they do have:
- country of birth (more like continent of birth) and mother tongue (English, French, and other)
- weekly food expenditure on fish and other marine products

Unfortunately, the country of birth variable is no longer included in the 2001 PUMF.

There is also the pdf file:
Influence of Immigration on The Ethnic Food Market in Canada / Agriculture and Agri-Food Canada in the November 2005 edition of Canada food stats (23F0001XCB)

I checked the Canada health survey (1978-79) and the Nutrition Canada survey (1970-1972) without success - but then, these are quite old by now and there have been substantial changes to the ethnic composition of Canada since then.

You may need to send the user the RDC route, since the CCHS 2.2 nutrition component should include other foods than just the fruits and vegetables that are in the PUMF, as well as ethnicity variables in the common and optional content file.

Monday, December 4, 2006

Questions about the General Social Survey 2005, main file

Question
The weight variable for variable EPI677 is not stated - Would it be WGHT_PER?

Answer
Yes

Question
The Coverage is not stated for variables TCS_Q180 to TCS_Q200, VCG_Q300 - In all cases would it be all respondents?

Answer
Yes

Question
Is a weight variable supposed to be associated with variable INCMMEMC? None is defined. WGHT_PER is used for INCM, but WGHT_HSD might be used - which weight variable (if any) should be assigned?

Answer
Use WGHT_PER

Question
What is the definition of WGHT_CSP and WGHT_SNT?

Answer
WGHT_CSP: This is the weighting factor for analysis at the person level created using the sample of persons asked the questions in Section 10A - culture, sports participation and physical activity. For example, to estimate the number of persons who used library services as a leisure activity in the last 12 months, WGHT_CSP should be summed over all records with this characteristic. This weight is zero for respondents who were not asked this section i.e. completed Sections 10B and 11. - Used for participation in Culture activities.

WGHT_SNT: This is the weighting factor for analysis at the person level created using the sample of persons asked the questions in Sections 10B and 11. To estimate the number of persons with a particular characteristic, WGHT_SNT should be summed over all records with this characteristic. This weight is zero for respondents who completed Section 10A. - Used for participation in Sport activities.
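
The "sum the weight over all records with this characteristic" rule can be sketched in a few lines. This is an illustration only: the records and the field name used_library are made up, and only the weight name WGHT_CSP comes from the answer above.

```python
# Sketch only: a weighted population estimate is the sum of the weight
# over the records that have the characteristic of interest.
def weighted_count(records, weight_var, has_characteristic):
    """Sum weight_var over records for which has_characteristic(record) is true."""
    return sum(r[weight_var] for r in records if has_characteristic(r))


# Hypothetical records; a weight of 0 marks a respondent who was not
# asked Section 10A and therefore contributes nothing to the estimate.
records = [
    {"WGHT_CSP": 1200.0, "used_library": True},
    {"WGHT_CSP": 0.0, "used_library": True},
    {"WGHT_CSP": 950.5, "used_library": False},
    {"WGHT_CSP": 800.0, "used_library": True},
]

estimate = weighted_count(records, "WGHT_CSP", lambda r: r["used_library"])
print(estimate)  # 2000.0
```

The same function would apply to WGHT_SNT for questions asked in Sections 10B and 11.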

Wednesday, November 29, 2006

March 2006 Postal Code Conversion File

Questions

I downloaded the March 2006 PCCF from the DLI directory, and have some questions.

In the directory structure, there is a "corrected-postal-codes" subdirectory, with two files:
1) pc_corr_cp.xls - which is a list of 550 postal codes --- Is this the list of postal code corrections mentioned in the second bullet on page 5 of the codebook? Are we supposed to do something with this file, and if so, what? Is there a list of the corrections made that could advise our users, or do we simply tell them "these postal codes got fixed somehow - we don't know what was wrong, or what was corrected"?

2) problemsdpl_problemesld.txt - which has 45 lines of data, each line having a "PCODE", "DPL", and "Incorrect_DPL"
--- Is this the file of "unnecessary Designated Place - postal code linkages" mentioned in section 4.4 of the codebook?

When I look at the records in the data file, it appears that if we make the changes recorded in problemsdpl_problemesld.txt, we will have identical duplicate records for the postal codes - so I assume we are supposed to delete the records with the incorrect DPL?

I ran the duplicate checking procedures I created against the non-retired SLI=1 postal codes (theoretically the "best match" file), and came up with 48 records that were duplicated. My list of 48 postal codes includes all 45 of the ones listed in the file problemsdpl_problemesld.txt, plus postal codes V9B0A4 (DPLs 0010 and 9959), V9B6X4 (DPLs 0010 and 9959), and V9B6X5 (DPLs 0010 and 9959).

What about these 3 postal codes that seem to be duplicated - which is the "wrong" DPL (and, if I'm right in assuming, a candidate for deletion)?

In the short term, a "readme" file in the corrected-postal-codes directory would be quite useful, I think, to instruct us in the use of these two files.

3) Why isn't it possible to get this file corrected by Statistics Canada, either at the source division or at DLI, so that each DLI institution doesn't (or shouldn't) have to make the same corrections?

Answers

1) This file is explained on page 5, as you mentioned. It is a list of postal codes that were linked to incorrect geographic units but have now been corrected. You do not need to do anything with this file; the corrections are already in the Postal Code Conversion File (PCCF).

2) Yes, there were some duplicate DPL linkages created during the automated geocoding process and we have discovered a few more which include the postal codes you have mentioned below.

3) - EAC) The issue with the DPLs results from the automated geocoding process. The Postal Code Project Team is working on redesigning the geocoding system in order to reduce errors and increase data quality of the Postal Code Conversion File (PCCF). The January 30, 2007 PCCF release will contain the same DPL duplicates because we are not switching over to the new geocoding system until after this last release (based on 2001 geographic units). Once we release the first PCCF based on 2006 geographic units the DPL duplicates should no longer exist because we will be using the new geocoding system.
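
A duplicate check along the lines described in the question could look something like the sketch below. It is not the procedure the questioner actually ran: the record layout, the "retired" flag and the postal code A1A1A1 are hypothetical, while PCODE, DPL, SLI and the V9B0A4 example come from the question.

```python
from collections import defaultdict


# Sketch only: list postal codes that appear more than once among
# non-retired records with SLI = 1, together with their DPL values.
def find_duplicate_postal_codes(rows):
    """Return {postal code: sorted DPL list} for codes with more than one record."""
    dpls_by_code = defaultdict(list)
    for row in rows:
        if row["SLI"] == "1" and not row["retired"]:
            dpls_by_code[row["PCODE"]].append(row["DPL"])
    return {
        code: sorted(dpls)
        for code, dpls in dpls_by_code.items()
        if len(dpls) > 1
    }


# Hypothetical extract; only the V9B0A4 pair mirrors the question.
rows = [
    {"PCODE": "V9B0A4", "DPL": "0010", "SLI": "1", "retired": False},
    {"PCODE": "V9B0A4", "DPL": "9959", "SLI": "1", "retired": False},
    {"PCODE": "A1A1A1", "DPL": "0010", "SLI": "1", "retired": False},
    {"PCODE": "A1A1A1", "DPL": "0010", "SLI": "0", "retired": False},
]

duplicates = find_duplicate_postal_codes(rows)
print(duplicates)  # {'V9B0A4': ['0010', '9959']}
```

Deciding which of the two DPLs is the incorrect one still requires the correction file or confirmation from the source division.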

APS 2001 Off-Reserve Data and Chi-Square

Question

I have a question regarding the use of the Chi-square test, this time when using the Aboriginal Peoples Survey.

The researcher notes that "One way to correct the Chi-square for a complex sample is to divide the Chi-square produced, say by SPSS, by the design effect for the sample being used. The result is only approximate, but it works well enough for sorting out what are likely significant relationships."

The answer received for the previous question suggested how to do this for the 2001 Census PUMF of Individuals. Now, he is using the APS 2001 Off-Reserve data set but cannot find a table of design effects (or conversion factors, or quality factors, or anything that looks like values that can be turned into design effect values) in the documentation that he has for this complex sample. Has he missed something or might there be missing documentation? Do you know where this information might be, or is there some other way to do the Chi-square test using the APS?

Answer

I am assuming that the researcher is referring to the PUMF, which does not contain the bootstrap weights and so it is not possible to do this kind of test using that dataset.

My best suggestion would be for the person to get access to the analytical file(s) in the RDC. Failing that, it might be possible to do it through the custom requests route.
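
For reference, the approximate correction the researcher quotes (divide the Chi-square by the design effect) can be sketched as below. This is only an illustration of the arithmetic: the table counts and the design effect of 2.0 are made up, and, as noted above, no such value is supplied with the APS PUMF.

```python
# Sketch only: Pearson chi-square for a two-way table, then the rough
# complex-sample adjustment of dividing by a design effect.
def pearson_chi_square(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def design_adjusted_chi_square(table, design_effect):
    """Divide the unadjusted statistic by the design effect (approximate only)."""
    if design_effect <= 0:
        raise ValueError("design effect must be positive")
    return pearson_chi_square(table) / design_effect


# Hypothetical counts and a hypothetical design effect.
table = [[10, 20], [20, 10]]
adjusted = design_adjusted_chi_square(table, design_effect=2.0)
```

The adjusted statistic is compared against the usual Chi-square critical value for the table's degrees of freedom; the result remains approximate, as the researcher's own note says.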

Tuesday, November 28, 2006

Provincial Expenditures on Social Services and Health

Question

A political science graduate student is looking for historical data (back to the 1970s) for provincial governments expenditures in social services and health. CANSIM Table 385-0008 provides data from 1989 onward and Historical Statistics of Canada (H332-344. Provincial governments, gross general expenditure by function) presents data from 1965 to 1975.

Where can we find statistics for the missing period, 1976 to 1988?

Answer

Time series data prior to 1989 were removed from Statistics Canada's CANSIM database because the statistical concepts and universe coverage employed when these terminated data were produced, differed significantly from SNA 93 international guidelines. SNA 93 guidelines were implemented during the 1997 Canadian System of National Accounts (CSNA) Historical Revision and the data which Public Institutions Division currently produces and supports covering the 1989 to present period, subscribe to these new guidelines.

Unfortunately, data prior to 1989 are unsupportable and unavailable.

Monday, November 27, 2006

Updated Products - GSS 19

General Social Survey, Cycle 19: Time Use (2005)

The core content of time use repeats that of cycle 12 (1998), cycle 7 (1992) and cycle 2 (1986), and provides data on the daily activities of Canadians. Question modules were also included on unpaid work activities, cultural activities, social networks and participation in sports. The target population of the General Social Survey consisted of all individuals aged 15 and over living in a private household in one of the ten provinces.

FTP: /ftp/gss/cycle19-2005

Friday, November 24, 2006

Birth and Death Rates for Sackville, N.S.

Question

I have a student looking for birth and death rates for Sackville, Nova Scotia. This is a "Designated Place" according to the Census people. I can't seem to get to birth and death rates for anything other than provinces when using CANSIM. Sackville, N.S. is part of the Halifax Regional Municipality and is served by the Cobequid Community Health Board. She can't seem to get information from HRM or the Health Board.

Any and all suggestions are welcome. I suspect it is another one of the situations where the raw data is collected, but no calculating is done or at least published on this level of geography.

Answers

1. Birth and death statistics were included in the 1996 Community Profiles. If you go to the STC website and select "Community profiles" from the left sidebar menu, you will find a link to the 1996 Community profiles in the bottom right of this page. A search for Sackville, NS results in two choices: both Subdivision C for Halifax. Taking this link will allow you to pick "Births and Deaths" from the following page.

2. Another alternative is to use Annual demographic statistics, which contains births, deaths, and population by county (=census division). From births/deaths and population, one can compute birth and death rates. Of course, this requires assuming that the birth/death rates for the county also hold for Sackville, but it might be faster than waiting for birth/death stats to show up in the 2001 Community profiles.

3. Birth and death rates are available by county in the Nova Scotia Vital Statistics Annual Report (http://www.gov.ns.ca/snsmr/vstat/annualreports/).
Totals are also given for all incorporated areas (over 1000) but Sackville is not one of them.
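
The calculation mentioned in the second suggestion (rates from births/deaths and population) is a simple ratio. The sketch below assumes the conventional crude rate per 1,000 population; the figures are invented and are not Halifax County data.

```python
# Sketch only: crude birth or death rate per 1,000 population.
def crude_rate_per_1000(events, population):
    """Return events per 1,000 population for one area and one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * events / population


# Hypothetical county figures, for illustration only.
birth_rate = crude_rate_per_1000(events=400, population=50000)
death_rate = crude_rate_per_1000(events=350, population=50000)
print(birth_rate, death_rate)  # 8.0 7.0
```

Applying a county rate to Sackville carries the assumption noted above, namely that the county-level rate also holds for the smaller area.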

Mining FDI in Latin America

Question

A researcher here would like to get Canadian mining foreign direct investment into Latin American countries. The following CANSIM table comes closest:

Table 376-0053
International investment position, Canadian direct investment abroad and foreign direct investment in Canada, by industry and country, annual (dollars)

The closest industry is "energy and metallic minerals", but the real problem is the level of geographic detail; the only categories are US, UK, other EU countries, Japan, OECD, Japan and OECD countries, and the wonderfully evocative "all other foreign countries". Is it possible to get more detail from Statistics Canada? Alternatively, can anyone suggest a source that might provide a little more detail, especially for South America?

Answer

One could try NRCan (http://www.nrcan.gc.ca/mms/hm_e.htm).
They have statistics, a form to request information, and a link to industry associations (http://www.nrcan.gc.ca/mms/lien/mac_e.htm).

The Mining Association of Canada puts out Facts and Figures
(http://www.mining.ca/www/media_lib/MAC_Documents/Publications/English/2006_FF_Eng.pdf),
which does mention investments in general - they may have more detailed information if you contact them directly.

SLID 2000 Immigrant Variable

Question

In the 2000 Survey of Labour and Income Dynamics there is a variable "immst15" (immigrant flag). Marginals are as follows:

1 yes 3,641
2 no 11,608
7 don't know 42,192

We're assuming that the "don't know" category should actually be "not applicable" and also that it refers to persons born in Canada. Is that a valid assumption?

Answer

Immigration status is available only for persons living in urban areas with a population of 500,000 or more. All other individuals are set to "don't know" for confidentiality reasons.

Follow up

Is there not some way that another code could be used that could be specifically for "confidentiality protection"? A code of "don't know" should be reserved for persons who indicate that as their response. This undermines confidence in coding.

Wednesday, November 22, 2006

LFS (Labour Force Survey) 2006 and SPSS

Question

Is it safe to assume that the SPSS syntax file
"lfs-rebased2001-2000-2005.sps"
can be used for 2006 monthly files as well?

Answer

Yes, it appears that the same syntax file can be used for the 2006 files.

Environmental and demographic data

Question

One of our PhD's is looking to find data at the lowest geographic level possible on as wide a variety of environmental indicators (air, water and soil qualities, climate, etc) and wants to marry that with a wide range of demographics (age, gender, ethnicity, income, etc) at the same level of geography. He'd like the data from the 1990's to the present.

We've looked at several sources but can't really find anything. The Canadian Environmental Sustainability Indicators provides data back to the 1990s but only at the national and/or provincial level.

Does anyone out there know whether these data would be available at lower levels of geography? He says he's heard of data collected at either the postal code or FSA area, but we haven't been able to locate them.

Answers

1. He might take a look at the paper by Buzelli et al. (2006) in Canadian Geographer 50(3): 376-391 for a method of estimating/interpolating such environmental data at the Census Tract scale in Vancouver.

2. Have you looked at the National Land and Water Information Service for some of the environmental data? The main data page is:
http://sis2.agr.gc.ca/cansis/

You might start with the Land Potential Database. It contains "data about soil, climate, physiography, land use, modelled constraint free (potential) crop yields, actual crop yields and soil degradation for all regions of Canada".
http://sis.agr.gc.ca/cansis/nsdb/lpdb/intro.html

3. Count Statistics Canada out for the data at that level of geography - and even if data was collected at a low level of geography, it does not mean that it is reliable enough to publish at that level of geography. The count of respondents plays a very important role in what level of geography is released (census vs. survey).

4. As an added piece of information, the Health Indicators project is working with Environment Accounts and Statistics Division in order to come up with indicators related to the environment. The major challenges are the level of geography and the link to the standard geography. When you talk about air, water and other environmental indicators, the standard boundaries are not always relevant. This is a challenge that needs to be addressed and in maybe two or three years, it may be possible to see some relevant data being released but not yet.

5. It depends whether the client wants to use an environmental boundary or census boundary. Agriculture Canada will soon be releasing Ag data at the sub-sub-basin level and soil landscape level. But most data are not available at this level.

Also, it depends on the type of data he is looking for. Waste management data, for instance, are only available at the provincial level due to confidentiality.

6. A few more things your patron may want to look at:

Air quality/pollutants:
http://www.ec.gc.ca/pdb/npri/npri_preinfo_e.cfm#dbase

Canadian compendium on common air pollutants:
http://geodiscover.cgdi.ca/gdp/search?action=entrySummary&entryType=productCollection&entryId=14393&entryLang=en

General ecological:
http://geogratis.cgdi.gc.ca/Ecosystem/ecosystem.html

For climate:
http://www.cccma.bc.ec.gc.ca/hccd/
http://climate.weatheroffice.ec.gc.ca/
http://www.etcentre.org/naps/naps_data_e.html

Friday, November 17, 2006

Workplace and Employment Survey (WES) availability

Question

I have a professor looking for the 2003 employee data PUMF for the WES. I see that synthetic files are available through the FTP site - are the PUMFs available?

Answer

There is no PUMF for this survey at this time in the DLI collection. I believe we were offered a version of a PUMF, but it was not overly useful and needed to be returned to the division - I don't recall any of the details at this time.

The synthetic files seem to be your only option at this exact time, but there may be some data from the survey in the DLI collection in the near future.

Wednesday, November 15, 2006

Canadian Community Health Survey 1.1 Question

Question

I have been using the CCHS data file (cycle 1.1) PUMF and the dictionary for the PUMF. While the dictionary often lists "don't know," "refusal," "not stated" categories, the actual PUMF file only lists missing (never not stated, etc.). Is it okay to assume that all those missing are either: not applicable, don't know, refusal or not stated?

Answers

1. The CCHS 1.1 PUMF does include the range of values for instances of missing information: not applicable, don't know, refusal or not stated. We have a version of this file in SPSS that includes these values and that declares them individually as missing.

Is your patron working with SPSS or SAS? I'm guessing that she or he is using SAS. Why SAS? Because of the way users typically assign missing data in this statistical system. For example, the variable CCCA_131 records whether the respondent has cancer. The values and labels for this variable are:

1 = YES
2 = NO
7 = DON'T KNOW
8 = REFUSAL
9 = NOT STATED

I have seen many researchers working in SAS assign missing data using the following statement in a DATA step:

if ccca_131>2 then ccca_131=.;

which treats the values 7, 8 and 9 as one missing category.

SAS does allow specifying 27 special missing values by using a decimal point followed by a letter or the underscore character. Therefore, a researcher could declare each of the missing values for CCCA_131 in three statements:

if ccca_131=7 then ccca_131=.A;
if ccca_131=8 then ccca_131=.B;
if ccca_131=9 then ccca_131=.C;

All three of these values would be treated as missing but SAS would differentiate between the three types of missing information. I haven't seen many researchers take the time to write this much code for all of the variables they are using, though.
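
The same idea (treat 7, 8 and 9 as missing while remembering which kind of missing each one is) can be sketched outside SAS. The Python below is only an illustration: the function name is hypothetical, and the codes and labels are the ones listed above for CCCA_131.

```python
# Sketch only: keep the reason for missingness alongside the value,
# instead of collapsing 7, 8 and 9 into one undifferentiated missing.
MISSING_CODES = {7: "DON'T KNOW", 8: "REFUSAL", 9: "NOT STATED"}


def split_missing(value):
    """Return (valid_value, missing_reason); exactly one of the two is None."""
    if value in MISSING_CODES:
        return None, MISSING_CODES[value]
    return value, None


# Hypothetical responses for CCCA_131 (1 = YES, 2 = NO).
responses = [1, 2, 7, 9, 2, 8]
recoded = [split_missing(v) for v in responses]
valid = [v for v, reason in recoded if reason is None]
print(valid)  # [1, 2, 2]
```

Analyses then run on the valid values only, while the reasons remain available for tabulating each type of non-response.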

2. The most recent versions of Stata include support for multiple missing values, but earlier versions (before Stata) did not. So if your user converted the file using a program such as DBMSCopy, which converts to an older Stata format, then all of the different SPSS missing values would have been mapped onto a single Stata missing value. This might also happen if she converted the file to the current Stata version, but used a conversion program that didn't deal well with either SPSS or Stata missing values.

One solution might be to open the file in SPSS, undeclare the missing values, then do the conversion. After this was done the user could then declare the Stata missing values from within Stata.

3. Stat-Transfer, another data conversion program, claims to handle multiple missing values. On the first options tab, there's a set of options for handling user-declared missing values (such as SPSS has), one of which is to convert them into Stata extended missing values.

However, I ran into problems trying a simple transfer from SPSS using a dataset I had on hand - it took three different missing values on one variable, and returned a variable with only two - it combined two of the different values into a single Stata missing value. I spoke with a colleague who says that he's had the same problem.

Another option for handling user-declared missing values in StatTransfer is to "Use none" - this option simply preserves the original values of the variable. On the quick test I ran, this actually worked, and it's quicker than opening the file in SPSS and redeclaring all the missing values separately.

Friday, November 10, 2006

Amount collected by restaurant workers in tips

Question

How much is collected in income tax for tips by restaurant / bar workers and how much do they actually collect from the folks leaving the tips?

Answer

That is a very tough question! The last time Statistics Canada looked at "The Underground Economy" was in 1994...

I did find a handy paragraph that could perhaps be of help:

3.5 Tips

Tips are calculated in the national accounts as a fixed percentage of gross business receipts, varying by industry and type of service provided (3% for accommodation, 10% for meals in restaurants, alcoholic beverages and hairdressing and 15% for taxi). The upper limit of tips missing from GDP due to underground transactions can be calculated directly by applying the same percentages to the estimates skimming of receipts...

Source: The Size of the Underground Economy in Canada, catalogue 13-602E, No. 2, Statistics Canada, p.30. (Gaëtan - don't look at my citation format!)
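
The fixed-percentage rule in the quoted paragraph amounts to simple arithmetic. In the sketch below, only the percentages come from the quotation; the function name and the receipts figure are invented for illustration.

```python
# Sketch only: tips imputed as a fixed percentage of gross business receipts.
# Percentages are those given in the quoted paragraph.
TIP_PERCENT = {
    "accommodation": 3,
    "meals in restaurants": 10,
    "alcoholic beverages": 10,
    "hairdressing": 10,
    "taxi": 15,
}


def imputed_tips(gross_receipts, service):
    """Return the tip amount implied by the fixed percentage for a service type."""
    return gross_receipts * TIP_PERCENT[service] / 100


# Hypothetical receipts of $1,000,000 for restaurant meals.
print(imputed_tips(1_000_000, "meals in restaurants"))  # 100000.0
```

This gives the amount of tips assumed in the national accounts, not the amount actually reported for income tax, which is the harder part of the original question.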

I would take some time to review the entire document - it has many components which are very helpful.

Although this will not meet your exact needs, it may hold some information to help guide you in your search.

Wednesday, November 8, 2006

Expenditures on antidepressants

Question

Is there any data available on how much Canadians spend on antidepressants in a year?

Answers

Answer 1:

I am not sure Statistics Canada is the best source for this sort of data... I did perform a few searches, but the results are far from impressive.

Result # 1)

CANSIM Table 203-0008 -- Survey of household spending (SHS), household spending on health care, by province and territory, annual

One section is called: Medicinal and pharmaceutical products (Prescription medicines _and_ Other non-prescription medicines and pharmaceutical products).

As you can see, this is a very broad category and may be a poor indication of antidepressant expenditure.

Result # 2)

CANSIM Table 301-0006 - Principal statistics for manufacturing industries, by North American Industry Classification System (NAICS), annual (dollars unless otherwise noted)

Pharmaceutical and medicine manufacturing [3254]
Pharmaceutical and medicine manufacturing [32541]
Pharmaceutical and medicine manufacturing [325410]

Although you can get revenues from this table, extreme caution should be used.

1) This is for a broad category of pharmaceuticals and medicine - antidepressants are only one part
2) The manufacturer's price may not be the retail price and could underestimate the revenues/sales.
3) Some of the product could be exported and would not be reflective of the Canadian consumption model.
4) There may be additional issues with this proposed option....

Answer 2:

CIHI has a Drug Spending Database:
http://secure.cihi.ca/cihiweb/dispPage.jsp?cw_page=spend_e#drug
Who is authorized to access?

Drug Expenditure in Canada: including very aggregate data.
http://secure.cihi.ca/cihiweb/dispPage.jsp?cw_page=AR_80_E
Drug expenditure data in this report are obtained from the National Health Expenditure Database (NHEX) maintained by the Canadian Institute for Health Information (CIHI). Drug expenditure data in NHEX are macro-level data and do not allow for decomposition of prescription costs or drug classes.

For a general overview:
http://secure.cihi.ca/cihiweb/products/Drug_Expenditure_in_Canada_2006_final_web.pdf

Answer 3:

STC does not have that level of detail on antidepressants. CIHI collects information on prescriptions, but I don't think you will get details on antidepressants from them (unless they would handle it as a custom request).

CCHS 2.2 nutrition

Question

When will the nutrition portion of the Canadian Community Health Survey 2.2 be released? The last information we had said August 2006.

Does a pumf for that portion of the survey need to be so different from the pumfs produced for the Nutrition Canada Survey?

Answer

We are not sure when the PUMF on Nutrition, 2.2, will be released. The Health Division was sent back to the drawing board and is now working with Health Canada to find an appropriate roll-up that does not divulge third-party information. As I mentioned before, we have to be very careful not to release third-party information that would breach confidentiality or that could be perceived as releasing market information (brand, consumption of one product versus another, market share, etc.). This is why they need to find a roll-up that provides useful information and at the same time protects sensitive information.

The content is quite different from other surveys and confidentiality is not only concerning identifying individuals but also related to sensitive market information related to what people eat or drink.

It's a brand new perspective and Statistics Canada has to be very careful on the information released.

Monday, November 6, 2006

CCHS 2004 cycle 2.2

Question

Are the Approximate Sampling Variability Tables (ARROX_SAMP_TAB_E.PDF) for CCHS cycle 2.2 available only as a PDF file? I have a researcher who is looking for the file in Excel format.

Answer

Unfortunately, the Approximate Sampling Variability Tables are not available as an Excel spreadsheet.

Wednesday, November 1, 2006

Aboriginal Peoples Survey 2001 Adults off-reserve

Question

The User's Guide for the APS (2001) Adults Off Reserve (pg. 27), "Appendix A, Rules for calculating approximate variance", refers to "using the Excel file FindCV APS(PUMF).xls". Is the file aps2001vt.xls the file they are referring to?

Answer

You are correct. Because of our naming convention we had to change the file name from "FindCV APS(PUMF).xls" to "aps2001vt.xls".

2001 census files - can't unzip/open

Question

I've downloaded the following files from the FTP site and have received the same error for all of them when trying to unzip: "Cannot open file. It does not appear to be a valid archive. If you downloaded this file, try downloading the file again." Then when I click OK I get "Errors occurred while extracting. Do you want to view the last output?" When I say Yes I get the following message: "End-of-central-directory signature not found. Either this file is not a Zip file, or it constitutes one disk of a multi-part Zip file."

I did try downloading again but I get the same messages. I downloaded about 40 other files that worked fine. Any ideas as to whether I'm doing something wrong or whether the files are corrupted?

List of files that I can't unzip (ftp/dli/census/2001/ascii/topic-based tabulations/):

Families and Household Living Arrangements
95F0313XCB2001001
95F0314XCB2001001
95F0315XCB2001001
95F0316XCB2001001

Language Composition of Canada
95F0333XCB2001001
95F0336XCB2001001
95F0339XCB2001001

Canada's Workforce: Paid Work
95F0380XCB2001001

Answer

Thank you for bringing this to our attention. There may have been a slight problem when the files were loaded onto the FTP.

We noted one small point: the file for Canada's Workforce: Paid Work, 95F0380XCB2001001, seems to download, but the others (95F0380XCB2001001east.zip/ont.zip/que.zip/west.zip) were problematic.

I am working at getting another copy of the files. I will advise you once we have reloaded them on the FTP.

Thanks again for keeping us informed about the situation and we'll try to get this done asap.
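
For anyone who wants to check a batch of downloads before reporting a problem, a minimal Python sketch follows. It assumes the files have been saved to a local folder (the folder name "downloads" is hypothetical) and uses only the standard library zipfile module. It lists the files that cannot be read as Zip archives, which is the condition the error message in the question describes.

```python
import zipfile
from pathlib import Path


def find_bad_zips(folder):
    """Return the names of .zip files in `folder` that cannot be read as Zip archives."""
    bad = []
    for path in sorted(Path(folder).glob("*.zip")):
        if not zipfile.is_zipfile(path):
            # No end-of-central-directory record: truncated download or not a Zip file.
            bad.append(path.name)
            continue
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the first member with a bad checksum, or None.
                if archive.testzip() is not None:
                    bad.append(path.name)
        except zipfile.BadZipFile:
            bad.append(path.name)
    return bad


if __name__ == "__main__":
    # "downloads" is a hypothetical local folder holding the files fetched by FTP.
    for name in find_bad_zips("downloads"):
        print("Cannot read:", name)
```

If the same files fail again after a fresh download made in binary transfer mode, the copies on the server are the more likely cause.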

2006 Census release dates (Official re-announcement)

Please be advised that the detailed analysis, review and consultation regarding the 2006 Census release schedule and associated release dates have been completed. The originally published dates have been revised based on the impacts associated with the extension of 2006 Census field and collection activities.

The 2006 Census homepage has been modified. The link to information regarding the major 2006 Census release dates has been reinstated on the home page and the following is now presented:

2006 Census release dates
For the 2006 Census, due to the introduction of a number of automated processes, Statistics Canada envisaged releasing the population and dwelling counts earlier than in 2001 by a few weeks. However, given the tight labour market due to the strong economy in certain areas of the country (i.e. western Canada) and the difficulties that this has meant in hiring and retaining field staff, the completion of collection activities was extended by about five weeks. The impact of the extension of Field/Collection activities has resulted in adjustments to the originally published release dates. The revised release dates (as of October 31, 2006) are as follows:

Release no. 1: Tuesday March 13, 2007

  • Population and dwelling counts



Release no. 2: Tuesday July 17, 2007

  • Age and sex



Release no. 3: Wednesday September 12, 2007

  • Marital status

  • Common-law status

  • Families

  • Households

  • Housing and dwelling characteristics



Release no. 4: Tuesday December 4, 2007

  • Language

  • Immigration

  • Citizenship

  • Mobility and migration



Release no. 5: Tuesday January 15, 2008

  • Aboriginal peoples



Release no. 6: Tuesday March 4, 2008

  • Labour market activity

  • Industry

  • Occupation

  • Education

  • Language of work

  • Place of work

  • Mode of transportation



Release no. 7: Wednesday April 2, 2008

  • Ethnic origin

  • Visible minorities



Release no. 8: Thursday May 1, 2008

  • Income

  • Earnings

  • Shelter costs



We are advising that users, key stakeholders etc. who are enquiring as to the status of the 2006 Census release dates be directed to the 2006 Census home page and the link to information on release dates.

Updated Products - Geography

Please note the updated products listed below and the path to access them via the FTP.

1) Geography Boundary Files - Census 2006

The 2006 boundary files portray the official geographic limits used for census dissemination and are available for Provinces and Territories, Census Divisions, Economic Regions, Census Metropolitan Areas and Census Agglomerations, Census Consolidated Subdivisions, and Census Subdivisions. The boundary files are available in two formats: Digital Boundary Files and Cartographic Boundary Files. Digital Boundary Files depict the official boundaries of standard census geographic areas. The boundaries sometimes extend beyond shorelines into water, rather than follow the shoreline, to ensure that official limits are followed and that all land and islands are included. Cartographic Boundary Files contain boundaries of standard geographic areas that have been modified to follow shorelines. The files provide a framework for mapping and spatial analysis using commercially available geographic information systems (GISs) or other mapping software. A reference guide is included.

FTP: /dli/geography/arcinfo/
- cbf
- dbf

FTP: /dli/geography/mapinfo/
- cbf
- dbf

2) Standard Geographical Classification (SGC). Volume II. Reference Maps

The Standard Geographical Classification (SGC) is a system of names and codes representing areas of Canada. It consists of a three-tiered hierarchy - province or territory, census division, and census subdivision. This relationship is reflected in the seven-digit code. The SGC is used to identify information for particular geographical areas and to tabulate statistics. This volume provides a series of reference maps that show the boundaries, names, and SGC codes of all census divisions and census subdivisions in Canada, in effect on January 1, 2006. It also provides the names, codes and areal extent of census metropolitan areas, census agglomerations, and economic regions. A thematic map of the Statistical Area Classification (SAC) by census subdivision is included.

FTP: /dli/geography/reference-maps
- cd-csd-dr-sdr
- national-maps
- sgc-cgt
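
The seven-digit SGC code mentioned above can be read as three fields: two digits for the province or territory, two for the census division, and three for the census subdivision. A minimal Python sketch of that split follows; the function name, the class name and the sample code are illustrative assumptions rather than anything taken from the SGC publication.

```python
from typing import NamedTuple


class SGCCode(NamedTuple):
    province: str             # first 2 digits: province or territory
    census_division: str      # first 4 digits: province + census division
    census_subdivision: str   # all 7 digits: the full SGC code


def split_sgc(code: str) -> SGCCode:
    """Split a seven-digit SGC code into its three hierarchical levels."""
    if len(code) != 7 or not code.isdigit():
        raise ValueError(f"expected a seven-digit code, got {code!r}")
    return SGCCode(code[:2], code[:4], code)


# Illustrative code only; 35 is the province code for Ontario.
print(split_sgc("3520005"))
```

Keeping the codes as strings preserves leading zeros, which matter for census subdivision numbers such as 005.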

Friday, October 27, 2006

Household and the Environment Survey (HES): 3881

Question

A patron found the publication "Households and the environment 1994" (11-526) and is interested in the data which were used for this publication, which appears to be the Household(s) (and the) Environment Survey (HES) conducted in 1991 and 1994 (and again earlier this year). Any chance that they might be made available under DLI?

Answer

These files are already present on the FTP site, though it is not obvious.

The Household and Environment Survey was released as part of the Survey of Consumer Finances HIFE component:

The Microdata file for SCF 1991 (1990 Income) has data for:
1) Survey of Consumer Finances, 1991
2) Household Facilities and Equipment Survey, 1991
3) Shelter Cost Survey, 1990
4) Environment Survey, 1991

And the microdata file for SCF 1994 (1993 income) has data for:
1) Survey of Consumer Finances 1994 (1993 Income)
2) Household Facilities and Equipment Survey, 1994
3) Household Environment Survey, 1994

Trade Data

Question

A prof is asking for the following:

- Canadian exports by country by 2 digit NAICS, for 1988-present, at a quarterly frequency; also similar data but for manufacturing at the 3 digit NAICS level

- Canadian exports at the 4-digit SIC level to the U.S., U.K., Japan, China, France, Germany, and the Netherlands (there might be a few more later), from 1985 to the latest possible year (as I think SIC data ends in 2002), at an annual frequency.

I'm at a loss here; the prof said he was referred to me by someone at STC, which suggests that I should have access to what he wants, but I'm not finding anything. Suggestions would be welcomed!

Answer

Although the data are not collected at the NAICS level, they are collected at the SCG/HS/CT code level. We do have an Excel spreadsheet that gives the correspondence between HS/CT codes and NAICS (it is found on the FTP under dli/standard_classifications/ct-naics). Documentation is available that explains it in a little more detail.

The spreadsheet maps to the five or six digit NAICS - not two as requested.
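
Because NAICS codes are hierarchical, values assigned to five- or six-digit codes can be rolled up to a coarser level by truncating the code. A minimal Python sketch follows. The records and values are made up for illustration, and the input is assumed to be export values that have already been assigned a NAICS code through the concordance spreadsheet.

```python
from collections import defaultdict


def roll_up(records, digits):
    """Sum values by the first `digits` characters of each NAICS code."""
    totals = defaultdict(float)
    for naics, value in records:
        totals[str(naics)[:digits]] += value
    return dict(totals)


# Hypothetical (NAICS code, export value) pairs, not real trade data.
sample = [("325410", 120.0), ("325110", 80.0), ("336110", 300.0), ("111110", 50.0)]

print(roll_up(sample, 3))  # three-digit subsectors
print(roll_up(sample, 2))  # two-digit sectors
```

Note that manufacturing spans the two-digit codes 31, 32 and 33, so a total for manufacturing has to add those three groups together.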

Canadian Time Use Pilot Study 1981

Question

I have had a request from a researcher for the Canadian Time Use Pilot Study 1981. It doesn't appear in the DLI collection, and hasn't been mentioned on the dlilist. Given that it is a pilot study, I am not optimistic that there will be a PUMF. Would it be possible to get confirmation if there is a PUMF for this survey, and if there should be, might we get it through the DLI?

Answer

The PUMF was not released by STC for the Canadian Time Use Pilot Study (1981).

Alternative to QWIFS

Question

We have been subscribing to QWIFS, which I have just learned is being phased out. Can someone remind me what other institutions are offering similar access to DLI survey data? I'd like to look into the options.

Answers

1. The alternatives that I know of are:
IDLS (UWO)
SDA (U of T)
Sherlock (Quebec - but is being phased out and they are moving to IDLS until another solution is identified)
LANDRU (University of Calgary, but I am not sure if it is a subscription process)

2. Through a University grant the LANDRU interface has been improved (enhanced variable search capability and .por output files, among other things) and we are madly adding content to catch up for the time when LANDRU was not being updated. We're getting there. There is no charge for using the LANDRU system, certainly for this fiscal year.

3. We also subscribe to UT/DLS Microdata analysis and subsetting service.
See: http://www.chass.utoronto.ca/datalib/major/sda.htm
We allow off-campus access via our proxy setup.

A password is not required and it does the same sort of analysis and subsetting as QWIFS but uses SDA instead of SPSS. Our faculty like it and it does neat graphs.

GPA

Question

A researcher here wants to look at connections between grade point average and socio-economic status. I've searched CANSIM, DLI collection, Statistics Canada web site and the DLI list archives, finding no results for GPA or "grade point average". Does anyone know of any surveys that collect this information? Here is the researcher's question:

"We would also like to know whether they have any survey data that collects information on GPA of students in Universities and Colleges along with their socio-economic background."

Answer

I think that you'll want to look at microdata sources and not aggregate sources, such as CANSIM. For example, the Student Financial Survey, 2001-02 has grade point average in it. This survey was conducted by EKOS Research Associates for the Canada Millennium Scholarship Foundation.

Updated products - IALSS

International Adult Literacy and Skills Survey (IALSS)- 2003

The 2003 International Adult Literacy and Skills Survey (IALSS) is the Canadian component of the Adult Literacy and Life Skills Survey (ALL). The main purpose of the survey was to find out how well adults used printed information to function in society. Survey data include background information (demographic, education, language, labour force, training, literacy uses, information and communication technology, income) and psychometric results of respondents' proficiency along four skill domains: prose and document literacy, numeracy and problem-solving.

WEB: http://www.statcan.ca/english/Dli/Data/Ftp/ialss.htm
FTP: dli/ialss/2003

Follow up

This survey has two components:
1) International component: Adult Literacy and Life Skills Survey (ALL)
2) Canadian component: International Adult Literacy and Skills Survey (IALSS)

The data accessible in the DLI collection are from the international component and will therefore be accessible under ALL.

Once the data for the Canadian component become available, they will be accessible under IALSS.

In order to facilitate access, the two surveys on the FTP and web sites will be cross-referenced.

Imports / Exports by commodity

Question

The latest issues of both Imports by commodity and Exports by commodity have been released by Stats Can in the last few days. In the SC online catalogue, "Information for libraries" says that the Exports by commodity CD-ROM is available to DLI.

On the other hand, Imports by commodity is apparently not available to DLI.

Are both these statements correct? If so, why can we get Exports but not Imports?

Answer

I just received an e-mail from Tony Moren explaining that there was an error in the DLI yes/no flags in the catalogue. According to Tony, the product flag should state DSP-yes, DLI -no.

Both imports and exports products will be made available through the DSP program.

CANSIM Tables

Question

I have been exploring the Stats Canada CANSIM Tables. I have a class of Nursing PhD students coming on Tuesday October 31, and I am not sure what to say about CANSIM. Sometimes I discover a CANSIM Table and it is free. Then, I take another path (to get back to the table quickly), and the same CANSIM Table is only available for a price.

I would like to understand what is going on with the CANSIM Tables and how to systematically find the free access paths to the CANSIM Tables. How many of the CANSIM Tables are free and under what conditions? How can we know in advance if the table is available free and discover the path to the free version? At this point, it all seems rather hit and miss.

Here are some examples:

EXAMPLE A: (the EXE links fail, but you can still follow the path)
When I browse to Table 102-0535, via the following steps, I am asked to pay $3.00 to see the table.

  1. http://www.statcan.ca/menu-en.htm (Stats Canada Home page)

  2. http://cansim2.statcan.ca/cgi-win/cnsmcgi.exe?CANSIMFile=CII/CII_1_E.HTM&RootDir=CII

    (Welcome to CANSIM)

  3. Select Browse by Subject

  4. http://cansim2.statcan.ca/cgi-win/cnsmcgi.exe?LANG=E&ROOTDIR=CII/&RESULTTEMPLATE=CII/CII_Subj&CIISubj=2966

    (Health)

  5. http://cansim2.statcan.ca/cgi-win/cnsmcgi.exe?LANG=E&ROOTDIR=CII/&RESULTTEMPLATE=CII/CII_FLst&CIITables=1887
    (Diseases)

  6. http://cansim2.statcan.ca/cgi-win/CNSMCGI.EXE?regtkt=&C2Sub=&ARRAYID=1020535&C2DB=&VEC=&LANG=E&SDDSLOC=&ROOTDIR=CII/&RESULTTEMPLATE=CII/CII_PICK&ARRAY_PICK=1&SDDSID=&SDDSDESC=

    (Table 102-0535)

  7. Select parameters and eventually arrive at a page that says: 1 payable series @ $3.00


EXAMPLE B:
When I browse to the same table the following route, I can access Table 102-0535 for free, as follows:

  1. http://www.statcan.ca/bsolc/english/bsolc?catno=84-208-X
    (Causes of Death Main Page)

  2. http://www.statcan.ca/english/freepub/84-208-XIE/84-208-XIE2005002.htm
    (Causes of Death html)

  3. http://www.statcan.ca/english/freepub/84-208-XIE/2005002/tables.htm
    (Data Tables)

  4. http://www.statcan.ca/english/freepub/84-208-XIE/2005002/tables.htm#15
    (Pregnancy, childbirth and the puerperium (O00 to O99))

  5. select CANSIM and a new window opens
    (titled http://cansim2.statcan.ca - CANSIM Table 102-0535)

  6. select parameters and continue

  7. select output format

  8. a table appears with the numbers


EXAMPLE C:
As in B above, when I start from a publication that has CANSIM tables referenced as a Related Product, I can get to the Table:

  1. http://www.statcan.ca/bsolc/english/bsolc?catno=83-237-XWE#formatdisp
    (Residential Care Facilities Product Main Page)

  2. http://www.statcan.ca/english/freepub/83-237-XIE/83-237-XIE2006001.htm
    (html)

  3. http://www.statcan.ca/english/freepub/83-237-XIE/2006001/related.htm
    (Related Products)

  4. Select Table 107-5501. I can select parameters and a table is displayed, BUT if I try to access this same table by following a browse path via the SUBJECTS on the STATS CANADA home page, as described in EXAMPLE A, the notice to pay $3.00 appears.


If I happen to luck out, and find myself at an html publication featuring CANSIM links (as in Examples B & C above), I seem to be able to access the CANSIM TABLES for free. But if I want to get to these same Tables directly using the Subject Browse path on the StatCan Home page, I seem to have to pay when I get to the CANSIM Table. So, this is all very confusing and seems inconsistent.

As I mentioned above, I have a class of Nursing PhD students coming on Tuesday October 31. I am not sure what to say about CANSIM. Well, I can say that when I do stumble on a free CANSIM Table, the interface is quite wonderful, so I am very motivated to understand how to know about, and get to, the free CANSIM Tables at StatsCanada.

Answer

The Health Division of Statistics Canada is trying to make as much of its data as possible available free of charge to its users. As CANSIM is a pay-for-use product, a bypass could not be achieved within that databank. However, by creating an HTML document with links directly to CANSIM tables, the division has enabled users to access the statistics they need without paying.

The HTML publication which provides the by-pass to CANSIM for health statistics is entitled "Health Indicators" (catalogue number 82-221-XIE, click on "view" from the main box on the page, and then use the left-hand menu to select "Data tables and maps").

National Accounts also has a similar set-up to by-pass payment to some economic tables. To access this module, visit the Statistics Canada main page (www.statcan.ca and continue in English). At the bottom right of the page, we can browse Statistics by subject - click on National Accounts, then you will see a little line at the bottom "Also available: free data tables and comprehensive information from the National Economic Accounts module" - click on the National Economic Accounts link and click on data tables from the main page.

Let us also remember that many series are available in CANSIM on E-STAT, which has free access as well. If the time series do not need to be "fresh" (the E-STAT CANSIM database is only updated once a year, during the summer months), E-STAT has a practical user interface and usually meets simple student needs.

Longitudinal Administrative Databank

Question

A faculty member is interested in the LAD. Past messages on the list have noted that it's available as a custom tabulation only. Would the researcher be able to access it through a Research Data Centre, or not?

Answers

1. A listing of the files available in the RDCs is found at the following link:
http://www.statcan.ca/english/rdc/whatdata.htm.
I did not see LAD in the listing unfortunately.

Perhaps you could contact your local RDC to ask whether you can access the file through their program. The RDC analysts' contact information is found at the following link:
http://www.statcan.ca/english/rdc/network.htm

2. I just checked with Donna Dosman (U of A RDC), who happened to be down here today, and she advises that the LAD will never be available, even in the RDCs. She indicated that it is so confidential that probably only five or six people in all of Statistics Canada have access.

Ethnic Diversity Survey

Question

I have a student wanting to work with the Ethnic Diversity Survey. She wants to cross-reference survey responses to where the participants lived - either the name or the size of the town. I understand that Stats Can considered and recorded this information when determining who would be asked to participate in the survey, but can my student get at that information? If so, how? W/Could she get this information at an RDC?

She is looking at the influence of the size of cultural communities on the speed of immigrants' integration into the wider society. She wants to compare immigrants/immigrant groups in smaller towns to larger urban centres.

Answer

I took a look at the PUMF's codebook for some extra help on this one.

The level of geography reported in the PUMF is CMA (Toronto, Montreal, Vancouver, Other CMA and non-CMA). Only 42,476 people responded to the survey, which will make it extremely difficult to find data at a low level of geography when we consider Statistics Canada's confidentiality rules.

I took a look at the RDC's collection
(http://www.statcan.ca/english/rdc/whatdata.htm)
and EDS is listed as an available dataset. Once again, the assessment of place of residence may be a touch tricky due to the small size of the sample.

NLSCY available via DLI

Question

For the National Longitudinal Survey of Children and Youth, are the cycles available to us via DLI longitudinal in nature, or are they cross-sectional versions of the longitudinal files available in the RDC?

Answer

The NLSCY public files are cross-sectional; the synthetic files are longitudinal, but may only be used for testing programs, not for analyses.

Profit projections

Question

a) Does Stats Canada do any projecting of profits in the future by industry?

b) BTW, I'm looking for profits of pharmaceutical firms for the past 10 years. So far, all I can determine is to use CANSIM (Tables 301-0003 and 301-0006) to get revenues and then subtract cost of goods sold and employee costs to get profits. If anyone has other ideas, I would really appreciate it. There is a NAICS code for these firms (325410).

Answers

1. a) As far as I know, Statistics Canada does not project or forecast profits by industry. The only projections we actually produce are population projections.

2. b) If you subscribe to the Financial Post corporate database, that does contain a profit field (expressed as a percentage).

3. b) Another alternative is to use Strategis's Performance Plus. Industry Canada's web site has a decent interface and your user can pull some stats for 1993, 1995, 1997, 2000, and 2002. The profit aspect is already calculated for you and you get a lot of information of interest (COGS, salaries, etc.).

You can access Performance Plus from the following web address:
http://www.sme.ic.gc.ca
and click on "Create your own profile" from the main body of the page. Stats for 1993 - 1997 are based on SIC codes and those for 2000 and 2002 are based on NAICS. You can find the concordance between the NAICS code and the SIC code by using the following web page:
http://www.statcan.ca/english/Subjects/Standard/concordances/naics02-to-sice80.htm

(from what I can tell, 325410 is divided into 1) E3712 *Industrial Organic Chemicals n.e.c. and 2) E3741 * Pharmaceutical and Medicine Industry.)

Statistics Canada's Small Business Profiles are also part of the DLI collection, but it sounds like the Performance Plus approach should meet your needs for now.

E-Stat Links

Question

The location of E-stat has changed and we're updating our links. Should we link to the "accept and enter" license page at
http://www.statcan.ca/english/Estat/licence.htm

Or is it acceptable to link right to E-Stat at
http://estat.statcan.ca/cgi-win/cnsmcgi.exe?LANG=E&ESTATFILE=/ESTAT/DATA.HTM
?

I would rather do the latter - they will be proxied links so off-campus users will have to enter their Malaspina student or employee IDs and PINs to access E-stat.

Answers

1. There are at least two factors to take into consideration regarding a local policy of jumping beyond the E-Stat licence page at your institution. First, E-Stat is available to the public through participating DSP libraries. These libraries need to stipulate the terms of the E-Stat licence for public usage, and this seems most easily done through the E-Stat licence page. Users need to be informed of the licensing terms, and while this could be done by providing your own licence page that then jumps beyond the E-Stat licensing page, I don't see any advantage to this approach.

Secondly, E-Stat is also available through DLI and many of us use our own pages to inform our campus users about the terms of the DLI licence. If your institution is a member of _both_ the DSP and DLI, you will want to decide if you should treat these two user communities (i.e., the public and your DLI constituents) differently in how you inform them about the terms of the E-Stat licence. All of our campus users are also part of the public; so you might decide to use the E-Stat licence page and treat both constituencies as one, namely, the public.

Alternatively, you may decide to provide separate access points for your public and campus users or your institution may not be participating in the DSP access to E-Stat. If you are providing access to E-Stat only under the DLI licence and you provide your own prior page describing the terms of the DLI licence, you might decide (for reasons as you indicated below) to jump beyond the E-Stat licence page.

The key is that our users be informed of the terms of the licence prior to using the actual product and, having been informed, their next steps in using the product implies that they agree to abide by these terms.

The DLI Executive group responsible for interpreting licence questions has not yet formally addressed your inquiry; so this statement simply expresses my opinion. But it is the line of argument that I'd take within this group.

2. We can get straight to E-Stat from the university through the proxy server. Transparent to the user.

From home, you need an id and password. Click and enter doesn't work.

3. Last year, there were three tabs on the main page: Overview, Articles and Data. This year, Overview and Articles were combined into a single tab whose file name is Art, and Over is gone. We also made Data the "landing" page; last year the landing page was Overview. One of the reasons for all of these changes is that, now that most publications are free, there is much less need to have E-STAT house its own version of many pages that were for fee last year.

Please link to the E-STAT launch page at:
http://www.statcan.ca/english/Estat/licence.htm.
Students should always enter via the licence page, accepting the agreement. From the E-STAT side-bar, users can access User Guides, CANSIM, census and Search map 2001.

4. First-time users should read the licence and click Accept and Enter. Then, from the Table of contents, they can click on Search CANSIM, Search Census and Search map on the E-STAT side-bar.

Frequent users can Accept and Enter the terms of the licence or go directly to the E-STAT side bar.

We designed the page this way so that high school students would not be intimidated by the entry process. We expect high school teachers to explain the licence when introducing E-STAT to their students.

Chi square test and census pumf individuals

Question

A researcher has used a stratified cluster sample for Census 2001 Individuals data. As he explained it to me and as I think I understand it, this type of sample cannot be analyzed with SPSS. He mentioned that in censuses of the past there were certain tables that were used but which have now been replaced with a conversion factor.

He has studied the documentation around page 180, but there is no explanation of how to analyze the results by using the chi square test which is the one the researcher has planned to use. Can anyone help with this?

Answer

Our methodologist has provided the following information that was requested.

If the researcher took his sample from the 2001 PUMF, then the conversion factors are the factors he's looking for (these are design effects). In previous censuses, we called them "Quality Factors", but they are the same thing. Their function is to convert, by squaring them, simple random sampling variances into variances obtained under the PUMF design plan. We modified the way they were introduced to simplify basic estimation.
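
In arithmetic terms, the paragraph above says that a variance computed as if the data were a simple random sample is multiplied by the square of the conversion factor (equivalently, the standard error is multiplied by the factor itself). A minimal Python sketch for an estimated proportion follows. The proportion, sample size and factor are made-up values, and the real factor has to be taken from the PUMF documentation.

```python
import math


def design_standard_error(p, n, conversion_factor):
    """Standard error of a proportion, adjusted for the sample design.

    p: estimated proportion; n: unweighted sample size;
    conversion_factor: factor from the user documentation (assumed input).
    """
    srs_variance = p * (1.0 - p) / n  # simple random sampling variance
    design_variance = srs_variance * conversion_factor ** 2
    return math.sqrt(design_variance)


# Made-up inputs, for illustration only.
print(design_standard_error(p=0.20, n=1000, conversion_factor=1.5))
```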

What the researcher is asking for is not simple, though. In fact, chi-square statistics computed as if the data came from a simple random sample are biased because of the PUMF sampling plan (as the researcher rightly noted). To convert these statistics into chi-squares that take the sampling plan into account, he has to divide his chi-square statistics by the corresponding design effects (conversion factors). But this is only an approximation (see [1], pp. 500-507). The resulting statistics will behave better under the null hypothesis.

References:
[1] Särndal, C.E., Swensson B., Wretman J. Model Assisted Survey Sampling. Springer-Verlag. 1992.
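
The adjustment described in this answer amounts to computing the usual Pearson chi-square and dividing it by a design effect. A minimal Python sketch follows, using only the standard library. The table counts and the design effect are made-up values. Whether the divisor should be the published conversion factor or its square is an assumption to check against the user documentation and the reference above, so the function simply takes the divisor as an argument.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a two-way table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def adjusted_chi_square(table, design_effect):
    """Divide the simple-random-sampling chi-square by a design effect."""
    return pearson_chi_square(table) / design_effect


# Made-up counts and design effect, for illustration only.
counts = [[30, 20], [20, 30]]
print(adjusted_chi_square(counts, design_effect=1.6))
```

The adjusted statistic would then be compared with the chi-square distribution on the usual degrees of freedom, (rows - 1) x (columns - 1), keeping in mind that the result is only an approximation.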

Survey of Electronic Commerce and Technology (SECT)

Question

A faculty member has asked me for the most recent "full" SECT report. I referred him to the April 20, 2006 issue of The Daily and the CANSIM tables mentioned in the text. However, it turns out that he is interested in getting access to Atlantic Canada and SMEs specifically while Canada seems to be the only level of geography available from the CANSIM tables to which I have access.

Answer

If the level of geography you seek is not available through CANSIM, perhaps the author division can create a customised product with the level of geography desired. Unfortunately, if the data are available, there will be a fee for this.

Official release of the 2006 Standard Geographical Classification (SGC) and 2006 Boundary Files

Official release of the 2006 Standard Geographical Classification (SGC) and 2006 Boundary Files - October 18, 2006

The 2006 Standard Geographical Classification and the 2006 Boundary Files for selected geographies are released today. Both products are available free of charge in electronic format only:
http://www.statcan.ca/start.html.

The final version of the 2006 SGC Volume I and the print version of the 2006 SGC Volume II will be released on January 16, 2007. Boundary files for the remaining geographic units (except urban area and designated place UA/DPL) will be released on February 14, 2007.

2006 Standard Geographical Classification (SGC):


The Standard Geographical Classification (SGC) is a classification of geographical areas used to collect and disseminate statistics. The 2006 edition replaces the 2001 edition as the official classification for geographical areas for the 2006 Census and other Statistics Canada surveys.

The classification is organized in two volumes: Volume I (Preliminary) The Classification and Volume II Reference Maps.

Volume I (preliminary) released today, contains tables of the names and codes of standard geographical classification units, organized by province and territory and by metropolitan area. Designed as a reference and coding tool, Volume I (preliminary) is available in PDF (12-571-PIE) and HTML (12-571-PWE) formats. The final version of Volume I, to be released on January 16, 2007, will contain additional tables as well as concordances between SGC 2001 and SGC 2006.

Volume I can be accessed at:
http://www.statcan.ca/english/Subjects/Standard/sgc/2006/2006-sgc-index.htm

Volume II (12-572-XWE) contains reference maps showing boundaries, names, codes and locations of the geographical areas in the classification. The reference maps show census subdivisions, census divisions, census metropolitan areas, census agglomerations, and economic regions. The maps can be downloaded for free in PDF format. On January 16, 2007, Volume II will also be released in print version.

Volume II can be accessed at:
http://www.statcan.ca/english/Subjects/Standard/sgc/2006/2006-sgc-index2.htm
The maps can also be accessed directly at:
http://geodepot.statcan.ca/Diss2006/Maps/SGC2006_e.jsp

A summary reference guide explaining the methodology behind the creation of these maps is also available (in PDF (92-149-GIE) or HTML (92-149-GWE) format):
http://geodepot.statcan.ca/Diss2006/Reference/Freepub/92-149-GWE/92-149-GWE2006001.htm
Please note that the summary reference guide released today will be undergoing improvements and enhancements in order to expand content. Subsequent versions of the reference guides will accompany the upcoming reference map releases in January and March 2007.

2006 Boundary Files:

The 2006 Boundary Files (92-160-XWE) portray the geographical limits used for census dissemination and provide a framework for mapping and spatial analysis. The geographical areas covered are those of the 2006 Standard Geographical Classification. There are two types of boundary files: digital and cartographic. Digital files depict the full extent of the geographical areas, including the coastal water area. Cartographic files depict the geographical areas using only the major land mass of Canada and its coastal islands. The files are available in three formats: ArcInfo® (.shp), MapInfo® (.tab) and Geography Markup Language (.gml).

Boundary Files are available for the following geographic units:

* Province/Territory
* Economic region
* Census division
* Census metropolitan area/Census agglomeration
* Census consolidated subdivisions
* Census subdivisions

Three supplementary hydrography layers are also available as a reference layer for the Cartographic Boundary Files (CBF). The following three hydrography layers have been released:

* Lakes and rivers (polygon)
* Rivers (line)
* Coastal waters (polygon)

The 2006 boundary files (3 formats) can be downloaded for free from the following link and are only available for download at the national level:
http://geodepot.statcan.ca/Diss2006/DataProducts/BoundaryFiles_e.jsp

A summary reference guide is also available (in PDF or HTML format):
http://geodepot.statcan.ca/Diss2006/Reference/Freepub/92-160-GWE/92-160-GWE2006001.htm
Please note that the summary reference guide released today will be undergoing improvements and enhancements in order to expand content and will be reissued in November. Subsequent versions of the reference guides will accompany the upcoming boundary file releases in February and March 2007.

Please note:

1) Users of Netscape and Firefox Internet browsers may have experienced difficulties with the download of the 2006 Road Network File. This technical issue has now been resolved on all download pages; however, if users continue to experience difficulties, please contact GEO-Help at 1-613-951-3889 to obtain a copy of the data on CD-R.

2) MapInfo users with versions that pre-date MapInfo version 8.0 may experience difficulties opening the Province/Territory and Economic Region cartographic boundary files. This is because earlier versions of MapInfo have lower limits on the number of nodes per object and/or on multi-polygon objects per record.

3) Any boundary challenges proposed as a result of this release will be dealt with on a case-by-case basis. For more information please contact GEO-Help at 1-613-951-3889.

Upcoming releases:

January 16, 2007: Reference maps - CSD/DA (without UA/DPL), C(M)A/CT (without UA/DPL), CT/DA, Non-tracted CA/DA (without UA/DPL)

February 14, 2007: Reference products (Illustrated Glossary, Geography Catalogue, Census Dictionary), Boundary files (CT, DA, DB (Dissemination Block), FED2003), Ranked Road Network File (RRNF), Correspondence Files, and Geographic Attribute File (GAF) and GeoSearch (without population and dwelling counts).

March 13, 2007: Population and dwelling counts - highlight tables, Thematic release maps - population and dwelling counts, Geography catalogue (2nd edition), Reference maps - CSD/DA (with UA/DPL), C(M)A/CT (with UA/DPL), Non-tracted CA/DA (with UA/DPL), Boundary Files (UA & DPL), GeoSuite, and Geographic Attribute File (GAF) and GeoSearch (with population and dwelling counts).

Thursday, October 26, 2006

Aboriginal Youth Statistics

Question

I have a prof who is looking for educational statistics for Aboriginal youth since 2001: specifically the number of graduates nation wide, by year.

I did a fair bit of research on Aboriginal youth in the spring, but couldn't find anything that wasn't based on the 2001 Census or the 2001 Aboriginal Peoples Survey.

Does anyone know of any stats that might be available?

Response

I did perform a search with our Education Division and the data are not available through them. I am not sure who would collect that information outside the Census and the Aboriginal Peoples Survey.

Provincial Income Tax Rates

Question

Is there a publication or info source where a student could find the provincial personal and corporate tax rates since 1990?

Would this involve going to each provincial revenue department and making an inquiry?

Answers

1. Personal back to 1998:
Personal -
http://www.cra-arc.gc.ca/tax/individuals/faq/taxrates-e.html#provincial

Current corporate -
http://www.cra-arc.gc.ca/tax/business/topics/corporations/rates-e.html

2. I usually go to print sources for this question. Most of the annotated income tax acts (e.g. Butterworths, CCH) include a few years of the rates at the beginning of the book. It means going to a few different yearly editions, but at least all the provinces are in one place!

Preview of products and services, 2006 Census

The 2006 Census dissemination project is pleased to announce the official release of the Preview of products and services, 2006 Census, catalogue no. 92-565-XWE.

The Preview of products and services offers a complete overview of the proposed products and services that will be released based on the 2006 Census of Population and 2006 Census of Agriculture results. Information (where applicable) will include major characteristics and content, "What's new?" in comparison to 2001, levels of geography, availability/delivery methods, release timeframe and pricing.

The preview is now exclusively an Internet product for 2006 and is no longer available in a formalized print format (i.e. newsletter publication); however, a "print-friendly" format is available via the Internet. This product will be updated periodically as details regarding products and services become finalized.

As of 8:30 a.m. (Ottawa time) on October 17, 2006, this product will be available in HTML and PDF on the Internet and can be found by clicking on the Census button located on the sidebar of the Statistics Canada home page, under the heading Recent releases, as well as from the 2006 Census homepage.

Thursday, October 12, 2006

Health survey data linkage

Question

A UBC researcher is interested in the fact that the National Population Health Survey and the Canadian Community Health Survey both contain questions in the final administrative section as to whether respondents are willing to allow linking to their recorded use of provincial health services and whether they are willing to permit sharing the survey data with the province. The researcher is specifically interested in knowing the percentage of respondents who agree to this for each survey. She used the term 'data linkage permission rates.'

Is this something that can be found?

Answer

The percentage of respondents who agree to share varies according to the survey and the cycle, but in general, around 95% of them agree. As for the percentage of respondents who agree to link, about 95% of those who agreed to share say "yes" to the question. Note that the data for these questions are in the confidential files (Master and Share), not the PUMF. If the researcher is interested in accessing these files, please inform her of the usual ways (RDCs, remote, custom tabs, etc.).
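As a rough illustration of how the two figures combine, the sketch below multiplies them to estimate the share of all respondents who agree to both sharing and linking. It assumes, as the answer describes, that the linking rate applies only to those who agreed to share; the 95% values are the approximate figures quoted above, not exact rates for any survey or cycle, and the function name is made up for this example.

```python
def combined_rate(share_rate, link_rate_given_share):
    """Fraction of all respondents who agree to both sharing and linking.

    share_rate: fraction of respondents who agree to share.
    link_rate_given_share: fraction of those sharers who also agree to link.
    """
    return share_rate * link_rate_given_share


# Approximate figures from the answer above; actual rates vary by survey and cycle.
overall = combined_rate(0.95, 0.95)
print(f"Roughly {overall:.0%} of respondents agree to both")
```

With these approximate inputs the combined rate comes out at roughly 90%; exact rates would have to come from the confidential files.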

Monday, October 2, 2006

CCHS 3.1 Documentation

Question

We have been looking at the documentation for the CCHS cycle 3.1, sub-sample

Something does not quite match, namely the topical index and the data dictionary. For sub-samples 2 and 3 the number of pages referred to in the topical index equals the number of pages in the data dictionary. Not so for sub-sample 1: the data dictionary is 313 pages long and the "CCHS Cycle 3.1 - Sub-Sample 1: Topical Index, Public Use Microdata File" states that the weights are on pages 621 and 622. Obviously the variables found in the data dictionary and its index do not match up.

It is more similar to, but still not the same as, the topical index for the master file - 12 month, which I found on the website at
http://www.statcan.ca/cgi-bin/imdb/p2SV.pl?Function=getDocumentation&AC_Id=27501&AC_Version=4&ul=ul&lang=en&db=IMDB&dbg=f&adm=8&dis=2


Could someone please double-check this?

Answer

The file that you are referring to is "English DD Topical Index.pdf". This file belongs to the synthetic file(s).

Please use "English DD Alpha Index.pdf" until this Topical Index file is replaced.

Sunday, October 1, 2006

Downloading data files from Firefox or Netscape

Obviously something has changed inside Statcan so that now only Internet Explorer can bring up the HTML page explaining that DLI contacts can download data files.

I have temporarily taken out the warning page until I can figure out what has changed.

The Web will still check IP, userid and password before the data can be downloaded.

Friday, September 29, 2006

Updated Products - CBP

Canadian Business Patterns - June 2006

The Canadian Business Patterns contains data that reflect counts of business establishments by: 9 employment size ranges, including "indeterminate" (as of December 1997); geography groupings: province/territory, census division, census subdivision, census metropolitan area and census agglomeration; and Standard Industrial Classification which classifies each establishment in Canada into a specific industry (tables at the 1, 2, 3 and 4-digit level). Since the December 1998 reference period, these data are also presented using the North American Industry Classification System (tables at the 2, 3, 4 and 6-digit level). A concordance table showing the relationships between both classification systems is included with the product.

WEB: http://www.statcan.ca/english/Dli/Data/Ftp/cbp.htm
FTP: dli/cbp/2006

Question & Answer

Q: Why on the web version is there just a small file stating that it has to be downloaded by DLI contacts? This really doesn't allow us to download it from the web using a password. There is no way of downloading it from there, you have to go to the DLI FTP site to actually get it. I went to the FTP site and got the file, but it seems that the web link is actually misleading.

A: When you click on a file with a little lock next to it (data files), it brings you to another web page explaining the restricted access. There is a link at the bottom of the explanation page to access the collection. When you click on the link, a pop-up screen appears and requests your user name and password - just enter your contact info - same as the FTP site.

Economic Regions

Question

In certain Statistics Canada publications, such as 71-001, statistics are given for "economic regions" within provinces and territories. However, there is no map in this publication indicating the boundaries of the economic regions. I know I have encountered them both in print and electronic sources, but I can't recall where. Could someone please jog my memory?

Also, could someone from Statistics Canada who is reading this message please contact the appropriate Statistics Canada unit that publishes 71-001-PPB and urge that the map of the economic regions be included? Without it, data for economic regions are incomplete, if not meaningless.

Answers

1. I can tell you where the map should be available on the STC website but isn't. Specifically, it should be under the "Definitions, data sources and methods", "Geographic classifications", "Geography 2001" section:
http://www.statcan.ca/english/Subjects/Standard/sgc/2001/2001-er-classmenu.htm

This page has links to maps that should show the Economic Regions, but they instead lead to maps of the provinces and the wider Region names (not the economic regions within a province).

Since Economic Regions are a standard geographic concept, one would expect to find their visual representation under the "standards" section of the STC website.

Where you will find the ER map is under the geography reference maps of the Census.

http://geodepot.statcan.ca/Diss/Maps/ReferenceMaps/index_e.cfm

2. There is also an alternative: The Guide to the Labour Force Survey (71-543-GIE) tells you which CDs compose the economic region boundaries.

3. This is in reference to your request to have economic maps included in our monthly publication (71-001). A few years ago, the maps were included in the HTML version of the publication but we had to remove them because of a new rule/policy stating that whatever was available in the HTML version also had to be available in PDF. Since each economic region and census metropolitan area required a page and there are many regions, this would increase the paper count of our publication three-fold, increasing the cost of the paper publication.

We have found another way, however, of providing the maps. At the end of the publication, we will include a link to the Geographic regions on the website, including census metropolitan areas, economic regions and employment insurance regions. This will be available in the 'Data Quality' section of the 71-001 publication. The paper (PDF) version will have the actual address of the maps on the STC website. The HTML version will have a direct link.

We are working on it now and it should be available in 71-001 for the October 5th LFS release of September data. If not, it will be available for the next release in November.

Thursday, September 28, 2006

Pregnant Workers

Question

I've had a request for information on pregnant workers in Canada. Specifically, a count or estimate of the number of pregnant workers in any one year, and also whatever demographic information I can find, preferably at the provincial level or better.

I've found information on number of women taking maternity leave, but I've been unable to find anything that would let us estimate the number of women who work during some part of their pregnancy, either full or part-time, who don't have maternity leave. Most surveys of the general population don't seem to ask about pregnancy.

Is there anything obvious I'm missing?

Answer

We have a few surveys which might be of interest, but the vintage is a little old:

1) The Absence from Work Survey (AWS) was an annual supplement to the Labour Force Survey (LFS) from 1977 to 1998. It asked employees about work absences of at least two weeks duration due to "illness, accident or pregnancy." Detailed information on duration and type of compensation received was collected for the most recent absence. Available at:
http://www.statcan.ca/english/Dli/Data/Ftp/aws.htm

2) The Maternity Leave Survey (1985)
http://www.statcan.ca/english/Dli/Data/Ftp/mls.htm

But I also found some interesting data in the Daily release of Employment Insurance Coverage Survey (2004)- a table called Eligibility of mothers for maternity and parental benefits and duration of leave (http://www.statcan.ca/Daily/English/050622/d050622d.htm).
This will not provide you with the demographics of these females, but it will help answer your second set of questions. Unfortunately the survey does not have a PUMF and is not part of the DLI collection, but perhaps the aggregate tables will be of use to you.

Monday, September 25, 2006

Same-sex couples in B.C. who have children in school

Question

A patron is wanting statistical information on the number of same-sex couples in B.C. who have children in school (elementary or secondary).

In the Census 2001 Topic-based Tabulations>Marital Status of Canadians, the table # 21 (see below*) allows you to select a view for common law status/with partner of same sex. But the age grouping categories are only for 15 and up. Is there a similar table for the under 15 age groupings?

Is there any way to find/determine:
- the number of same-sex couples in BC who have school age children?
- the number of same-sex couples in BC who have school age children attending school?

Conversely, is there any way to find/determine:
- the number of school age children in BC living with same sex couples
- the number of school age children in BC, attending school in BC, and living with same-sex couples.

*Table 21 - Legal Marital Status (6), Common-law Status (5), Age Groups (12A), Sex (3) and Household Living Arrangements (11) for Population 15 Years and Over, for Canada, Provinces and Territories, 2001 Census - 20% Sample Data - Cat. No. 97F0004XCB2001040. This product was updated on January 19, 2004.

Answers

1. Here is some information related to your request provided by the Census Help Desk at Statistics Canada:

I have checked a few sources and there is nothing that detailed for same sex common-law partners in standard products. The only additional information in a cross classification that is available is the following table.

http://www12.statcan.ca/english/census01/products/standard/themes/RetrieveProductTable.cfm?Temporal=2001&PID=59305&APATH=0&GID=355313&METH=1&PTYPE=55496&THEME=39&FOCUS=0&AID=0&PLACENAME=0&PROVINCE=0&SEARCH=0&GC=0&GK=0&VID=0&FL=F&RL=0&FREE=0


Data can be produced as a custom tabulation for same sex couples for populations over 5000. If they are looking for the province of BC then that would meet the criteria.

Updated Products - Justice

Justice Beyond 20/20 Tables

Thanks to the Canadian Centre for Justice Statistics Division, we believe that the DLI Justice collection is now up to date as of August 2006.

Some of the data were updated and some of the tables were completely removed.

To help in the identification of terminated tables, replacement tables, etc., the Justice "readme" and web page were updated as well.

We hope this will help in identifying the proper tables to help your users.

WEB: http://www.statcan.ca/english/Dli/Data/Ftp/justice.htm
FTP: dli/justice/b2020/data

Thursday, September 21, 2006

1961 PUMF

Question

Would someone have created the system file for the 1961 PUMF?

Answer

We do not have access to the PUMF for 1961. I assume that was a reference to the basic cross-tabulations for the 1961 Census - that is all I can find on the FTP.

We'll make sure the FTP clarifies this fact by placing the documents in a BST folder or something of the sort.

Travel Activities and Motivation Survey (tams)

Question

A patron is asking for the Travel Activities and Motivation Survey (tams). He has asked specifically for the data for a 2006 survey and specifically mentioned June 2006. The DLI FTP site has a /tams/1999/ folder with the date of 8/4/06. I have not found reference to a later survey. Is 1999 the most recent tams survey?

Answer

The most recent data that exists for this survey is 2005. They are going to the Policy Release Committee in October and hope to release in late November. No exact date as of yet.

Note the use of "hope to release" - remember that these dates are guidelines and may be pushed back (read: will probably be pushed back), so advise the user!

CSGVP or NSGVP 2004

Question

Will the 2004 Survey of Volunteering, Giving and Participation be made available to DLI?

Answer

It is not out yet, but the PUMF is scheduled to be released in November.

Release of GeoSuite 2001 Downloadable Database

GeoSuite 2001 Downloadable Database

The downloadable database entitled GeoSuite, 2001 Census (Geography Products: Geographic Data Products) has been released in The Daily.

The files have been used to replace the ones on the FTP site. The new version of the database is geosuite-2001-version3.zip .

Updated Products - PEA

Public Economic Accounts

This product provides a regional perspective on Canadian economic developments. It includes separate sets of statistical tables, organized in a manner similar to those in the Income and expenditure accounts, for each of the provinces and territories, catalogue no 13-001-PPB. The focus is on each region's gross domestic product, final domestic demand, personal disposable income and government sector accounts.

All data was updated except prv.exe - it will be available in November 2006.

FTP: /dli/pea
WEB: http://www.statcan.ca/english/Dli/Data/Ftp/pea.htm

Wednesday, September 20, 2006

SAS Question

Question

I'm taking the SPSS versions of some of the LMAS files and saving them to .sas7bdat format for a graduate student; however, when I run the export, an error message comes up:

Warning # 9077
The cumulative length of the variable labels exceeds the limitations of the target file type. The excess labels will be omitted.

Is there any other way I can give her the data for use in SAS that won't have this problem? How serious is this?

Answer

One easy way to do it would be to create a .csv file from the SPSS - and I can send you a perl script that converts "most" of the SPSS syntax into SAS syntax.
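To give a sense of what converting part of the SPSS syntax could look like, here is a minimal sketch in Python (not the Perl script mentioned above, which is not reproduced here). It handles only one piece, turning an SPSS VARIABLE LABELS command into a SAS LABEL statement, and assumes each label sits on a single line inside matching quotes. The function name and the sample variables are hypothetical.

```python
import re


def spss_labels_to_sas(spss_syntax: str) -> str:
    """Turn the text of an SPSS VARIABLE LABELS command into a SAS LABEL statement.

    Assumes each entry is a variable name followed by a quoted label
    on a single line; anything else is ignored.
    """
    # Find pairs of: variable name, then a label in matching single or double quotes.
    pairs = re.findall(r"""(\w+)\s+(["'])(.*?)\2""", spss_syntax)
    lines = ["label"]
    for name, _quote, text in pairs:
        # SAS escapes a double quote inside a double-quoted string by doubling it.
        safe = text.replace('"', '""')
        lines.append(f'  {name} = "{safe}"')
    lines.append(";")
    return "\n".join(lines)


# Hypothetical SPSS syntax for illustration; not taken from the LMAS files.
example = 'VARIABLE LABELS\n  AGE "Age of respondent"\n  /SEX "Sex of respondent".'
print(spss_labels_to_sas(example))
```

The resulting LABEL statement would be placed inside a SAS DATA step that reads the .csv file. Value labels, formats and missing-value definitions would need separate handling.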

Household Internet Use Survey 2002 codes

Question

In the HIUS2002 file, the variable QUARTILE has values ranging 1 - 9 BUT in the codebook there are only definitions for 1-4. Would anyone have a clear definition for 5-9 that I could pass on to our researcher?

Answers

1. I have just run a frequency on the variable QUARTILE using "hius2002.sps" over this dataset "HIUS2002_Microdata.txt"

Here is my result:

Income Quartiles

                    Frequency   Percent   Valid Percent   Cumulative Percent
Valid  Quartile 1        8332      26.3            26.3                 26.3
       Quartile 2        8218      26.0            26.0                 52.3
       Quartile 3        7883      24.9            24.9                 77.2
       Quartile 4        7217      22.8            22.8                100.0
       Total            31650     100.0           100.0

2. That's interesting - I get only values 1-4, with the following frequencies:

Income Quartiles
Percent N Value Label
26.3 8,332 1 Quartile 1
26.0 8,218 2 Quartile 2
24.9 7,883 3 Quartile 3
22.8 7,217 4 Quartile 4

Are you sure your researcher is using the right syntax file with the right data?
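A quick way to run the same kind of check outside SPSS is sketched below in Python. It tabulates the values of a variable and lists any codes that are not in the codebook, which, as the reply above suggests, may point to a syntax file that does not match the data file. The function names are hypothetical and the sample values are made up for illustration; reading QUARTILE out of the actual file depends on column positions that are not given here.

```python
from collections import Counter

# Codes defined in the codebook for QUARTILE, per the discussion above.
CODEBOOK = {1: "Quartile 1", 2: "Quartile 2", 3: "Quartile 3", 4: "Quartile 4"}


def frequency_table(values, codebook):
    """Return (code, label, count, percent) rows sorted by code."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    for code in sorted(counts):
        label = codebook.get(code, "** not in codebook **")
        percent = round(100 * counts[code] / total, 1)
        rows.append((code, label, counts[code], percent))
    return rows


def undocumented_codes(values, codebook):
    """Return the sorted codes that appear in the data but not in the codebook."""
    return sorted(set(values) - set(codebook))


# Made-up values for illustration only; not taken from the HIUS file.
sample = [1, 2, 2, 3, 4, 4, 7]
for row in frequency_table(sample, CODEBOOK):
    print(row)
print("Undocumented codes:", undocumented_codes(sample, CODEBOOK))
```

If the list of undocumented codes is empty, the data and codebook agree on the value range; if not, the syntax file and data file are worth re-checking against each other.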