Monday, December 22, 2008

General Social Survey 21 - 2007 Release Date


A grad student here, at the University of Manitoba, has inquired about the General Social Survey Cycle 21. The DLI Product Release Dates page indicates that this was expected to be released in November 2008, but I don't see this listed on the DLI Browse the Collection on the Web page. Could you please let us know if there is a new projected release date for this survey? Thanks in advance for looking into this.


I just found out this morning that the GSS Cycle 21 won't be available before spring 2009. They are not sure of the date. They were hoping to have it released earlier, but they still have more work to do for approval by the microdata release committee, which has delayed the release. We will probably have a better idea towards the end of January 2009 as to when they will be able to release it.

We will change the date on the DLI Product Release Page.

Labour Force Survey (LFS) Variables Inquiry


I have learned that the PUMFs for the Labour Force Survey do not include "immigrant status", even though the question is now asked on the Survey.

Is there anything that could be done to have this variable added?

Immigrant status has never been included on the LFS PUMFs, even though the question has been asked since January 2006.

As presented by Jason Gilmore at the Spring 2008 EAC meeting, they are reviewing the LFS PUMF, and this is one piece of information they are considering including in it. This was strongly recommended by the EAC members and other people with whom they have consulted. I talked to Jason and he is not sure yet when the revision will be done. They are still working on alternatives for the LFS PUMF content and will certainly be giving an update at the next EAC meeting in April 2009. Jason will keep us updated on any coming decision concerning the content of the LFS PUMF.

Confirmation of understanding on Aboriginal Identity and Education


I was just looking at Table 97-560-X2006028 (Aboriginal Identity (8), Highest Certificate, Diploma or Degree (14), Major Field of Study - Classification of Instructional Programs, 2000 (14), Area of Residence (6), Age Groups (10A) and Sex (3) for the Population 15 Years and Over) with a researcher and would like to make sure we have a clear understanding of what we are seeing.

For all of Canada, 25,664,220 people were counted in the total Aboriginal and non-Aboriginal identity population, 15 years and over (excluding those in institutions). Of those, there were 6,098,330 with no certificate, diploma, or degree and 19,565,900 with a certificate, diploma, or degree. There are 2,785,420 with an apprenticeship or trades certificate or diploma. So, those 2,785,420 would be included in the 19,565,900 figure and would not be included in the 6,098,330 figure.

Is our understanding correct?


Yes, you have the right interpretation.
If you click on the link you sent in your message then click on the variable Highest certificate, diploma or degree (14), the hierarchy of this variable will be obvious with the indentation of each category within that variable.
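
The nesting confirmed above can be checked with a little arithmetic. The sketch below uses only the figures quoted in the question; the variable names are illustrative, and the comment about rounding is an assumption (published census counts are typically rounded), not something stated in the table itself.

```python
# Figures quoted in the question (Canada, population 15 years and over).
total = 25_664_220
no_certificate = 6_098_330
with_certificate = 19_565_900
trades = 2_785_420  # apprenticeship or trades certificate or diploma

# The two top-level categories should account for the total.
# Assumption: a small gap is expected because published counts are rounded.
gap = (no_certificate + with_certificate) - total

# The trades category is indented under "certificate, diploma or degree",
# so it is a subset of that figure rather than a separate group.
trades_share = trades / with_certificate

print(f"top-level sum minus total: {gap}")
print(f"trades as a share of certificate holders: {trades_share:.1%}")
```

A gap of a few units between the sum of the top-level categories and the total does not change the interpretation: the trades count sits inside the 19,565,900 figure.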

Wednesday, December 17, 2008

Agriculture-Population Linkage Data for Small Geographies


On Tuesday December 4th, The Daily included an article about the Agriculture-Population linkage data for the 2006 Census. Some of this data is available on the Census website, for larger levels of geography. A researcher at the University of Manitoba is interested in finding out if this data for small levels of geography (Consolidated Census Subdivisions) will be available through the DLI, and if so, about when we can expect this to be available. Thanks in advance for any information you might provide.


Data resulting from the Agriculture-Population linkage is, and will be, available only at the provincial and national levels.

Data from the Census of Agriculture as a whole - containing farm and farm operator data - is available down to a low level of geography (the CCS level, free via our internet site). The Ag-Pop linkage takes only those farm operators who filled in the long census of population form (20% of the population) and applies weights to make it representative of the whole farm population. Because this database is based on a 20% sample of the entire farm population, the confidentiality concerns and quality requirements do not permit distribution of the data at sub-provincial levels.

Tuesday, December 16, 2008

CRA Tax Data


I have a faculty member here at University of Guelph who has traditionally worked with subprovincial tax data for Ontario. He's noticed that the CRA website is no longer providing the data at the subprovincial level nor are they keeping (on the web) data beyond a 5 year period. As you can imagine this is an issue for any researcher looking to conduct any historical analysis.

So - my questions are: did anyone in the DLI world collect and keep this data as it was released? And for the folks at DLI - I'm not sure if this is out of your mandate - but is there any way for you to ask about the historical data? And subprovincial data?


We did check with CRA and this information is now available only through a custom tabulation from CRA under cost recovery. If you want to pursue this request, this is how you can proceed based on the response we got from CRA:

For income statistics, we have data going back to 1984. Again, most of these are text files, though some recent years are in Excel or PDF.

What years of data do you require? When you determine which years you wish to receive, you may send a request to Client Services at . They will assign you a request number, and I can provide you with a cost estimate and delivery date.

Monday, December 15, 2008

Sexual Orientation of Respondent Data from GSS18

I have a student looking for data on the Sexual Orientation of Respondent from GSS18. She is specifically looking for the data for question SOR_Q110 in Section 14 of the questionnaire. I checked the codebook and there are three questions in the Sexual Orientation of Respondent Module, but somehow I can't find any corresponding variables in the Data Dictionary. Are these questions used to derive any variables? Or have they been suppressed somehow?

This variable has been suppressed from the PUMF for confidentiality reasons as well as data quality reasons. This survey was one of the first to ask about sexual orientation, and this resulted in a low response rate for this question.

Information Technology in Higher Education


I have a grad student who's looking for data about the increased use of information and communications technology in post-secondary education, among students and instructors. I've found some numbers from the U.S., but nothing specific to Canada. CIUS focuses on home users, and surveys like HERDS deal more with R&D spending.


I have consulted with the relevant divisions and it appears that Statistics Canada does not produce the type of data you mention below.

Wednesday, December 10, 2008

Policy on Codebooks for RDC Datasets


I am part of a team looking at the National Survey of the Work and Health of Nurses (NSWHN) survey. At our initial meeting at the RDC in Toronto, we were told that the codebook (containing the data dictionary with frequencies for all variables) could not be made available to us outside the RDC.

My question is: why not?

Other RDC-available surveys make data dictionaries freely available on the Web. Could someone tell me why an exception is being made for this survey? Even just a record layout, without any associated frequencies, would be useful. It would make life so much simpler, especially for a research group scattered clear across the country.


I talked to the subject matter division, and it is not general practice to make codebooks and data dictionaries from master files available to the DLI. Only a few subject matter divisions have done so, and only after ensuring that no confidential data are released. The author division had no intention of making this codebook for the Health Nurses Survey available outside of the RDC. They may go back and look at the codebook to see if they can release it, but at present they don't have the resources to do so and don't know when they would be able to.

Most author divisions release only the PUMF codebook, which ensures that no confidential data are released. What is on our website is the PUMF codebooks; only a few exceptions have master file codebooks.

Tuesday, December 9, 2008

Looking for Educational Outcome Data


A student here is looking for data that will allow her to find educational outcomes for all members of a family, even those who are no longer living within the same household. She wants to test the theory that the highest level of education achieved by each child depends in part on the number of children within the family.

I’ve come up dry: do you know of any data that would allow her to do this study?

She has considered, and rejected, using the NLSY from the United States, since many of the children of the 1979 cohort have not yet completed their education. Would the Panel Study of Income Dynamics let her do this?


I just wanted to let you know that I did check with our Education Division on this one, just to be sure. They confirmed that the data are not available through them.

Monday, December 8, 2008

Education Statistics

I don't think I have overlooked anything obvious but you never know!

I have a faculty member who wants the number of degrees granted by field of study, by province and by gender, for as recent a year as possible. If that is not available, he can use enrolment numbers by field of study, by province and by gender, for as recent a year as possible.

CANSIM tables 477-0011 to 477-0014 do not have the field of study. 81-229xib only goes up to 2000 and does not have a detailed enough breakdown of field of study.

On the CAUT website, in their Almanac, for 2003-2004 they have degrees by field of study by gender, but only for the whole of Canada (Table 3.11). I need this by province!

CAUT gets its stats from the Centre for Education Statistics. Should I e-mail them or can DLI get them for us?? Alternatively how much would a custom tab be??

Any help greatly appreciated.

Please find the reply from the division below.

Please see my response below.

1) In CANSIM, the information is provided by field of study; however, the data are now based on the Classification of Instructional Programs (CIP), and the primary categories are available.
This is available in Table 477-0014 University degrees, diplomas and certificates granted, by program level, Classification of Instructional Programs, Primary Grouping (CIP_PG) and sex, annual (number)

2) If further detail is required on the CIP, it is available for 2, 4 and 6 digit level from 1992 to 2004 on a cost recovery basis only. The following link provides information on the classification system used.

To provide a custom table by CIP, by province, by gender for 1 year of data would be $165 with each additional year costing $15.

Caution must be used however if a historical comparison is being done as it cannot be done at the 4 or 6 digit level. As about 40% of the institutions still report using the old system, data cannot be used at this detailed level for historical analysis. If historical analysis is done, it must not be done lower than 2 digit level.
2 digit CIP is still more detailed than CANSIM. It contains the following groups:

Field of study (2 digit) Code
Agriculture, Agriculture Operations and Related Sciences 01
Natural Resources and Conservation 03
Architecture and Related Services 04
Area, Ethnic, Cultural and Gender Studies 05
Communication, Journalism and Related Programs 09
Communications Technologies/Technicians and Support Services 10
Computer and Information Sciences and Support Services 11
Education 13
Engineering 14
Engineering Technologies/Technicians 15
Aboriginal and Foreign Languages, Literatures and Linguistics 16
Family and Consumer Sciences/Human Sciences 19
Legal Professions and Studies 22
English Language and Literature/Letters 23
Liberal Arts and Sciences, General Studies and Humanities 24
Library Science 25
Biological and Biomedical Sciences 26
Mathematics and Statistics 27
Military Technologies 29
Multidisciplinary/Interdisciplinary Studies 30
Parks, Recreation, Leisure and Fitness Studies 31
Basic Skills 32
Leisure and Recreational Activities 36
Personal Awareness and Self-improvement 37
Philosophy and Religious Studies 38
Theology and Religious Vocations 39
Physical Sciences 40
Psychology 42
Security and Protective Services 43
Public Administration and Social Service Professions 44
Social Sciences 45
Mechanic and Repair Technologies/Technicians 47
Precision Production 48
Transportation and Materials Moving 49
Visual and Performing Arts 50
Health Professions and Related Clinical Sciences 51
Business, Management, Marketing and Related Support Services 52
History 54
French Language and Literature/Letters 55
Dental, Medical and Veterinary Residency Programs 60
Other instructional program 89

If a custom table is required, turn-around time for delivery is 5-10 working days and pre-payment is required.
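
The pricing quoted above for a custom table ($165 for one year of data, $15 for each additional year) can be turned into a quick estimate. This is a minimal sketch: the function name `custom_table_cost` is hypothetical, and it assumes the quoted rates apply linearly with no taxes or other fees, which the reply does not confirm.

```python
def custom_table_cost(years: int, first_year: int = 165, additional_year: int = 15) -> int:
    """Estimate the cost in dollars of a custom CIP table covering `years` years.

    Assumption: the quoted rates apply linearly, with no taxes or other fees.
    """
    if years < 1:
        raise ValueError("years must be at least 1")
    return first_year + additional_year * (years - 1)


# One year of data, and the full 1992 to 2004 range (13 years inclusive).
print(custom_table_cost(1))
print(custom_table_cost(13))
```

Any actual quote would still need to come from the division, since the estimate only restates the figures in the reply.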

Data on driving, or car ownership and recycling or energy saving programs


This request for help came while I was at ACCOLEDS. I wonder if anyone has any ideas of StatCan sources that might be helpful for the grad student. (The student’s paper is due imminently.)

My thesis is the following: Is there a relationship between environmentally harmful behaviours (e.g. driving) and participation in energy savings programs? In other words, I am asking the question: Do people feel guilty for driving or other polluting activities and try to do something to make up for it?

Instead of energy saving programs, this could be replaced by "recycling programs." There is some helpful information in the SDA tool, which I am comfortable with. I am trying to determine if there may be any other studies by Stats Can on areas relating to driving, car ownership, recycling programs, or energy saving programs (energy audits), etc.


With respect to my previous email, I expect the student is using the Households and the Environment Survey, and I have sent him the link to publication # 16-201-X, Human Activity and the Environment: Annual Statistics. This publication lists a number of related surveys and publications.

If you can suggest any other surveys or sources that might address his research question (described in attached email), it would be helpful.


Your student might also look at the National Private Vehicle Use Survey.

According to the abstract:
The specific objectives of the survey are:
- provide national estimates of annual fuel use for personal-use vehicles (includes passenger cars/light trucks and vans);
- provide national estimates of total distance driven;
- identify the main factors in the purchase of a vehicle;
- identify in a general fashion how households use their vehicles;
- develop driver profiles by sex, age, marital status, income, education and occupational
- and develop vehicle profiles by vehicle make, model year, number of cylinders, transmission type, and presence or absence of air conditioning

GDP of the UK


I have a researcher who is looking for data for the last thirty years on the GDP of the UK as well as real interest rates over the same time period. I doubt that this will surprise those of you who know me but I can not, for the life of me, find this data. Can I ask for some assistance here?


Did you check the GDP Time Series Data at the UK National Statistics site? Seems to go back to about 1948.

Agriculture Question


I have a researcher looking for estimated areas, yields and productions of principal field crops, vegetable crops, fruit crops, and specialty crops in metric units by Census Division and Census Consolidated Subdivision from 1981 to 2000.

As he needs it annually, the Census of Agriculture won’t work and because of the geography, the CANSIM tables I found don’t cut it either.

Any thoughts or perhaps Agriculture Division could provide a custom tab?


Respondent #1: This may be a bit of a stretch even as a custom tab as the survey data will not support small area data for very many commodities. In some cases, the provincial agriculture statisticians take the STC sample data and do their own analysis.
Here is the Ontario site:

This one is quite good. There are similar sites in most provinces.


I have looked in each province and have found a rather uneven small area profile. The census does not carry yield and production but will have the area planted. The agriculture division may have small area data in unpublished form, but it may not be very clean. Our team can find that out for you.

Here is what I found out looking province by province.
If it is not on their web site, they may still have it in an Excel spreadsheet.




This may not go back far enough but they do have the data.

BC: not very good but it has the contact info



NB: limited

NS: does not go back very far but they may have it


NF-Lab: limited

Respondent #2 : I had forwarded your question to our Agriculture Division and they confirmed that this information is not available. Here is their response:

"Unfortunately, for yields and specialty crops, the smallest level of geography that we have data for is crop district. For fruit and vegetable crops, we don't have any data other than at the province level."

Respondent #1: The reply from the Agriculture Division suggests that crop district level data are available for some crops, no doubt the major ones. I suspect that this would be useful when trying to look at sub-provincial data. The links that I sent earlier are probably all at the crop district level. CCSD data (area) are only available during a census year and, for non-census years, would have to be estimated using census data to disaggregate crop district data. Or the researcher would have to reformulate the research question to look at broader areas.

Friday, December 5, 2008

Survey of labour and income dynamics and Aboriginal Peoples or ethnicity

The Survey of Labour and Income Dynamics appears to collect information on ethnicity and Aboriginal Peoples, but this information does not seem to be in the PUMF. Could someone confirm if this is a confidentiality issue, and if so, would this aspect of SLID be available in the RDCs? Thanks in advance for any information on this matter.


Here is the response from Income Division

We have a few variables on ethnicity and Aboriginal peoples available from the Survey of Labour and Income Dynamics. Due to confidentiality reasons, these variables are not available in the PUMFs. They are available in the SLID data master file in the Statistics Canada Research Data Centers. Your client can consult the SLID Data Dictionary (75F0026-XWE) to obtain more information on these variables (for example, abortg15 and etho1c15). Please note that the sample size is small for these variables so there is limited data available for cross-tabulations.

Licensing Question

Note: Correspondence translated from the original French text


I am unable to access the "Licensing: Questions and Answers" section of the DLI site right now (like a lot of other pages elsewhere on the StatsCan site), where I believe I saw a similar question. My question is this: a nurse from a hospital affiliated with Laval University wants to access the data on smoking habits (1977, 1979, 1981, 1983, 1986). She is a professor affiliated with Laval University.


The professor can have access to the data, but others cannot if they are neither affiliated with the university nor professors. The professor can access the data but cannot make them available to anyone who is not a student, professor or member of the DLI.

Wednesday, December 3, 2008

StatCan Webview


I've been trying to create tables on the Nesstar interface for DLI data and get a page with: DLIB Access Control. This asks for username or password. Is this what is expected? Shouldn't you use the DLI institutions' IP ranges, etc. (same model as E-STAT)

Each Nesstar tabulation user is assigned a unique userid and password.

I will be sending you a userid and password in a separate e-mail.

GSS 1985 data coding error


A grad student (and her prof) are using the 1985 GSS and are reporting a coding error. The student is using the SDA platform, but I don't expect that makes a difference. The alleged error is as follows:

"Just wanted to let you know that a student has found a variable that has been improperly coded. IN the 1985 GSS the age variable, dvagegr, has two 30-34 age categories (one should be 25-39)."


I checked the SPSS and record layout on the DLI FTP site; the variable seems to be coded correctly.

01 "15-19 YEARS"
02 "20-24 YEARS"
03 "25-29 YEARS" <---
04 "30-34 YEARS" <---
05 "35-39 YEARS"
06 "40-44 YEARS"
07 "45-49 YEARS"
08 "50-54 YEARS"
09 "55-59 YEARS"
10 "60-64 YEARS"
11 "65-69 YEARS"
12 "70-74 YEARS"
13 "75-79 YEARS"