Monday, December 22, 2008

General Social Survey 21 - 2007 Release Date


A grad student here, at the University of Manitoba, has inquired about the General Social Survey Cycle 21. The DLI Product Release Dates page indicates that this was expected to be released in November 2008, but I don't see this listed on the DLI Browse the Collection on the Web page. Could you please let us know if there is a new projected release date for this survey? Thanks in advance for looking into this.


I just found out this morning that the GSS Cycle 21 won't be available before spring 2009. They are not sure of the date. They were hoping to release it sooner, but they still have more work to do for approval by the microdata release committee, so the release has been delayed. We will probably have a better idea towards the end of January 2009 as to when they will be able to release it.

We will change the date on the DLI Product Release Page.

Labour Force Survey (LFS) Variables Inquiry


I have learned that the PUMFs for the Labour Force Survey do not include "immigrant status", even though the question is now asked on the Survey.

Is there anything that could be done to have this variable added?

Immigrant status has never been included on the LFS PUMFs, even though the question has been asked since January 2006.

As presented by Jason Gilmore at the Spring 2008 EAC meeting, they are reviewing the LFS PUMF, and this is one piece of information they are considering including. This was strongly recommended by the EAC members and other people with whom they have consulted. I talked to Jason and he is not sure yet when the revision will be done. They are still working on alternatives for the LFS PUMF content and will certainly be giving an update at the next EAC meeting in April 2009. Jason will keep us updated on any coming decision concerning the content of the LFS PUMF.

Confirmation of understanding on Aboriginal Identity and Education


I was just looking at Table 97-560-X2006028 (Aboriginal Identity (8), Highest Certificate, Diploma or Degree (14), Major Field of Study - Classification of Instructional Programs, 2000 (14), Area of Residence (6), Age Groups (10A) and Sex (3) for the Population 15 Years and Over) with a researcher and would like to make sure we have a clear understanding of what we are seeing.

For all of Canada, there are 25,664,220 people of Aboriginal and non-Aboriginal identity population, 15 years and over (excluding those in institutions) who were counted. Of those, there were 6,098,330 with no certificate, diploma, or degree and 19,565,900 with a certificate, diploma, or degree. There are 2,785,420 with an apprenticeship or trades certificate or diploma. So, those 2,785,420 would be included in the 19,565,900 figure and would not be included in the 6,098,330 figure.

Is our understanding correct?


Yes, you have the right interpretation.
If you click on the link you sent in your message then click on the variable Highest certificate, diploma or degree (14), the hierarchy of this variable will be obvious with the indentation of each category within that variable.
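
The arithmetic behind this reading can be checked directly. The sketch below is a minimal Python illustration that uses only the figures quoted in the question; the variable names are ours. The two top-level categories add up to slightly more than the published total, which would be consistent with the random rounding applied to census counts (that explanation is our assumption; the table notes are the authority).

```python
# Figures quoted in the question (Canada, population 15 years and over).
total = 25_664_220
no_certificate = 6_098_330
with_certificate = 19_565_900
apprenticeship_or_trades = 2_785_420  # a sub-category of with_certificate

# The two top-level categories should account for the whole population,
# up to any rounding applied to the published counts.
discrepancy = (no_certificate + with_certificate) - total

# Share of certificate holders whose highest credential is apprenticeship/trades.
trades_share = apprenticeship_or_trades / with_certificate

print(f"Top-level categories minus total: {discrepancy}")
print(f"Trades share of certificate holders: {trades_share:.1%}")
```

The trades figure is never added to the top-level sum because it already sits inside the "with a certificate, diploma, or degree" count.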

Wednesday, December 17, 2008

Agriculture-Population Linkage Data for Small Geographies


On Tuesday December 4th, The Daily included an article about the Agriculture-Population linkage data for the 2006 Census. Some of these data are available on the Census website for larger levels of geography. A researcher at the University of Manitoba is interested in finding out whether these data will be available through the DLI for small levels of geography (Census Consolidated Subdivisions), and if so, about when we can expect them to be available. Thanks in advance for any information you might provide.


Data resulting from the Agriculture-Population linkage are, and will continue to be, available only at the provincial and national levels.

Data from the Census of Agriculture as a whole - containing farm and farm operator data - are available down to a low level of geography (the CCS level, free via our internet site). The Ag-Pop linkage takes only those farm operators who filled in the long census of population form (20% of the population) and applies weights to make it representative of the whole farm population. Because this database is based on a 20% sample of the entire farm population, confidentiality concerns and quality requirements do not permit distribution of the data at sub-provincial levels.
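
To see why a 20% sample becomes unreliable for small areas, here is a rough Python sketch. It assumes a simplified design in which every farm operator is sampled independently with probability 0.2 and given a weight of 5; the actual Ag-Pop weighting is more involved, and the function name is ours.

```python
import math

def cv_of_weighted_count(true_count: int, sampling_fraction: float = 0.2) -> float:
    """Approximate coefficient of variation of an estimated count when each
    unit is sampled independently with probability `sampling_fraction`
    and weighted by 1 / sampling_fraction (a simplifying assumption)."""
    f = sampling_fraction
    # Variance of the weighted estimate under this design: N * (1 - f) / f
    variance = true_count * (1 - f) / f
    return math.sqrt(variance) / true_count

# Hypothetical population sizes, from a small area up to a province.
for n in (50, 500, 50_000):
    print(n, round(cv_of_weighted_count(n), 3))
```

Under these assumptions the relative error shrinks with the square root of the population, so estimates for a small CCS are much noisier than provincial ones.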

Tuesday, December 16, 2008

CRA Tax Data


I have a faculty member here at University of Guelph who has traditionally worked with subprovincial tax data for Ontario. He's noticed that the CRA website is no longer providing the data at the subprovincial level nor are they keeping (on the web) data beyond a 5 year period. As you can imagine this is an issue for any researcher looking to conduct any historical analysis.

So - my questions are: did anyone in the DLI world collect and keep this data as it was released? And for the folks at DLI - I'm not sure if this is out of your mandate - but is there any way for you to ask about the historical data? And subprovincial data?


We did check with CRA and this information is now available only through a custom tabulation from CRA under cost recovery. If you want to pursue this request, this is how you can proceed based on the response we got from CRA:

For income statistics, we have data going back to 1984. Again, most of these are text files, though some recent years are in Excel or PDF.

What years of data do you require? When you determine which years you wish to receive, you may send a request to Client Services at . They will assign you a request number, and I can provide you with a cost estimate and delivery date.

Monday, December 15, 2008

Sexual Orientation of Respondent Data from GSS18

I have a student looking for data on the Sexual Orientation of Respondent from GSS18. She is specifically looking for the data from question SOR_Q110 in Section 14 of the questionnaire. I checked the codebook and there are three questions in the Sexual Orientation of Respondent Module, but somehow I can't find any corresponding variables in the Data Dictionary. Are these questions used to derive any variables? Or have they been suppressed somehow?

This variable has been suppressed from the PUMF for confidentiality reasons as well as data quality reasons. This survey was one of the first to ask about sexual orientation, and this resulted in a low response rate for this question.

Information Technology in Higher Education


I have a grad student who's looking for data about the increased use of information and communications technology in post-secondary education, among students and instructors. I've found some numbers from the U.S., but nothing specific to Canada. CIUS focuses on home users, and surveys like HERDS deal more with R&D spending.


I have consulted with the relevant divisions and it appears that Statistics Canada does not produce the type of data you mention below.

Wednesday, December 10, 2008

Policy on Codebooks for RDC Datasets


I am part of a team looking at the National Survey of the Work and Health of Nurses (NSWHN) survey. At our initial meeting at the RDC in Toronto, we were told that the codebook (containing the data dictionary with frequencies for all variables) could not be made available to us outside the RDC.

My question is: why not?

Other RDC-available surveys make data dictionaries freely available on the Web. Could someone tell me why an exception is being made for this survey? Even just a record layout, without any associated frequencies, would be useful. It would make life so much simpler, especially for a research group scattered clear across the country.


I talked to the subject matter division, and it is not general practice to make codebooks and data dictionaries from master files available to DLI. Only a few subject matter divisions have done so, and only after ensuring that no confidential data are released. The author division had no intention of making the codebook for the National Survey of the Work and Health of Nurses available outside of the RDC. They may go back and look at the codebook to see if they can release it, but at present they don't have the resources to do so and don't know when they would be able to.

Most author divisions release only the PUMF codebook, which ensures that no confidential data are released. What is on our website are the PUMF codebooks; only a few exceptions have master file codebooks.

Tuesday, December 9, 2008

Looking for Educational Outcome Data


A student here is looking for data that will allow her to find educational outcomes for all members of a family, even those who are no longer living within the same household. She wants to test the theory that the highest level of education achieved by each child depends in part on the number of children within the family.

I’ve come up dry: do you know of any data that would allow her to do this study?

She has considered, and rejected, using the NLSY from the United States, since many of the children of the 1979 cohort have not yet completed their education. Would the Panel Study of Income Dynamics let her do this?


I just wanted to let you know that I did check with our Education Division on this one, just to be sure. They confirmed that the data are not available through them.

Monday, December 8, 2008

Education Statistics

I don't think I have overlooked anything obvious but you never know!

I have a faculty member who wants the number of degrees granted by field of study, by province, by gender, for as recent a year as possible. If that is not available, he can use enrolment numbers by field of study, by province, by gender, for as recent a year as possible.

CANSIM tables 477-0011 to 477-0014 do not have the field of study. 81-229xib only goes up to 2000 and does not have a detailed enough breakdown of field of study.

On the CAUT website, in their Almanac, for 2003-2004 they have degrees by field of study by gender but only for the whole of Canada (Table 3.11) I need this by province!

CAUT gets its stats from the Centre for Education Statistics. Should I e-mail them or can DLI get them for us?? Alternatively how much would a custom tab be??

Any help greatly appreciated.

Please find the reply from the division below.

Please see my response below.

1) In CANSIM, the information is provided by field of study; however, the data are now based on the Classification of Instructional Programs (CIP), and the primary categories are available.
This is available in Table 477-0014 University degrees, diplomas and certificates granted, by program level, Classification of Instructional Programs, Primary Grouping (CIP_PG) and sex, annual (number)

2) If further detail is required on the CIP, it is available for 2, 4 and 6 digit level from 1992 to 2004 on a cost recovery basis only. The following link provides information on the classification system used.

To provide a custom table by CIP, by province, by gender for 1 year of data would be $165 with each additional year costing $15.
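
As a quick way to budget for a multi-year request, the quoted pricing can be written as a one-line formula. This Python sketch assumes the quote scales linearly exactly as stated ($165 for the first year, $15 for each additional year); the function name is ours, and the division's actual estimate is what counts.

```python
def custom_table_cost(years: int, first_year: int = 165, additional_year: int = 15) -> int:
    """Estimated cost in dollars of a custom CIP table covering `years` years,
    assuming the quoted rates apply without discounts or surcharges."""
    if years < 1:
        raise ValueError("at least one year of data is required")
    return first_year + additional_year * (years - 1)

# One year, five years, and the full 1992 to 2004 span (13 years).
for n in (1, 5, 13):
    print(n, custom_table_cost(n))
```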

Caution must be used however if a historical comparison is being done as it cannot be done at the 4 or 6 digit level. As about 40% of the institutions still report using the old system, data cannot be used at this detailed level for historical analysis. If historical analysis is done, it must not be done lower than 2 digit level.
2 digit CIP is still more detailed than CANSIM. It contains the following groups:

Field of study (2 digit) Code
Agriculture, Agriculture Operations and Related Sciences 01
Natural Resources and Conservation 03
Architecture and Related Services 04
Area, Ethnic, Cultural and Gender Studies 05
Communication, Journalism and Related Programs 09
Communications Technologies/Technicians and Support Services 10
Computer and Information Sciences and Support Services 11
Education 13
Engineering 14
Engineering Technologies/Technicians 15
Aboriginal and Foreign Languages, Literatures and Linguistics 16
Family and Consumer Sciences/Human Sciences 19
Legal Professions and Studies 22
English Language and Literature/Letters 23
Liberal Arts and Sciences, General Studies and Humanities 24
Library Science 25
Biological and Biomedical Sciences 26
Mathematics and Statistics 27
Military Technologies 29
Multidisciplinary/Interdisciplinary Studies 30
Parks, Recreation, Leisure and Fitness Studies 31
Basic Skills 32
Leisure and Recreational Activities 36
Personal Awareness and Self-improvement 37
Philosophy and Religious Studies 38
Theology and Religious Vocations 39
Physical Sciences 40
Psychology 42
Security and Protective Services 43
Public Administration and Social Service Professions 44
Social Sciences 45
Mechanic and Repair Technologies/Technicians 47
Precision Production 48
Transportation and Materials Moving 49
Visual and Performing Arts 50
Health Professions and Related Clinical Sciences 51
Business, Management, Marketing and Related Support Services 52
History 54
French Language and Literature/Letters 55
Dental, Medical and Veterinary Residency Programs 60
Other instructional program 89
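
For anyone who does receive 4 or 6 digit data and needs to collapse it to the 2 digit level for historical comparison, a roll-up might look like the Python sketch below. It assumes codes are written in the usual CIP form (for example "52.0301" or "52.03"); the counts are hypothetical and the function names are ours.

```python
from collections import Counter

def cip_two_digit(code: str) -> str:
    """Return the 2-digit series of a CIP code such as '52.0301' or '52.03'."""
    return code.strip().split(".")[0].zfill(2)

def roll_up(counts_by_cip: dict[str, int]) -> dict[str, int]:
    """Sum counts reported at the 4- or 6-digit level up to the 2-digit level."""
    totals = Counter()
    for code, count in counts_by_cip.items():
        totals[cip_two_digit(code)] += count
    return dict(totals)

# Hypothetical counts, for illustration only (not real data).
example = {"52.0301": 120, "52.0801": 80, "14.0801": 45, "14.09": 30}
print(roll_up(example))
```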

If a custom table is required, turn-around time for delivery is 5-10 working days and pre-payment is required.

Data on driving, or car ownership and recycling or energy saving programs


This request for help came while I was at ACCOLEDS. I wonder if anyone has any ideas of StatCan sources that might be helpful for the grad student. (The student’s paper is due imminently.)

My thesis is the following: Is there a relationship between environmentally harmful behaviours (e.g. driving) and participation in energy savings programs? In other words, I am asking the question: Do people feel guilty for driving or other polluting activities and try to do something to make up for it?

Instead of energy saving programs, this could be replaced by "recycling programs." There is some helpful information in the SDA tool which I am comfortable with. I am trying to determine if there may be any other studies by Stats Can on areas relating driving, car ownership, recycling programs, or energy saving programs (energy audits), etc.


With respect to my previous email, I expect the student is using the Households and the Environment Survey, and I have sent him the link to publication 16-201-X, Human Activity and the Environment: Annual Statistics. This publication lists a number of related surveys and publications.

If you can suggest any other surveys or sources that might address his research question (described in attached email), it would be helpful.


Your student might also look at the National Private Vehicle Use Survey.

According to the abstract:
The specific objectives of the survey are:
- provide national estimates of annual fuel use for personal-use vehicles (includes passenger cars/light trucks and vans);
- provide national estimates of total distance driven;
- identify the main factors in the purchase of a vehicle;
- identify in a general fashion how households use their vehicles;
- develop driver profiles by sex, age, marital status, income, education and occupational
- and develop vehicle profiles by vehicle make, model year, number of cylinders, transmission type, and presence or absence of air conditioning

GDP of the UK


I have a researcher who is looking for data for the last thirty years on the GDP of the UK as well as real interest rates over the same time period. I doubt that this will surprise those of you who know me but I can not, for the life of me, find this data. Can I ask for some assistance here?


Did you check the GDP Time Series Data at the UK National Statistics site? Seems to go back to about 1948.

Agriculture Question


I have a researcher looking for estimated areas, yields and productions of principal field crops, vegetable crops, fruit crops, and specialty crops in metric units by Census Division and Census Consolidated Subdivision from 1981 to 2000.

As he needs it annually, the Census of Agriculture won’t work and because of the geography, the CANSIM tables I found don’t cut it either.

Any thoughts or perhaps Agriculture Division could provide a custom tab?


Respondent #1: This may be a bit of a stretch even as a custom tab as the survey data will not support small area data for very many commodities. In some cases, the provincial agriculture statisticians take the STC sample data and do their own analysis.
Here is the Ontario site:

This one is quite good. There are similar sites in most provinces.


I have looked in each province and have found a rather uneven small area profile. The census does not carry yield and production but will have the area planted. The agriculture division may have small area data in unpublished form, but it may not be very clean. Our team can find that out for you.

Here is what I found looking province by province.
If it is not on their web site, they may still have it in an Excel spreadsheet.




This may not go back far enough but they do have the data.

BC: not very good but it has the contact info



NB: limited

NS: does not go back very far but they may have it


NF-Lab limited

Respondent #2 : I had forwarded your question to our Agriculture Division and they confirmed that this information is not available. Here is their response:

"Unfortunately, for yields and specialty crops, the smallest level of geography that we have data for is crop district. For fruit and vegetable crops, we don't have any data other than at the province level."

Respondent #1: The reply from the Agriculture Division suggests that crop district level data are available for some crops, no doubt the major ones. I suspect that this would be useful when trying to look at sub-provincial data. The links that I sent earlier are probably all at the crop district level. CCS data (area) are only available for census years; for non-census years, they would have to be estimated by using census data to disaggregate the crop district data. Otherwise the researcher would have to reformulate the research question to look at broader areas.
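
One simple way to do the disaggregation mentioned above is to split each crop district figure across its CCSs in proportion to the area reported in the nearest census year. The Python sketch below shows only that proportional method; the CCS names and numbers are hypothetical, and the approach assumes the within-district pattern did not change between the census and the survey year.

```python
def disaggregate(district_total: float, census_area_by_ccs: dict[str, float]) -> dict[str, float]:
    """Split a crop district total across its CCSs in proportion to
    census-year crop area (a strong assumption in non-census years)."""
    census_total = sum(census_area_by_ccs.values())
    if census_total <= 0:
        raise ValueError("census areas must sum to a positive number")
    return {
        ccs: district_total * area / census_total
        for ccs, area in census_area_by_ccs.items()
    }

# Hypothetical census-year wheat area (hectares) for two CCSs in one crop district.
census_area = {"CCS A": 3000.0, "CCS B": 1000.0}

# Hypothetical survey estimate for the whole crop district in a non-census year.
print(disaggregate(5000.0, census_area))
```

Estimates produced this way inherit both the survey error at the crop district level and any drift in the census shares, so they are best treated as rough approximations.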

Friday, December 5, 2008

Survey of labour and income dynamics and Aboriginal Peoples or ethnicity

The Survey of Labour and Income Dynamics appears to collect information on ethnicity and Aboriginal Peoples, but this information does not seem to be in the PUMF. Could someone confirm if this is a confidentiality issue, and if so, would this aspect of SLID be available in the RDCs? Thanks in advance for any information on this matter.


Here is the response from Income Division

We have a few variables on ethnicity and Aboriginal peoples available from the Survey of Labour and Income Dynamics. Due to confidentiality reasons, these variables are not available in the PUMFs. They are available in the SLID data master file in the Statistics Canada Research Data Centers. Your client can consult the SLID Data Dictionary (75F0026-XWE) to obtain more information on these variables (for example, abortg15 and etho1c15). Please note that the sample size is small for these variables so there is limited data available for cross-tabulations.

Licensing Question

Note: Correspondence translated from the original French text


I am unable to access the "Licensing: Questions and Answers" section of the DLI site right now (like a lot of other pages elsewhere on the StatsCan site), where I believe I saw a similar question. My question is this: a nurse from a hospital affiliated with Laval University wants to access the data on smoking habits (1977, 1979, 1981, 1983, 1986). She is a professor affiliated with Laval University.


The professor can have access to the data, but others cannot if they are not affiliated with the university as students or professors. The professor can access the data but cannot make them available to others who are not students, professors or members of the DLI.

Wednesday, December 3, 2008

StatCan Webview


I've been trying to create tables on the Nesstar interface for DLI data and get a page with: DLIB Access Control. This asks for username or password. Is this what is expected? Shouldn't you use the DLI institutions' IP ranges, etc. (same model as E-STAT)

Each Nesstar tabulation user is assigned a unique userid and password.

I will be sending you a userid and password in a separate e-mail.

GSS 1985 data coding error


A grad student (and her prof) are using the 1985 GSS and are reporting a coding error. The student is using the SDA platform, but I don't expect that makes a difference. The alleged error is as follows:

"Just wanted to let you know that a student has found a variable that has been improperly coded. In the 1985 GSS the age variable, dvagegr, has two 30-34 age categories (one should be 25-29)."


I checked the SPSS syntax and record layout on the DLI FTP site, and the variable seems to be coded correctly.

01 "15-19 YEARS"
02 "20-24 YEARS"
03 "25-29 YEARS" <---
04 "30-34 YEARS" <---
05 "35-39 YEARS"
06 "40-44 YEARS"
07 "45-49 YEARS"
08 "50-54 YEARS"
09 "55-59 YEARS"
10 "60-64 YEARS"
11 "65-69 YEARS"
12 "70-74 YEARS"
13 "75-79 YEARS"
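
A check like this can be automated for any set of value labels. The Python sketch below uses the labels listed above and looks for any label that appears more than once; the second dictionary is a hypothetical reconstruction of the error as the student described it, not something taken from the SDA file.

```python
from collections import Counter

# Value labels for dvagegr as listed in the SPSS syntax above.
value_labels = {
    1: "15-19 YEARS", 2: "20-24 YEARS", 3: "25-29 YEARS", 4: "30-34 YEARS",
    5: "35-39 YEARS", 6: "40-44 YEARS", 7: "45-49 YEARS", 8: "50-54 YEARS",
    9: "55-59 YEARS", 10: "60-64 YEARS", 11: "65-69 YEARS", 12: "70-74 YEARS",
    13: "75-79 YEARS",
}

def duplicated_labels(labels: dict[int, str]) -> list[str]:
    """Return every label text that is attached to more than one code."""
    counts = Counter(labels.values())
    return sorted(label for label, n in counts.items() if n > 1)

# Hypothetical version of the reported problem: code 3 mislabelled as 30-34.
reported_problem = {**value_labels, 3: "30-34 YEARS"}

print(duplicated_labels(value_labels))
print(duplicated_labels(reported_problem))
```

If the FTP copy passes and the SDA copy does not, the problem would lie in how the labels were loaded into SDA rather than in the data file itself.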

Friday, November 28, 2008

Vehicle Kilometres Question


I am dealing with another research request for motor vehicle information, this time for recent vehicle-kilometres data. There is a table showing estimates by province in the Canadian Vehicle Survey: Annual, 53-223-X.

The researcher is looking for estimates of vehicle-kilometres driven in three major metropolitan areas, Montreal, Toronto and Vancouver. Is there any hope or possibility of getting something at this level?


Transportation Division can provide this data for a fee. Please let me know which years you need so they can provide you with a cost estimate.

Need Aboriginal Population Profile From 1991 and 1996


Is it possible to obtain the Aboriginal Population Profile for Canada, Provinces and Territories, Census Divisions and Census Subdivisions for both the 1996 and 1991 Censuses?

We were successful with this same request some time ago when we needed the 2001 data, and we are keeping our fingers crossed about this request.


According to my Census consultant, for the 1991 and 1996 censuses there are no "cumulative profiles" as there are now for the Aboriginal population. However, she has suggested the following print and CD-ROM products instead:

94-325 Profile of Canada's Aboriginal Population
This publication presents a statistical overview of each Aboriginal group in comparison with the non-Aboriginal population. A wide range of demographic and socio-economic variables is displayed by individual entries and grouped under six main headings. This publication also provides a profile of demographic and socio-economic characteristics for the population with Aboriginal origins and/or Indian Registration, for Canada, provinces and territories. It is based on 20% sample data collected by the 1991 Census of Canada.

94-326 Canada's Aboriginal Population by Census Subdivision
This publication gives the population of each Aboriginal group by registration and band membership status.

94-327 Aboriginal Data - Age and Sex
This publication provides age and sex distributions for the 1991 population reporting Aboriginal origin, and the 1991 population identifying with their Aboriginal origins and/or who are registered under the Indian Act, for Canada, the provinces and territories, and selected census metropolitan areas. The data presented in this publication are taken from two sources: the 1991 Census and the 1991 Aboriginal Peoples Survey (APS).

These products were only available in print format and should be available in your library.

However, for 1996 there was a CD-ROM product:

94F0011XCB1996000 Portrait of Aboriginal Population in Canada, 1996 Census (20% Sample Data)
This CD-ROM provides a portrait of aboriginal population in Canada. This product is part of the Dimensions Series which provides census statistical information on topics of public interest.

It is available from the DLI FTP site:

For data at lower levels of geography you will need to investigate the Basic Summary Tabulations for those years or you may need to consider a semi-custom or custom tabulation. If you want to pursue the latter option, please let me know.

CCHS 4.1 and patient satisfaction and utilization questions by health region


My question: Will a user be able to obtain data on patient satisfaction and utilization questions by health region from CCHS 4.1?
We are hopeful because the sample size has grown to 65,000 (this was a problem two years ago).

See: Sampling This is a sample survey with a cross-sectional design.

To provide reliable estimates to the 121 health regions (HRs), a sample of 65,000 respondents is required on an annual basis. A multi-stage sample allocation strategy gives relatively equal importance to the HRs and the provinces. In the first step, the sample is allocated among the provinces according to the size of their respective populations and the number of HRs they contained. Each province's sample is then allocated among its HRs proportionally to the square root of the population in each HR. (excerpt from:
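
The square-root step in the excerpt above is easy to illustrate. This Python sketch covers only the second stage (splitting one province's sample among its health regions); the populations and sample size are hypothetical, the function name is ours, and rounding to whole respondents is left out.

```python
import math

def allocate_sqrt(province_sample: int, hr_populations: dict[str, int]) -> dict[str, float]:
    """Allocate a province's sample among its health regions in proportion
    to the square root of each region's population."""
    roots = {hr: math.sqrt(pop) for hr, pop in hr_populations.items()}
    total_root = sum(roots.values())
    return {hr: province_sample * root / total_root for hr, root in roots.items()}

# Hypothetical province: one region has four times the population of the other.
populations = {"HR 1": 40_000, "HR 2": 10_000}
print(allocate_sqrt(3_000, populations))
```

Because of the square root, a region with four times the population receives only twice the sample, which is how the design protects estimates for smaller health regions.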

However, after looking at the questionnaire we are cautiously optimistic:
Optional Content section, specifically page 268, Appendix 2, the patient satisfaction topics are only covered off by a few jurisdictions (at least according to this table).

CCHS 3.1 provided patient satisfaction data by province, and every province participated which may be why there is confusion now.


Here is the response from the author division regarding your question on patient satisfaction and utilization questions by health region from CCHS 2007:

"Patient satisfaction (PAS): Was asked of a sub-sample of respondents in the 10 provinces. Selected as optional by Yukon and the Northwest Territories. Therefore, only provincial/territorial estimates are possible. However, this module was also selected as optional module by Ontario and Saskatchewan. Health region (HR) level estimates are possible for these 2 provinces only, where sample size allows.

Utilisation: if you are referring to the Access to health care services module (ACC): this was also asked of a subsample in the 10 provinces only. Not asked in, and not selected by, any of the Territories. Selected as optional module by New Brunswick and Ontario. HR level estimates possible for these 2 provinces only, where sample size allows.

Utilisation: if you are referring to the Health Care Utilization module (HCU): This module is part of Core content and therefore asked of everybody. HR level estimates are then possible, of course where sample size allows."

I hope this answers your question. Please note that CCHS 2007 is effectively CCHS 4.1, but they have dropped the Cycle number from the name in favour of the year.

Synthetic Files for the CCHS


This question is in regards to the new synthetic files announced for the CCHS:

Canadian Community Health Survey 2007 - Synthetic Files

The central objective of the Canadian Community Health Survey (CCHS) is to gather health-related data at sub-provincial levels of geography (health region or combined health regions).

Note: It is important to note that these synthetic files do not contain real data and should never be used for analytical purposes. Their only purpose is to assist users in developing and testing the computer programs that are to be submitted by remote job submission.

FTP: /dli/cchs/Synthetic_files-Dummy_files/

I am confused by this latest synthetic file for the CCHS. Other synthetic files were identified by an associated Cycle. What is the CCHS 2007 synthetic file all about?

Also, I think the web link for this 2007 synthetic file may inadvertently be conjoined with the web link for Cycle 1.1?

I had asked Health Division about the Cycle number when they sent us the data, and they provided the following explanation:
"Effective starting this cycle, the CCHS team made a decision to drop the .1 in the naming convention and simply state the year- (because CCHS has gone to continuous collection, the term CCHS 4.1 is not quite accurate anymore). Therefore, the CD is correctly named CCHS 2007."
Also, you are correct about the web link problem. There is an error in the web link for Cycle 1.1 on our English site (the links on the French site are correct). I will have our team fix that today.

Thanks for letting us know.

2001 Census Question


A health researcher at UBC has found everything he needs from the 2001 census except for one thing, the age structure of the total aboriginal identity population by forward sortation areas. I too have looked and have not been able to find a table with this level of detail.

Is it possible to confirm that that is the case? If so, are the data available through a custom tabulation?


I agree with you that the only way you can get this info is through a custom tabulation. In order to obtain that tabulation, you may contact the Vancouver Regional Office.

Wednesday, November 26, 2008



We are again encountering problems accessing the CANSIM (E-STAT) tables by subject. We got the following message:

Error opening the file: F:\WWW\ROOT\CII\CII_SUBJ.TPF
Could you please check the source of the problem?


The manager of E-STAT informed me that the redirects for some outdated links on the E-STAT site are not working at the moment. To ensure that E-STAT functions properly, she suggests accessing it from the following URLs:

in English:
in French:

DLI Website Not Available


The DLI website does not seem to be available in either the French or the English version.


These two URLs will work if you take out ".gc"

We will send out a general notice about the URLs in the near future.


Official Announcement

On November 24, 2008, the Data Liberation Initiative web pages were successfully converted to comply with Treasury Board's Common Look and Feel (CLF) 2.0 guidelines. However, there are still some modifications that need to be done to the page URLs to make them fully compliant with CLF 2.0.

Therefore, we ask that you refrain from making any changes to your bookmarks and links to the DLI web pages at this time. The URLs that were in place prior to November 24 still work and will work for the next two years at a minimum thanks to redirection.

I will let you know as soon as the URLs have been modified to satisfy CLF 2.0 requirements.

We apologize for the inconvenience and thank you for your cooperation.

Survey of Innovation 2005 and Biotechnology Use and Development Survey 2007


Will there be a PUMF for the 2005 Survey of Innovation, as there were for 1999 and 2003? In addition, will there be a PUMF for the 2007 Biotechnology Use and Development Survey?

I noticed that all related CANSIM tables for these surveys have been terminated (with statistics up to 2005). Are there plans for any new/ongoing CANSIM tables for these two surveys?


I received the following response from the author division of both surveys (Please note that the IMDB section of the Statistics Canada website incorrectly provides a link to a "PUMF" for the 1999 and 2003 Survey of Innovation. However, there is no PUMF and the link actually takes you to Excel data tables. I will ask my contact at the IMDB to correct that as soon as possible):

"We do not have a PUMF for the Survey of Innovation 2005 nor have we ever had a PUMF for the Survey of Innovation 1999 or 2003. What we do have is a researcher database that is accessible through the facilitated access program.

Our surveys are occasional. When CANSIM indicates that a series has been terminated what this means is that we will not be producing a time series of data. For example, we will not re-run the Survey of Innovation 2005 to update the data. We are in the planning stages for a 2008 Survey of Innovation. Some of the questions may be the same as they were in 2005 but many will be different. As well, the sample unit will be different. So in answer to the question there will be no more CANSIM tables for the Survey of Innovation 2005.

With respect to the Biotech Survey Program - at present there is no funding for a survey for 2007 and therefore no plans for any data. We continue to speak to interested parties, but it appears they are either not able or willing to provide the funds necessary to maintain this survey. This could change at any point, but it appears less likely as time passes. Obviously if there is no survey, there will be no updates to the CANSIM tables.

To the best of my knowledge, there has never been a PUMF for Biotech. The population is very small and there are quantitative variables which are sensitive. The data can only be accessed through the Facilitated Access Research Program run by the division. This program is limited to accredited researchers (PhDs at Canadian universities or government departments) with an approved econometric research project and security clearance."

I hope this answers your question. Please let me know if you need any further clarification.

Monday, November 24, 2008

NLSCY Question

Is my understanding of the PUMFs available for the NLSCY correct?

1. PUMFs available for cycles 1-3 only

2. Data for cycles 4-6 only available through RDCs.

You are correct in saying that the DLI offers NLSCY PUMFs for Cycles 1-3
only. We also offer synthetic files for Cycles 1 to 6.

I just received confirmation that the Research Data Centres have Cycles 1-6.
They expect to have Cycle 7 as well before the end of this year.

I hope this answers your question.

Aboriginal Educational Attainment


I’m looking at the figure for educational attainment of the Total Aboriginal identity population, 15 years and over, for the province of Ontario.

If I look at the following topic-based tabulation, the figure for ‘University certificate, diploma or degree’ is 16,480:

Aboriginal Identity (8), Highest Certificate, Diploma or Degree (14), Major Field of Study - Classification of Instructional Programs, 2000 (14), Area of Residence (6), Age Groups (10A) and Sex (3) for the Population 15 Years and Over Data products — Aboriginal Identity (8), Highest Certificate, Diploma or Degree (14), Major Field of Study - Classification of Instructional Programs, 2000 (14), Area of Residence (6), Age Groups (10A) and Sex (3) for the Population 15 Years for Canada, Provinces and Territories (97-560-X2006028)

However, if you look at the same data in the Aboriginal Population Profile 2006, the figure for ‘University, certificate diploma or degree’ is 12,435, which is the figure in the table above for ‘university certificate or degree’. The figures for all the other variables in the ‘Highest certificate, diploma or degree’ dimension match. Unless I’ve missed something, it looks like the label in the Aboriginal Population Profile 2006 is incorrect. If that is so, can this be corrected?


Our Census consultant has confirmed that the label "University, certificate diploma or degree" in the Aboriginal Population Profile 2006 is incorrect. It will be changed to "University certificate or degree."

Thanks very much for spotting that.

LFS and Immigrants

A question from a user:

"In 2006, the LFS started collecting information on immigrants. This information is not available through the DLI file. Are there any plans to include the 5 questions on immigrants in the PUMF? As the LFS is not available in RDCs, is data only available through existing publications and custom tabulations?"

I have informed her that in fact the LFS microdata _are_ available in the RDCs. But what about the answer to the question about the availability of these variables in the LFS pumf?

Our LFS consultant said that they are planning to add more variables and geography (i.e. CMAs) to the LFS PUMFs. Pending approval, they hope to add these sometime in 2009-2010. This is consistent with information provided to us at the April 2008 EAC meeting.

Follow-up Question
Will the geography include CAs as well as CMAs?

Follow-up Answer
The LFS consultant has confirmed that only provinces and selected CMAs will be made available.

Notes about SHS 2006


1) The Data Dictionary (shs2006cbk.pdf) presents conflicting information for two variables (NMVEHONP and VEHLEASP, page 36).

The Readme file indicates that the effective date for household equipment counts was the interview date. Would you confirm that the "Long name" information is inaccurate? If it is inaccurate, the variable labels used in SPSS and SAS are also in error.

Long name: Number of vehicles owned on December 31
Description: This variable gives the number of vehicles (car, van/mini-van, truck/sport utility vehicle) owned by members of the household at time of interview completely or partially for private use, excluding those leased.

Long name: Vehicles leased on December 31
Description: This variable gives the number of vehicles (car, van/mini-van, truck/sport utility vehicle) leased by members of the household at time of interview completely or partially for private use.

I'd also point out that since the responses to VEHLEASP are "yes" and "no", the "Description" inaccurately describes the variable: it's not the "number of vehicles" leased but instead is a code to reflect the presence of leased vehicles (as indicated in the "Unit of Measure").

2) I was looking at variable NETCONEP, and saw that there was no "missing" category for the variable. Instead, people who weren't asked the question (those who did not use the internet from home, from variable INTERNET) are collapsed into the NETCONEP category of "0 -- No internet connection". I think that it would be preferable to explicitly separate those who weren't asked the question from those who responded that they didn't have an internet connection. It's unlikely, but possible, that respondents have a home internet connection, but do not use it: this possibility is eliminated by the current coding, which doesn't accurately reflect the questionnaire.

I have discussed your comments and questions about the SHS 2006 with the author division. Here are the responses they have provided.

1. "The correct date is December 31 of the reference year for vehicles. And the vehleasp does indicate the presence of at least one leased vehicle on Dec 31, not the number. The definitions should be modified."

2. "If the respondent says no to internet use from home we do not ask any further questions on internet access and by default assign "no access". This may not be strictly true in some rare cases, but we have to minimise questions that the respondents consider a waste of time when they have already told us "no"."

The author division will be correcting the documentation to reflect the correct reference date for vehicles. I have asked them to provide us with the corrected versions as soon as they are available.

For the NETCONEP variable, I will ask our SPSS coder to change the status of the "0 -- No internet connection" code to "missing".
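
As a rough illustration of the distinction raised in point 2, the sketch below separates "not asked" from "no internet connection" when recoding. The variable names INTERNET and NETCONEP come from the discussion above, but the code values, the sample records and the helper recode_netconep are hypothetical and are not taken from the SHS 2006 codebook.

```python
# Hypothetical codes for illustration only; check the SHS 2006 codebook
# for the real values before applying anything like this.
INTERNET_YES = 1        # assumed: respondent uses the internet from home
INTERNET_NO = 2         # assumed: respondent does not use it from home
NO_CONNECTION = 0       # "0 -- No internet connection" in NETCONEP
NOT_ASKED = None        # marker for "question was skipped"


def recode_netconep(internet, netconep):
    """Return NETCONEP, but flag respondents who were never asked."""
    if internet == INTERNET_NO:
        # The follow-up question was skipped, so the 0 is a default,
        # not a reported answer.
        return NOT_ASKED
    return netconep


# Made-up records: (INTERNET, NETCONEP)
records = [(INTERNET_YES, 2), (INTERNET_YES, NO_CONNECTION), (INTERNET_NO, NO_CONNECTION)]
recoded = [recode_netconep(internet, netconep) for internet, netconep in records]
print(recoded)
```

Only the third record is flagged as not asked; a reported "no connection" from someone who was asked the question is left unchanged.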

Canadian Travel Survey - updates?

I was wondering whether there is or will be an update to the CTS? We only have till 2004 - but it looks like 2006 is the most recent release?

In early 2005, the Canadian Travel Survey (CTS) was replaced by the Travel Survey of Residents of Canada (TSRC):

Q1. The online catalogue does refer to a 2006 PUMF for the CTS. I will contact the author division to see if this is incorrect.

Q2. In either case, I will ask them to send us an updated PUMF for either the CTS or the TSRC and I will keep you posted.

Answer to:

Q1 It is a mistake. They should take off this link.
Q2 We will prepare a 2006 CD-Rom this fall for DLI.

Follow-Up Question
Thanks for the info - I'm assuming that 2005 will be included as well?

Follow-up Answer
The first PUMF for TSRC is 2006. (This will be sent to DLI sometime in Nov.)

No public microdata file is foreseen for TSRC 2005.

However, the 2007 microdata files will be ready before the end of 2008.

Aboriginal Population Stats


I have a faculty member who's looking for Aboriginal (including Metis, Inuit, etc.) population stats by age characteristics from the 2006 Census. The catch is that he'd like it in 1-year age increments rather than the standard 5-year increment. I suspect he's looking at either a custom tabulation or applying to use the nearest RDC, but I thought I'd ask the group first.

Our Census consultant has confirmed that population statistics showing single years of age for aboriginals would be a custom data retrieval.
Let me know if you would like to order it, and I will ask one of our account executives to get in touch with you.

Friday, November 21, 2008


I have a researcher who is VERY interested in obtaining some custom tabulations from both the UCASS and Survey of Earned Doctorates. How would they go about doing this?

Also - is there any historical data available through DLI for the SED?

For custom tabulations, I can have an account executive contact you or your
researchers directly. Would you like me to arrange that?

As for the SED, there is data for 2003/4 and 2004/5 available under the
"Education" folder on the DLI FTP site and website. However, according to
the Daily there was a release of 2002/3 data as well. I will ask our
Education consultant whether they can provide us with that data and keep you posted.

Weighting in GSS Cycles

I have a faculty member asking this question regarding GSS cycles and weighting. Your help will be greatly appreciated.

I'm using cycles 1, 9 and 18 of the General Social Survey. Do I have to incorporate the weights for the cases to find correlations and do regression analyses? For example, in cycle 9 the weight variable is PERWGHT.

Yes, you should always use the weighted data for these types of calculations, and you may want to use the program bootvar to calculate the variance.

Hope this helps.
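
To make the advice above a little more concrete, here is a minimal Python sketch of a weighted correlation and a weighted simple regression. It only shows how a person weight such as PERWGHT enters the point estimates. The sample values and function names are made up for illustration, and it does not reproduce bootvar or any design-based variance estimation.

```python
import math


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def weighted_cov(x, y, weights):
    """Weighted covariance, normalised by the sum of the weights."""
    mx = weighted_mean(x, weights)
    my = weighted_mean(y, weights)
    total = sum(w * (a - mx) * (b - my) for a, b, w in zip(x, y, weights))
    return total / sum(weights)


def weighted_corr(x, y, weights):
    """Weighted Pearson correlation coefficient."""
    return weighted_cov(x, y, weights) / math.sqrt(
        weighted_cov(x, x, weights) * weighted_cov(y, y, weights)
    )


def weighted_slope_intercept(x, y, weights):
    """Weighted least-squares fit of y = intercept + slope * x."""
    slope = weighted_cov(x, y, weights) / weighted_cov(x, x, weights)
    intercept = weighted_mean(y, weights) - slope * weighted_mean(x, weights)
    return slope, intercept


# Made-up records: two analysis variables plus a person weight.
age = [25, 40, 60, 35]
hours = [10.0, 6.0, 2.0, 9.0]
perwght = [1200.5, 800.0, 1500.25, 950.0]

print(weighted_corr(age, hours, perwght))
print(weighted_slope_intercept(age, hours, perwght))
```

Standard errors computed naively from these weighted estimates would not reflect the survey design, which is why a tool like bootvar is suggested for the variance.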

Agriculture-Population Linkage Database Access

1) Does the DLI community have access to the Agriculture-Population Linkage Database as described at:

How about through the RDC's?

2) My student really needs access to the raw data, the published tables are not adequate for her research on "farm women" -
different from female farm operators.


1) Data from the 2001 Agriculture-Population Linkage Database are available for free on the Statistics Canada website:

The data found at the above-mentioned link are the only data mentioned in the Daily release for this product: However, I will check with Census Division to see if any additional data could be obtained through custom tabulations or other means and keep you posted.

2) Our Agriculture consultant confirmed that 2001 Agriculture-Population Linkage data are only available (1) via the web at the link already provided below, or (2) via custom tabulations.

He also mentioned that 2006 Agriculture-Population Linkage data will be available on the Statistics Canada website on December 2, 2008.

Please let me know if your student is interested in a custom tabulation.

Older versions of the Postal Code Conversion File


I’m writing to see about getting older versions of the PCCF (Postal Code Conversion File). The DLI FTP site has files from 2007 to March of 2008, but nothing older. I have locally stored files from earlier years, but lack 2004 through 2006. A researcher here needs a version of the PCCF from this period. Are these older files still available? Is there a reason why we don’t have these older files on the DLI FTP site?

Older versions of the PCCF are available on our FTP site. The PCCFs for 2002 to 2006 are available in the 2001 folder:


More PCCF files can be found in the 1986, 1991, and 1996 folders at: /ftp/dli/geography

GSS-17 Ethnic Question


The GSS17 questionnaire has questions on ethnicity, but we are not finding a corresponding variable in the PUMF. The user guide does not address the issue of missing data, and I cannot seem to locate the user guide for the master file.

Could you ask the great folks at GSS where ethnicity is hiding in GSS17?


The GSS team has provided the following explanation for the missing ethnicity variables in the Cycle 17 PUMF:

"All the ethnic variables were suppressed in the Cycle 17 PUMF, in order to protect the confidentiality of respondents. This was done to all GSS PUMFs that could have contained ethnic variables, except for Cycle 20, which only has 7 categories of ethnic responses (those are very limited types of answers). This exercise of variable suppression is necessary to avoid the possibility of identifying respondents, based on certain characteristics that could be used to identify someone, such as their ethnic origin. The ethnic variables for Cycle 17 are only available in the analytical file, available in the RDCs."

CANSIM Table 180-0003


I have a question regarding CANSIM TABLE 180-0003 (Financial and taxation statistics for enterprises, by North American Industry Classification System (NAICS), annual …. ). This table presents information aggregated for all of CANADA. A researcher is inquiring whether the information on CANSIM Table 180-0003, is also available, via the DLI, (or elsewhere), for individual provinces.

(I did take a look at the CDROM Financial Performance Indicators for Canadian Business, and to the extent I was able to understand what was being presented, it seems this product presents ratios, not amounts. Volume 3, presents provincial level information, but seemingly only for small corporations. So I am assuming this DCL CDROM is not likely to be a source of the required information? )

Sorry, our provincial data is not available on the Statistics Canada website. This data is not readily available, so programs must be run in order to compile it. For this reason, the data is compiled under a cost recovery program. If you would like further information on our cost recovery program and what might be available for purchase, please reply by return email.

TLAC Preliminary Tables


As I have a user (a library administrator) anxiously waiting for the TLAC 2008-2009 file to be available through DLI, I just had a quick look at the FTP site to see if something had been posted. The tuition-data(tlac).zip file was apparently updated on November 14 so I looked at the content and saw that a new file appeared which is called: tlac-fssuc tables-tableaux 1 to-à 6 - 2008-2009 prelim.xls (created or last modified also on November 14)

So I figured this would present preliminary data for the tables 1 to 6. Is that correct?

In any case I tried to open the file but it appears to be corrupted, it cannot be properly unzipped.


Yes - the data for 2008-2009 that you found is the preliminary data. We are supposed to be getting some additional preliminary files this week so we were going to announce them once we had all the files. You are welcome to use the preliminary data that is there now. Our team fixed them so you should be able to unzip them now. Please try again and let me know if they are still corrupted.

FYI - I just received the additional TLAC preliminary files for 2008-09. I will have our team post them right away and will announce them shortly.

Follow-Up Question
Can you tell me what difference there is between the preliminary files and the final ones? But maybe this explanation will be part of your official announcement later on, in which case ignore this question.

By the way, the files are fine now; I was able to unzip them without a hitch.

Do these files include Table 7?

Follow-up Answer
1) It does (or rather did) include Table 7, but I just received an email from the author division asking me to remove these files from our site because they have found errors in them.

So, I will be sending out an announcement to that effect in a few moments.

2) The author division has explained that the TLAC preliminary tables include some estimates, whereas the final tables do not. Here is the full explanation provided by the division:

"On the questionnaire, under General Instruction, the first instruction is:

Whenever possible, final fees and living accommodation costs should be reported. If they have not yet been determined your best estimate should be reported. Place an "e" after each estimated figure on the questionnaire.

This means for 2008-2009 the data includes final and estimated fees. For 2007-2008 it is all final data since in the questionnaire it was asked to report the actual fees for the previous year. This is the explanation of why for 2007-2008 we have final tables while it is preliminary for 2008-2009. After next year's survey, 2008-2009 tables will be final and 2009-2010 will be preliminary."

Wednesday, November 19, 2008

2006 Census of Agriculture


A researcher at York has asked whether it is possible to find the total hectares and types of production systems for farms in Toronto, separate from the CCS of Vaughan, Ontario.

Is it possible to get this kind of detail, for free or for fee?


I suspect you may have used the Agricultural Community Profiles to find this data. Have you also looked at the Land Use data tables on the web at: ?

Excel tables from the Census of Agriculture are also available on our FTP

TABLE: ceag_farm_data-reag_donnees_sur_les_exploitations.csv

The Reference Documents that will allow you to decipher these tables are available in the following directory:


For example, in the reference document
"2006_ceag_farm_data-variable_descriptionv2.xls", you will notice variables
such as the following:

VAL_AOWNED NUMBER Total area owned - Acres Acres

In the 2006_ceag_geography.xls document, you will be able to identify whether the CCS of interest are available in the actual table.

Follow-up Question

Thank you for all of your suggestions.  Unfortunately, the data in both the Agricultural Community Profiles and Excel tables from the ftp site only refers to the CCS of Vaughan, Ontario.  My user would like data for
Toronto, which has been amalgamated into this CCS.  Is this data available, either for free or fee?

Follow-up Answer

Agriculture Division confirmed that data for Toronto would be available through a custom request:

"The cost is $580 plus taxes for the following:
All farm and operator data for Toronto Division, 2006 Census"

Ethnic Diversity Survey


A faculty member here is looking at using the Ethnic Diversity Survey. We couldn’t find a “number of children ever born” variable - is there any measure of fertility other than number of children in the household (which obviously doesn’t capture children who have moved out)?


The author division of the Ethnic Diversity Survey has confirmed that this survey does not measure the number of children who live outside of the household.

Messengers in Toronto - Info needed


I have a faculty member looking for the languages spoken (preferred) or ethnicity of messengers working in the City of Toronto (or even CMA)

I think the census is going to be the only resource that has this. The relevant NAICS is 49221 or 492210 and the relevant NOC-S is B563.

Census 2006 tables that go to the level of detail for NAICS and NOC-S do not have detailed data on language or ethnicity.

I wonder if Census could supply this info or if not whether it would be possible to get a custom tab?

Thanks for any help and suggestions of other sources.


Our Census Consultant has confirmed that this would be a custom tabulation - specifically either Industry or Occupation codes crossed by the pertinent language or ethnic origin variable. However, they noted the following:

"the counts I see for NAICS and NOC-S are at the CMA level, and a custom consultant may indicate that the counts are not sufficient enough to produce any valid information for any detailed crossing of the variable."

NPHS Cycle 7 Synthetic Files


Is there any information as to when the NPHS cycle 7 synthetic files may be released to DLI?


The NPHS team has indicated that they are hoping to get us the NPHS Cycle 7 synthetic files towards the end of December.

Ontario Wage Survey 1999


I have a prof here at Guelph looking for the 1999 Ontario Wage Survey. I found a reference to it in the Daily - December 1991 - here's the link:

Is this available through DLI? The link that is provided in the Daily is no longer working.


1) Try this:

2) I found the following info about the 1999 Ontario Wage Survey

3) This looks like a business-based survey that was done by the Small Business and Special Surveys Division. Usually business-based surveys do not produce a public use file. They may have produced some tables or a report. I suspect that the data may be available only through custom tabulations - for a fee of course.

Friday, November 14, 2008

2006 Journey to Work data


A faculty member at McMaster needs 2006 Journey to Work data for Moncton-305, Saint John-310 and Saskatoon-725. I have looked into 2006 JTW custom tables that were made available through DLI but I could not find data for these 3 areas. Is there any place else where I should be looking for this data? Has the JTW data been produced for these areas? Any information related to this matter will be greatly appreciated. Thanks in advance for any help!

Did you try the following tables? They are available on the FTP at: /dli/census/2006/2006_pow_consortium/final_2006_ct_flows/canada

Commuting Flow for Census Metropolitan Areas, Census Agglomerations and Census Tracts: Mode of Transportation (9) and Sex (3) for the Employed Labour Force 15 Years and Over Having a Usual Place of Work, 2006 Census - 20% Sample Data

Catalogue number 97C0088

Commuting Flow for Census Metropolitan Areas, Census Agglomerations and Census Tracts: Work Activity (4) and Sex (3) for the Employed Labour Force 15 Years and Over Having a Usual Place of Work, 2006 Census - 20% Sample Data

Catalogue number 97C0089

Depending on what you are looking for specifically, the tables in the following directory may help:


I also found these tables in the topic-based tabulations section: /dli/census/2006/Topic-based-tabulations/place-of-work-and-commuting-to-work/b2020

Place of Work Status (5), Age Groups (9) and Sex (3) for the Employed
Labour Force 15 Years and Over Canada, Provinces, Territories, Census
Metropolitan Areas and Census Agglomerations - Cat. No. 97-561-X2006006

Commuting Distance (km) (9), Age Groups (9) and Sex (3) for the Employed
Labour Force 15 Years and Over Having a Usual Place of Work Canada,
Provinces, Territories, Census Metropolitan Areas and Census
Agglomerations - Cat. No. 97-561-X2006010

Mode of Transportation (9), Age Groups (9) and Sex (3) for the Employed
Labour Force 15 Years and Over Having a Usual Place of Work or No Fixed
Workplace Address Canada, Provinces, Territories, Census Metropolitan
Areas and Census Agglomerations - Cat. No. 97-561-X2006012

Obtaining a License for Census Data


A researcher here is collaborating with the City of Ottawa on a research project and would like to obtain a license for some Census data. Could you please provide me with a contact for her?

Please have your researcher contact Licensing Services. Their contact details can be found on the following page of our website:

Wednesday, November 12, 2008

PUMF for 2005 International Survey of Reading Skills (ISRS)


I would like to inquire whether there will be a PUMF for the International Survey of Reading Skills (ISRS) – 2005 /Enquête internationale sur les compétences en lecture (EICL) -2005, as noted in The Daily last January :


1) I just received confirmation from the author division that there will be no PUMF for the ISRS 2005 survey.

2) The author division has provided me with some additional information on this topic. Apparently, the reason there will be no PUMF for the ISRS 2005 is because the sample size is too small.

Information on Visa Students


I received the following request: "I am interested in gaining access to Stats Can info through the Data Liberation Initiative, especially as concerns statistics for international education. Can you please tell me how I may search/access this data?

For example, we are currently in need of the most up-to-date stats on country of origin for visa students in Canada and how many of those come from the U.S."

I thought this would be fairly straightforward; not quite. I found an AUCC publication on Enrolment but the data go only to 2004. And I have looked all through PSIS and found several tables, but none that fit the bill. There's a Daily article from Feb 2008 with 2005-06 data, but they aren't detailed enough.

Any other thoughts? I called our planning folks and they actually provide data to PSIS so we only have our own here taken from our admin system.


1) It's not Stats Can that collects that info, but rather Citizenship and Immigration. For example, the 2006 Facts and figures at: has a bunch of tables on foreign students, including the top 10 countries of origin.

Before 2003, they were reported in:

Facts and figures [yyyy]: statistical overview of the temporary resident and refugee claimant population.
I am not sure whether or not LAC has those on its web site, but I have copies from 1999-2002 (pdf files), if you need them.

2) I have found statistics on foreign or international students enrolled in Canadian universities in OECD sources. Look in SourceOECD under OECD databases: OECD.Stat (2004-2006) or Education Statistics, where one of the databases is called Foreign Students Enrolled (1998-2003).

3) Our Education consultant told me that data on country of origin for visa students in Canada is available as a custom tabulation:

"This would be available from a custom extraction from PSIS. The cost would be $160 for one year and $10 for each additional year added at the same time. Turn-around time for delivery is 5-10 working days and pre-payment by credit card would be required."

Monday, November 10, 2008

Winter Institute on Statistical Literacy for Librarians 2009

The University of Alberta Libraries will be hosting its third Winter Institute on Statistical Literacy for Librarians from February 18-20, 2009. This training event will provide strategies and skills for finding, evaluating and retrieving published statistics and will be useful to information professionals working in academic, public and special libraries.

This workshop will not provide instruction about how to do data analysis, although some examples will include the use of data to demonstrate how statistics are produced.

The instructional focus is on making digitally published statistics more accessible as information to librarians and their patrons. The topics to be covered include:

- A framework for thinking about published statistics
- How statistics and data are related
- Statistical definitions, standards and classifications
- Metadata for statistical displays
- How official and non-official statistics are produced
- Evaluating statistics and statistical sources
- Tools and strategies for locating statistics
- Citation standards for statistics
- Geographical and spatial representation of statistics
- Addressing the challenge of finding statistics for small geographic areas

The conference is restricted to 30 participants on a first-come, first-served basis. The registration fee is $250 and includes continental breakfast, coffee breaks and lunch.

For more information and to register, visit

Wednesday, November 5, 2008

NGS (National Graduates Survey)/FOG PUMFs


Could we get an update on when/if a pumf is likely, please? Researchers are starting to see publications based on the RDC data, and want to be able to verify/expand these analyses, but can't, which is frustrating for both the researchers as well as us. If they can't verify the results, they often won't quote them or use them in teaching either, which sort of defeats the purpose of doing the surveys in the first place.

And no, a couple of excel tables of aggregate stats from the 2005 FOG is not a sufficient sop.


The latest information we have about the availability of pumfs from either the NGS 2000 or FOG 2005 is:

"pumf in March 2004 (dlilist 2003/02/03);"
"availability of a public use microdata file uncertain, maybe by spring
2006 (dlilist 2005/09/19);"
"April 2006 (dlilist 2006/01/19);"
"Production of a pumf currently on hold due to workload (dlilist
2006/07/12); "

Inquiry on "Gross Rent" variable in 2001 Census


A student has asked me about the following 2001 Census table:

This table shows ‘Gross Rent as a percentage of 2000 Household Income’. What is confusing under ‘gross rent’ is the presence of both a “50% and over” category and a “50-99%” category. Why have both categories, and what is the practical difference between them?

The 2001 Census Dictionary ( ) does not include the 50-99% category for this variable.


Owner's Major Payments or Gross Rent as a Percentage of Household Income

Part A - Plain Language Definition

Percentage of a household's average total monthly income which is spent on shelter-related expenses. Those expenses include the monthly rent (for tenants) or the mortgage payment (for owners) and the costs of electricity, heat, municipal services, etc. The percentage is calculated by dividing the total shelter-related expenses by the household's total monthly income and multiplying the result by 100.

Part B - Detailed Definition

Refers to the proportion of average monthly 2000 total household income which is spent on owner's major payments (in the case of owner-occupied dwellings) or on gross rent (in the case of tenant-occupied dwellings). This concept is illustrated below:

(a) Owner-occupied non-farm dwellings:

(Owner's major payments / ((2000 total annual household income) / 12)) x 100 = ___%

(b) Tenant-occupied non-farm dwellings:

(Gross rent / ((2000 total annual household income) / 12)) x 100 = ___%

Censuses: 2001 (1/5 sample), 1996 (1/5 sample), 1991 (1/5 sample), 1986 (1/5 sample), 1981 (1/5 sample)

Reported for: Private households in owner- or tenant-occupied non-farm dwellings

Question Nos.: Derived variable: Questions 51, H6 (a), (b), (c), H7, H8 (a), (c) and (f)

Responses: Not applicable

Remarks: The response categories used in the census products are as follows: less than 15%; 15-19%; 20-24%; 25-29%; 30-34%; 35-39%; 40-49%; 50% and over.

Any thoughts?


1) I had to consult with the Census division for a specific explanation in regards to your question. Their answer is as follows:

"Gross rent Refers to the proportion of average monthly 2000 total household income which is spent on owner's major payments (in the case of owner-occupied dwellings) or on gross rent (in the case of tenant-occupied dwellings).

The relatively high shelter cost to household income ratios for some households may have resulted from the difference in the reference period for shelter cost and household income data. The reference period for shelter cost data (gross rent for tenants, and owner's major payments for owners) is 2001, while household income is reported for the year 2000. As well, for some households the 2000 household income may represent income for only part of a year.

In the category "Average monthly total of all shelter expenses paid by tenant households", "Gross rent" includes the monthly rent and the costs of electricity, heat and municipal services.

In the case below, 4,795 represents 50% and over while 2,920 represents 50-99%. Therefore 1,875 spend over 99% of their total household income on Gross rent."

I hope this clarifies the numbers shown in the table.
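
For anyone who wants to check the arithmetic, here is a small Python sketch of the calculation described above. The two counts (4,795 and 2,920) are the ones quoted in the response. The function name, the sample rent and income figures, and the handling of zero or negative income are assumptions made for illustration, not Census methodology.

```python
def shelter_cost_percentage(monthly_shelter_cost, annual_household_income):
    """Shelter cost as a percentage of average monthly household income.

    Returns None when income is zero or negative (an assumption made here
    only to avoid dividing by zero).
    """
    if annual_household_income <= 0:
        return None
    average_monthly_income = annual_household_income / 12
    return monthly_shelter_cost * 100 / average_monthly_income


# Counts quoted in the response above.
count_50_and_over = 4795   # households in the "50% and over" category
count_50_to_99 = 2920      # households in the "50-99%" category

# "50-99%" is a subset of "50% and over", so the remainder is the group
# whose gross rent exceeds 99% of household income.
count_over_99 = count_50_and_over - count_50_to_99
print(count_over_99)

# Made-up household: $900 monthly gross rent, $24,000 annual income.
print(shelter_cost_percentage(900, 24000))
```

A household whose reported 2000 income covers only part of the year can easily land above 100% on this measure, which is consistent with the reference-period caveat in the response.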

As well, we suggest you use the numbers provided in this table cautiously. There are notes regarding the reference period for shelter cost and household income data which you may access by clicking on the link "More information on this product is available here.", provided under the table you have referenced.

2) I received a similar response from our Census consultant, and I also received some information regarding why the 50-99% category wasn't included in the 2001 Census Dictionary. The consultant concluded that this was an oversight because the dictionary is produced before some of the standard products are released:

"The dictionary is usually produced before the standard products are released and at times after analysis of the data by our subject matter experts they may decide to include additional breakdowns or to remove breakdowns due to the quality of the data."

However, she is going to suggest that the 50-99% category be added to the 2011 Dictionary, and asked me to thank you for bringing the issue to her attention.

Difference between Census Access levels 2 and 3


Can someone clarify what we have access to on the Census site that is level 2 (DSP?) and level 3 access, please? I am doing a presentation next week and want to make sure that I'm providing the correct info.

Thanks in advance.


1) Your question creates an opportunity to put in a plug for
the DLI Survival Guide on the STC website. You will find a description
of Level 2 access under the section on "Accessing and Citing DLI Data." See:
The details are under the heading: "Census data -- a more detailed level of geography."

2) Here is a cut and paste from the DLI Survival Guide:

Census data - a more detailed level of geography

Commonly referred to as Level 2 access to Census data, DLI members have access to
additional Census data at lower levels of geography and the additional option to download
to Beyond 20/20 format if they so desire. These access levels apply to the release of the
standard topic-based tabulations, release profile components and the cumulative profiles.

The following summarizes our Internet product availability at Level 2:

* Topic-based tabulations for all levels of geography (except
forward sortation area (FSA) and dissemination area (DA)).
* Release profile components and cumulative profiles for all levels
of geography (except forward sortation area (FSA) and
dissemination area (DA)).
* Dissolved census subdivision (CSD) profile data.

(Although FSA and DA levels are not available through Level 2 access, these levels of
geography are readily available on the DLI FTP site.)

If your DLI-member institution does not currently have access to level-2 Census data,
please contact the DLI unit with the IP range for your
institution and we will facilitate the access for your institution.

3) A description of the different Census access levels is available in the Survival Guide:

Also, in replying to a similar question back in April, we obtained the following definitions from our Census consultants:

Level 0 = Available for free to all users via the Internet
Level 2 = Available to pre-determined users, key stakeholders and partners. So, in essence, our Level 2 users have access to everything up to Level 2, and it is available on our site.

Level 3 = Available for a fee. Key partners and stakeholders, including the DLI, have access to these files however. In the case of the DLI, they are distributed through the DLI FTP site.

4) I was at a Census presentation yesterday and level 3 is internal access for STC. Census does provide Level 3 data to some key partners and stakeholders, including the DLI. However, DLI contacts can only access Level 3 data via the FTP site.

5) When we last talked you had mentioned that you would be interested to know the criteria used by Census in determining whether a file is classified as level 2 or 3. I thought I should share the answer with the group as it elaborates on my recent posting about census access level definitions (below).

I discussed the matter with the Chief of Census Standard Products and Internet Development, and she confirmed that the size of the file determines its level. If it is small enough to be delivered via the web, it will be classified as level 2. However, if a file is too large for dissemination via the web, it can only be distributed via the FTP site and will be classified as level 3. Typically, it is the DA-level files that end up being too large. However, there are a few products at other levels of geography that are too large to be delivered via the website and must be classified as level 3 as well. For example: a detailed industry by detailed occupation table crossed with other variables at the CD/CSD level STILL may have to sit at Level 3 because of size.

Friday, October 31, 2008

Frequency of Name Use by Time Period for Canada

A student here is looking for data on Given names and Surnames. He would like to have time series data showing the name's usage in Canada, from 1935 on to 2008 - preferably, annually. So, for example, how many children were born and given the name "Floyd" for each year. How many children were born whose surname was "Smith" ...

Any ideas?


I have received confirmation from the appropriate divisions that this type of information is not available through Statistics Canada.

My suggestion would be to contact the vital statistics offices of the individual provinces and territories.

Wednesday, October 29, 2008

Census Division (CD) and Census Subdivision (CSD) for Agricultural Land


I have a researcher who is looking for the area (in square miles, acres, etc.) of CDs and CSDs going back to 1871 in order to calculate the percentage of each that is made up of agricultural land. We have located this data using print and online sources for 1871, 1911, 1991 and 2001 but are having trouble with the other years (we only need censuses for years ending in "1").

Is there a way of getting this figure? I could swear that I have seen this information before, but we combed the old print volumes and could not find anything besides the years I mentioned.


I found land area information for CDs and CSDs in the following documents:

1981 CENSUS:

Catalogue no. 95-902 - Census divisions: Population...Selected characteristics
Catalogue no. 95-945 - Census subdivisions of 5,000 population and over:
Population...Selected social and economic characteristics

1971 CENSUS (Special Bulletin):
Catalogue no. 98-701 (SG-1) - Geography: Land Areas and Densities of Statistical Units

1961 CENSUS, Volume 1, Part 1: Population - Geographical Distributions

1951 Census, Volume 1: Population: General Characteristics

1941 Census, Volume 2: Population by Local Subdivisions

1931 Census, Volume 1: Summary


I have found a better print source for 1981.  Land area data for all census subdivisions for 1981 is available in Catalogue numbers 93-901 to 93-912, the Provincial Series, which consist of data published separately for each province and territory.  The relevant tables in this series are Table 1: Population, Land Area and Population Density, for Census Divisions, 1976 and 1981, and Table 6:  Geographical Identification, Population, Land Area and Population Density, for Census Subdivisions 1976 and 1981.

As for obtaining the land area for ALL census subdivisions from 1961 and earlier censuses, I am told by geography division that this data would only be available through a "lengthy and costly custom project" for which they are not currently resourced.  Other than that, the print sources I already mentioned are what is available.

I hope this answers your question.
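Once land areas are in hand, the percentage the researcher is after is a simple ratio, though the units need to match first. The Python sketch below assumes farmland is reported in acres and land area in square miles (1 square mile = 640 acres). The function name and the sample figures are hypothetical and are not taken from any census volume.

```python
ACRES_PER_SQUARE_MILE = 640  # exact by definition

def agricultural_share_pct(farmland_acres, land_area_sq_miles):
    """Percentage of a CD's or CSD's land area that is agricultural land."""
    land_area_acres = land_area_sq_miles * ACRES_PER_SQUARE_MILE
    return farmland_acres * 100 / land_area_acres

# Hypothetical census division: 1,000 square miles with 256,000 acres of farmland.
print(agricultural_share_pct(256_000, 1_000))  # 40.0
```

If a source reports both figures in the same unit, the conversion step can simply be dropped.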

Friday, October 24, 2008

Inquiry on EA Variables from the 1996 Census Analyser


We have a researcher who is experiencing difficulty with an EA variable taken from the 1996 Census Analyser. Can anyone verify what exactly is being measured in the variable for ‘gross rent spending 30% or more of household income’ for EAs in the 1996 Census? The 1991, 2001 and 2006 data are not causing any problems, but I am unsure how to proceed in answering our researcher’s question regarding the 1996 entry.

Note: she is using the V1609 and V1607 variables for her calculations.


It is the count of tenant-occupied households that spend 30% or more of household income on rent. I went back to the B20/20 file to see if there was a more informative footnote for that field, and the footnote in the B20/20 file is just as misleading as the item label.

"Gross Rent as a Percentage of Household Income" refers to the proportion of average monthly 1995 total household income which is spent on gross rent (for tenant-occupied dwellings). Calculation: (Gross rent x 100) / (Total annual household income in 1995 / 12).

This implies that the item contains a proportion - it doesn't, it contains a count.
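To make the distinction concrete, here is a minimal Python sketch. It assumes the footnote's calculation is gross rent times 100, divided by one-twelfth of annual household income, and it uses made-up household records rather than Census data. The function names and all sample values are hypothetical.

```python
def rent_to_income_pct(gross_rent_monthly, annual_income):
    """Gross rent as a percentage of average monthly household income."""
    return gross_rent_monthly * 100 / (annual_income / 12)

def count_at_or_above(households, threshold_pct=30):
    """Count households whose ratio meets the threshold (a count, not a ratio)."""
    return sum(
        1 for rent, income in households
        if rent_to_income_pct(rent, income) >= threshold_pct
    )

# Hypothetical (monthly gross rent, annual household income) pairs for one EA.
sample_ea = [(600, 24000), (500, 30000), (900, 20000), (400, 48000)]

ratios = [round(rent_to_income_pct(r, i), 1) for r, i in sample_ea]
print(ratios)                        # [30.0, 20.0, 54.0, 10.0]
print(count_at_or_above(sample_ea))  # 2
```

The first printed line is the kind of value the footnote describes, one percentage per household. The second is the kind of value the EA item actually holds, a single count per area. The sketch does not handle households with zero income, which would need separate treatment.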

Wednesday, October 22, 2008

SLID 2006 Release date


Unless I'm mistaken and, of course that is always a strong possibility, I don't see SLID 2006 available through DLI yet. Is it coming soon?


The 2006 SLID PUMF should be available in December 2008:

2008 Federal Election Boundaries


We have received a request for the 2008 Federal Election Boundary Files.
I note that the ESRI Shape File, the ESRI Arc/Info Export File, and the ESRI Arc/Info Coverage file, for the Federal Electoral Districts of Canada (2006 Election) appear to be freely available from the NRC web site at
In contrast, the web page for the 2006 Census Geography products at lists the federal electoral district boundary files with a ($). (see also:
As far as I can tell, these 2006 Census Geography Products have not been released to the DLI?
So… I am not sure how to interpret these two apparent sources of the 2006 Federal Electoral District “geospatial” files. But… the patron has requested the 2008 Federal Election boundary files. Are the above 2006 files the same as for the 2008 election?

The 2006 FEDs are based on the 2003 representation, and they were the ones used for the 2008 election. They are revised every 10 years. There is no such thing as 2008 FEDs. These FED boundary and cartographic files are available on the DLI FTP site under the 2006 Geography. They are part of the national coverage.

Statistics Canada sells the boundary files and this is why you see the $ sign when you get to download them from the site. DLI members have them for free from the DLI FTP site.

As for the FED files from NRC, I cannot tell you whether they are different from what StatCan is offering. Other organizations may have created their own boundary files, and it's up to them to decide whether to make them available. Furthermore, secondary distributors have paid royalties to StatCan for boundary files. It's then their choice to sell them or make them available for free. They may have chosen to make the FED files available for free in this case since it can be a good marketing strategy.

The next revision of the 2003 FED representation may happen shortly, so we may have a new representation for the 2011 Census.

Monday, October 20, 2008

Canadian Business Patterns Question

A researcher is interested in the following information:
1. Number of retail food outlets in British Columbia, by the smallest area unit possible (e.g. by Census Division, Census Consolidated Subdivision, whatever is available).

2. The above data broken into types of food outlets (e.g. grocery stores), at the same area unit.
The researcher sent her request directly to and they redirected the researcher to me, with the following accompanying information:

To obtain the information you have requested, it is available from the Canadian Business Patterns (Catalogue No.: 61F0040XCB).

The Canadian Business Patterns (CBP) contains data that reflect counts of business establishments for nine (9) employment size ranges, including "indeterminate", for Canada, the provinces/territories, census divisions, census subdivisions, census metropolitan areas and census agglomerations. The establishment counts are divided by industry based on the North American Industry Classification System (NAICS 2, 3, 4 and 6-digit level) and the Standard Industrial Classification (SIC 1, 2, 3 and 4-digit

A second product called "Canadian business patterns - revenue ranges" (catalogue no.: 61F0102XCB) expands the basis of CBP. This product provides a count of businesses based on revenue ranges. Users can thus analyze businesses on the basis of industry, revenue and geographic area. Data are provided at the three-digit level of the North American Industry Classification System, encompassing about 99 industries, or sub-sectors, and by six revenue ranges.

These products are available for free to Canadian educational institutions participating in the Data Liberation Initiative. Please note, however, that the DLI is limited to students, faculty and administration for academic research and teaching purposes.
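As a rough illustration of how establishment counts organized by geography and NAICS code could answer the researcher's two questions, here is a minimal Python sketch. It assumes the counts have already been extracted into simple rows. The row layout, the area names and every count are invented for illustration and do not reflect the actual CBP file format. NAICS 445 is the food and beverage stores subsector.

```python
from collections import defaultdict

# Hypothetical extract: (census subdivision name, NAICS code, establishment count).
# The layout and every value are invented for illustration.
rows = [
    ("CSD A", "445110", 12),  # supermarkets and other grocery stores
    ("CSD A", "445120", 7),   # convenience stores
    ("CSD A", "448110", 5),   # a clothing store code, outside food retail
    ("CSD B", "445110", 3),
    ("CSD B", "445210", 1),   # meat markets
]

def food_outlets_by_area(rows, naics_prefix="445"):
    """Sum establishment counts per area and per NAICS code under a prefix."""
    totals = defaultdict(dict)
    for area, naics, count in rows:
        if naics.startswith(naics_prefix):
            totals[area][naics] = totals[area].get(naics, 0) + count
    return {area: dict(codes) for area, codes in totals.items()}

result = food_outlets_by_area(rows)
print(result)
```

Summing the inner values for each area would give the total number of food outlets per area (question 1), while the per-code breakdown corresponds to the types of outlet (question 2).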

Problem #1: I had already downloaded the Canadian Business Patterns (CBP) data from the FTP site but I have never tried to use this product. I have now installed it on my PC, using the install instructions at but I am getting an error. (The error message tells me to install the CDROM in the appropriate drive. Both the NAICS and CBP folders are on my C drive, in their respective folders, as installed.) I phoned the folks at and apparently someone will help me troubleshoot this on Monday. Unless, you can help troubleshoot?

Problem #2: I am not sure where to find the second product "Canadian business patterns - revenue ranges" on the DLI FTP site. Is this product part of the DLI Collection?


As far as the Canadian business patterns - revenue ranges is concerned, this product was last released in 2002, whereas the Canadian Business Patterns is released frequently. I am not sure how relevant it would be to use this one alongside the latest CBP.
I did check with Subject-matter and the latest version released was the 2000 issue released in 2002. That copy is on the FTP site at the following link:

As mentioned, it may not be relevant to the latest copy of the CBP that you will be using.