Wednesday, May 30, 2018

Number of people with Law degrees for Ottawa-Gatineau CMA for the 2016 census

A researcher wants to know how many people with Law degrees there are in the Ottawa-Gatineau CMA and where these might fall in the hierarchy of Highest certificate, diploma or degree achieved (specifically, Bachelor’s / Master’s / Doctorate) for the 2016 census.

I think I have found the right Classification of Instructional Programs (CIP) to get the required breakdown (under “22. Legal professions and studies”), but I am currently unsure how to cross-tabulate these for the specified region. 

Might this require a custom tabulation?  Any help is appreciated!

“You can use this table (98-400-X2016241): Highest Certificate, Diploma or Degree (15), Major Field of Study - Classification of Instructional Programs (CIP) 2016 (82), Age (9) and Sex (3) for the Population Aged 15 Years and Over in Private Households of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations, 2016 Census - 25% Sample Data

From the dropdown menu for ‘Major field of study - Classification of Instructional Programs’, select 22. Legal professions and studies and change ‘Geography’ to ‘Ottawa-Gatineau’ to view the data on the web.

You can also download the whole table in .csv format and filter it, or download the Beyond 20/20 format and change the dimensions of the variables you want to display.”
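
For the .csv route mentioned above, the filtering step can be sketched in Python. This is only an illustration: the column names (GEO_NAME, CIP, HCDD, COUNT) and the counts in the sample are invented, so the header of the real download should be checked before adapting it.

```python
import csv
import io

# Tiny stand-in for the downloaded table. The column names and the counts
# are invented for illustration; check the header of the real file.
SAMPLE_CSV = """GEO_NAME,CIP,HCDD,COUNT
Ottawa - Gatineau,22. Legal professions and studies,Bachelor's degree,100
Ottawa - Gatineau,22. Legal professions and studies,Master's degree,40
Ottawa - Gatineau,11. Computer and information sciences,Bachelor's degree,300
Toronto,22. Legal professions and studies,Bachelor's degree,500
"""


def filter_rows(csv_file, geography, field_prefix):
    """Return rows for one geography whose field of study starts with field_prefix."""
    reader = csv.DictReader(csv_file)
    return [
        row
        for row in reader
        if row["GEO_NAME"] == geography and row["CIP"].startswith(field_prefix)
    ]


law_rows = filter_rows(io.StringIO(SAMPLE_CSV), "Ottawa - Gatineau", "22.")
for row in law_rows:
    print(row["HCDD"], row["COUNT"])
```

With the real file, the io.StringIO object would be replaced by the opened download, and the geography label would need to match the spelling used in the table.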

Economic Performance of Ontario Graduates

Question: I have a Master's student who is interested in how Ontario graduates perform economically before and after 2003. She seems pretty wedded to using the Census and DoB to differentiate between pre- and post-2003 graduates, in addition to including place of origin, language, income and marital status.

Does Ontario have a suitably anonymized dataset that might meet her needs? Another option would be to apply to the RDC but I don't think her timeline would accommodate that.

Answer: “The National Graduates Survey (NGS) has data regarding students who graduated in 2000 and in 2005.

The following link provides a list of CANSIM tables that provide data from the survey:

Please see the attached CANSIM table user guide. In each table, click the ‘Add/Remove data’ tab to uncheck ‘Canada’ and check ‘Ontario’. Scroll down and click ‘Apply’.”

The DLI has the NGS PUMFs from 1982-2013 available through Nesstar Webview. The following are direct links to the cycles that Subject Matter has referenced: National Graduate Survey, 2000 - Follow-up and National Graduate Survey, 2005.

If the student wanted to explore the variables available on the master file, they are also available through Nesstar. The following are direct links to the zero frequency master files: National Graduate Survey, 2002 - Follow-up and National Graduate Survey, 2005.

The NGS is also available on the EFT at: /MAD_PUMF_FMGD_DAM/Root/5012_NGS_END/

Tuesday, May 29, 2018

LFS pre and post 1975

I have a student looking at provincial unemployment rates from the 1960s to present. She’s using the LFS. More specifically, she’s using the Seasonally adjusted labour force statistics for 1966-1972 and the Historical labour force statistics for 1973 to 1989 and CANSIM 282-0002 for 1990 to present, although I think 282-0002 goes back to 1976.

As far as I can see, 282-0002 doesn’t state whether its annual provincial unemployment rates are seasonally adjusted or unadjusted. In the Historical labour force statistics (1973-1989), the annual unemployment rate is found in the seasonally unadjusted table. Are the provincial unemployment rates in 282-0002 seasonally unadjusted? Would those figures be comparable with the provincial unemployment rates in the Historical labour force statistics (which are seasonally unadjusted)? Are pre- and post-1975 LFS annual unemployment rates comparable?

“If the client is referring to annual average data, they are all calculated using monthly unadjusted data. Thus, CANSIM table 282-0002 contains annual average data based on the average of 12 months of unadjusted data. If the client wants to use monthly seasonally adjusted data, I would advise him to use CANSIM table 282-0087. It goes all the way back to January 1976.”
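
To make the "average of 12 months of unadjusted data" concrete, here is a minimal Python sketch. The monthly rates are invented, and the sketch assumes the annual figure is a simple mean of the twelve monthly rates; the answer does not say whether Statistics Canada averages the rates directly or derives the annual rate from averaged monthly levels.

```python
from statistics import mean


def annual_average(monthly_rates):
    """Average twelve monthly unadjusted rates into one annual figure."""
    if len(monthly_rates) != 12:
        raise ValueError("expected 12 monthly values")
    return round(mean(monthly_rates), 1)


# Invented monthly unadjusted unemployment rates (%) for one province-year.
monthly_unadjusted = [8.1, 8.0, 7.9, 7.4, 7.0, 6.6, 6.8, 6.9, 6.3, 6.2, 6.6, 7.0]
print(annual_average(monthly_unadjusted))
```

Because the annual figure averages over the full seasonal cycle, it should not be mixed with annual averages built from seasonally adjusted monthly series without checking how those were produced.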

Friday, May 25, 2018

Higher Education in R & D

I have a researcher here who’s looking for stats or data on higher ed and R&D (classification of programs, educational attainment, etc.). I know there are several surveys that address the subject specifically (HERD, RDHES to name a couple) but they appear not to be accessible to us.

Are there other resources he might access to find similar content? Failing that, are these surveys available through custom tabs?

Follow-up Question:
I have done some further research into HERD, and have gained access to the CAUBO databases, which have provided me with university financial information regarding various groups of research expenses. I have, however, come across an issue. The data collection citations of all HERD estimates state:

"HERD uses two main sources of data found in the public domain: the Canadian Association of University Business Officers (CAUBO) and the University and College Academic Staff System (UCASS). CAUBO provides data on sponsored research and operating expenditures of all relevant universities. UCASS is the source of academic professor counts at the postsecondary level by subject taught. The extracted data are used to derive and produce aggregated HERD estimates by funder, province and science type, in the higher education sector."

Looking at 2015, Statscan notes that HERD is estimated at 12892.4 (in millions of dollars) making the Expenditure on Higher Education R&D almost 13 billion dollars. However, CAUBO accounts for only, at most, roughly 7 billion dollars of this expense, from what I can see. 

Essentially, my question is, where is the remaining 6 billion dollars? Certainly, some of it must go into research hospitals, poly-techs and independent colleges, but CAUBO does not seem to have this information. If that is the case, then I am also confused as to where Statscan is drawing this data from.

“HERD & HE R&D personnel currently belong to ISTD.

Higher Education (HE) R&D expenditure and HE personnel data are modelled.

For HERD expenditures, CAUBO data provide the sponsored research and development amount. The model estimates non-sponsored research expenditure based on time-use coefficients applied to UCASS data. Additional indirect costs of research are also added. The higher education sector universe encompasses postsecondary institutions (universities, colleges) as well as research centres and affiliated teaching hospitals. Estimates for medical research from hospitals are also derived from the model. So the additional $6 billion covers these expenses. Details on the model can be found here:

Estimation of research and development expenditures in the higher education sector (PDF, 109.82 kb)

The initial question seems to have been regarding Higher education personnel and table 358-0159. HE R&D personnel are also modelled. It should be noted up front that all R&D personnel figures are meant to represent R&D full-time equivalents only. So 1 FTE can represent 10 employees working only 10% of their time on R&D or 2 employees working 50% of their time on R&D. The HERD model includes time-use coefficients to determine the amount of time university academics (professors) spend on R&D. The administrative source for information on the counts of full-time university professors comes from the Higher Education in Research and Development (HERD) model which uses data from the University and College Academic Staff System (UCASS). Information on doctoral student counts is obtained from the Post-Secondary Student Information System (PSIS). Post-doctoral researcher information is obtained from external granting councils that keep track of the information on postsecondary R&D applicants. Postdoctoral fellowship information is a key variable that is obtained from the three granting councils: the Natural Sciences and Engineering Research Council (NSERC) is used to allocate natural science R&D, the Social Sciences and Humanities Research Council (SSHRC) is used for social science R&D and the Canadian Institutes of Health Research (CIHR) is used for health science R&D.

We were hoping to start a HE sector survey this fiscal year to benchmark this sector's expenditures and personnel, but unfortunately funding has been delayed.”
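
The full-time equivalent (FTE) arithmetic in the answer above can be checked with a few lines of Python. The function name is hypothetical; the two cases are the ones given in the answer.

```python
def full_time_equivalents(headcount, percent_of_time_on_rd):
    """Convert a headcount and a share of time spent on R&D into FTEs."""
    return headcount * percent_of_time_on_rd / 100


# The two cases from the answer: both amount to one FTE.
ten_part_timers = full_time_equivalents(10, 10)   # 10 employees at 10%
two_half_timers = full_time_equivalents(2, 50)    # 2 employees at 50%
print(ten_part_timers, two_half_timers)
```

This is why an FTE count cannot be read as a headcount: very different staffing patterns produce the same figure.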

Questions about use of PUMFs

The requesting user is an adjunct faculty member at the University (“in good standing”), but wishes to use the data for a non-University related consulting job with outside nonprofits.

My reading is that she is an authorized user, but the language around authorized use is a little ambiguous, despite my having read the case studies. Access is granted for licensed users to use the data for “research and statistical purposes”.

I don’t see anything that specifically says that the research and statistical analysis needs to be academic in nature, but is that implied? Or is it fine for her to conduct research for pay on behalf of an outside client as long as she only shares with her client the results of her analysis and not the underlying data?

Your interpretation is correct: it is implied that the research and statistical analysis needs to be academic in nature. Even if the adjunct faculty member were to share only the results of her analysis and not the underlying data, her use of the data for a non-University related consulting job with outside nonprofits would fall outside of the licence agreement.

Question about GAF 2001

We have a student who is looking for the Geographic Attribution File for 2001. We’re having a difficult time locating it, so I’m wondering if it exists; we can find other years.

Subject Matter has informed us that the 2001 GAF was not disseminated. It was available through GeoSuite at the block level (blocks were renamed dissemination blocks in 2006). An Excel file (CB2001.xlsx) of the extraction that was made using GeoSuite 2001 is available on the Statistics Canada website. You can download GeoSuite 2001 from the link above, do a name search for Canada, select CB (Census Block) and export the fields needed.

Thursday, May 24, 2018

CCHS Nutrition Slides / ESCC Nutrition presentation

I’m wondering when these slides will be posted in the DLI Training Repository? I’d like to point some faculty members to the presentation.

The CCHS/ESCC Nutrition slides have now been uploaded to the DLI Training Repository.

Please see the links below:

2015 Canadian Community Health Survey (CCHS) – Nutrition

Enquête sur la santé dans les collectivités canadiennes (ESCC) – Nutrition de 2015

Monday, May 14, 2018

Access to PUMFs for non-DLI

In the past, the DLI was only one way to get access to PUMFs. Is this still the case? What is the difference between the DLI and the “Access to PUMF Collection” subscription described here?

As well, in the past, individual researchers not affiliated with the DLI could request copies of a PUMF directly through the STC website free of charge. Is this still the case? For example, order links are available on public PUMF product webpages; see:

The difference between the DLI and access to the PUMF collection comes down to the additional resources that are provided to subscribing institutions. DLI clients get access to a dedicated reference desk for DLI contacts, a forum to discuss data issues and concerns, access to a network of Canadian data librarians, as well as access to training and outreach sessions. The DLI is only open to Canadian postsecondary institutions. In contrast, access to the PUMF collection is available to all institutions. The PUMF collection is an amalgamated licence that gives subscribing institutions unlimited access to all PUMFs. Both programs provide access to data through the corporate Electronic File Transfer (EFT) service as well as through the web data portal, Nesstar.

Users can request PUMFs from Subject Matter areas for free, but each request has an associated licence agreement that the user must sign. While PUMFs are free, the subscription-based fee supports access to the technical infrastructure, licence administration and user support.

Thursday, May 10, 2018

2017 PCCF Questions

Question 1: 

A researcher is using the 2017 PCCF and has noted some issues:

“It appears that there are a significant number of postal codes missing.  I had thought that these might be retired postal codes but there are none in the retired file for Manitoba (which is also suspicious).  I had not thought about this earlier when I was just connecting postal/census areas and then doing summaries but in the last week we needed to do some work with specific postal codes and found a drop from the 2011 files.

Examples would be Churchill (R0B0E0), Stonewall/Teulon (R0C3B0), some in Portage (R1N3C4, R1N3C2), St. Laurent (R0C1P0) (and 300 others). According to Canada Post, these are currently active postal codes, and they existed in the last PCCF.”

Would it be possible to have someone look into this? 

Answer 1:
We are aware that many postal codes did not go through our geocoding process in the June version, and therefore did not make it to the PCCF file. We did a lot of work on getting these postal codes back into the product file. We have also appended records that were coded to only the CSD/SGC to the PCCF file. As these do not currently geocode (to at least a Dissemination Area), they do not form part of the core PCCF product, but can be useful for using postal codes to get to CSD.

Also, since the postal code is intended for the distribution of mail by Canada Post, the files undergo changes on a regular basis. In some instances, postal codes are retired and re-birthed with a different delivery mode type, or the address information for postal codes is changed.
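
The kind of check the researcher describes in Question 1 (which postal codes on a list are absent from a PCCF release) can be sketched as a set comparison in Python. The sketch assumes a fixed-width file in which the postal code occupies the first six characters of each record; the sample records are invented stand-ins, and the function names are hypothetical.

```python
def postal_codes_in_pccf(lines):
    """Collect the postal codes found at the start of each PCCF record.

    Assumes the postal code is the first six characters of the record.
    """
    return {line[:6].upper() for line in lines if len(line) >= 6}


def missing_codes(codes_to_check, pccf_lines):
    """Return the codes from codes_to_check that do not appear in the file."""
    present = postal_codes_in_pccf(pccf_lines)
    return sorted(code for code in set(codes_to_check) if code not in present)


# Invented stand-in records; only the first six characters matter here.
sample_pccf = ["R0B0E0 rest of record", "R1N3C4 rest of record"]
codes = ["R0B0E0", "R0C3B0", "R1N3C4", "R1N3C2", "R0C1P0"]
print(missing_codes(codes, sample_pccf))
```

Running the same comparison against two releases of the file would show which codes dropped out between them.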

Question 2:
The researcher has responded: 

“I only sent a small number of postal codes and they are relatively important, at least in Manitoba.  I can send the whole list if necessary.

I think the numbers are large enough, and the areas of enough interest, that it should be checked, especially if this is a more general problem (e.g. Canada-wide) and not just an issue in our little province with a few hundred postal codes.

The postal codes are not retired. I realize there are changes, but the bulk of the codes should not have moved. Many [I have not checked all] of them existed in the 2011 associated PCCF files.

Interesting comment regarding the CSD, given some rural delivery. I am not sure how to interpret what this means. Are some postal codes that represent larger areas, or only a CSD, not included?”

Since I have not seen any other similar inquiries, I too wonder if this is just a Manitoba issue. The user has offered to send the whole list of missing postal codes. Would that be of any use, or would Subject Matter prefer to communicate directly with the user?

Answer 2:
We have received the following response from Subject Matter:

“The postal code team geocodes/links the postal codes as received from Canada Post Corporation to a census geography. They try to link at the lowest level possible, a block face. If they can’t do that, they move up to try to link to a dissemination block, if not then a dissemination area, if not then a CSD. Because CSDs, especially in rural areas, can be very large, if the only link between a postal code and a census geography is at the CSD level, it is not automatic that these postal codes are included in the PCCF. The team has made efforts to include more of them manually.

We are not saying that postal codes that represent rural CSDs are not included. That is a misinterpretation.

We were not able to provide specifics about why these particular postal codes were not found in the Dec 2017 release of the PCCF. However, we checked each against the newest internal release of the product, and they did exist in that file. We are hoping to disseminate a new version of the file sometime this summer. The postal codes in question should be available in that new file.”
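
The linking order described in Answer 2 (block face first, then dissemination block, dissemination area and finally CSD) amounts to a fallback chain. The Python sketch below only illustrates that order; it is not Statistics Canada's geocoding process, and the dictionary keys and codes are hypothetical.

```python
# Geographic levels from most to least detailed, as listed in the answer.
LEVELS = ["block face", "dissemination block", "dissemination area", "CSD"]


def lowest_level_link(available_links):
    """Return (level, code) for the most detailed level that has a link.

    available_links maps a level name to a geography code, or to None when
    no link could be made at that level. Returns None if nothing links.
    """
    for level in LEVELS:
        code = available_links.get(level)
        if code is not None:
            return level, code
    return None


# Hypothetical postal code that could only be linked at the CSD level.
rural_example = {"block face": None, "dissemination block": None,
                 "dissemination area": None, "CSD": "csd-code-1"}
print(lowest_level_link(rural_example))
```

A result at the CSD level corresponds to the case the answer flags: such a postal code is not automatically included in the PCCF.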

Tuesday, May 8, 2018

Recoding occupations into social classes

Basic sociology question. I have a student asking me how to recode professions (NOC-S) into "social classes" (upper, middle, working class). He wanted to do it with those 10 categories. I told him there is no way you can do that with such broad domains of activity, since the lawyer is in the same category as the legal assistant and an administrator is in the same group as the secretary. So he asked me about using those 30 categories instead. It still looks problematic to me. You have more distinctions between professional and technical occupations, but how do you define what fits into "middle class"? So to anyone who is familiar with this, my question is: are there "official norms" to divide the National Occupational Classification into "social classes", and what level of detail is necessary to do so? I found this article by Boyd in the Canadian Review of Sociology (appendix B) that defines the Boyd-NP scores for the 2001 census. Are those « officially » recognized and can they be adapted to the 2011 NHS?

A Socioeconomic Scale for Canada: Measuring Occupational Status from the Census 
Monica Boyd, University of Toronto
Canadian Review of Sociology

Answer 1:
There used to be two scales, the Pineo-Porter classifications and the Blishen scale. Each SOC code was assigned a value which mapped to “class”. If there is a crosswalk between NOC and SOC, it might be possible to update the Pineo-Porter classifications or the Blishen codes to the NOC … or someone might even have already done so!

Take a look at –

A Scale of Occupational Prestige in Canada, Based on NOC Major Groups
John Goyder and Kristyn Frank
The Canadian Journal of Sociology / Cahiers canadiens de sociologie
Vol. 32, No. 1 (Winter, 2007), pp. 63-83
Published by: Canadian Journal of Sociology
DOI: 10.2307/20460616
Page Count: 21

Answer 2:
Subject Matter does not use the term ‘social classes’ when classifying occupations so there is no ‘official norm’ at Statistics Canada.

For the ‘Canadian Review of Sociology’ article, we couldn’t open the full article and are not sure what is mentioned in appendix B.

But from the text in the ‘Abstract’, it looks like the article mentions that the ‘Nam-Powers-Boyd’ method was used for the Census of Occupation and not the Census of Population.

2015 Households and the Environment Survey Weights

I have a student working with the 2015 HES who asked about the weights available for the PUMF. The user guide states: "The sample design used for the HES 2015 was not self-weighting. When producing simple estimates including the production of ordinary statistical tables, users must apply the proper survey weights."

The only weight variable available in the PUMF is WTHP: PUMF, survey weight of household. Am I correct that this is what the user guide refers to as the “proper survey weights” and that there are no other weight variables for this PUMF?

The client's interpretation is correct.

Follow-up Question:
The student has asked why there is a difference between the HES CANSIM tables and the HES PUMF with the weights applied. Can Subject Matter provide any information on this?

Follow-up Answer:
“The Households and the Environment Survey reports on many values in terms of “all households”, as in “x% of Canadian households that had a thermostat indicated it was programmable”. In general, this is calculated as follows:

In other words, all of the households that are in-scope to be asked a question appear in the denominator (in this example households that had a thermostat). That a household might have said “don’t know” (DK) or refused to answer the question (RF) doesn’t change the fact that they were in-scope for the question and thus should be in the denominator.

“All households” is a special case because it includes all households regardless of whether a respondent might have answered “don’t know”, refused to provide an answer to the question or was a “not stated” (was not administered the question for some reason other than flow), so the calculation is more like this:

However, it appears that the researcher calculates percentages slightly differently, more along these lines:

Households that answered “don’t know” or refused to answer the question are not taken into consideration in the denominator.

This can lead to different estimates, which is what we’re seeing here. For the specific estimate that the researcher is asking about, here’s what I found:

Weighted (WTHP)
Reported a forced air furnace as a primary heating system (%)
Reported a forced air furnace (numerator)

All households (i.e. all HH in-scope for the question about primary heating systems)
I hope this sheds some light on the situation.”
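
The effect of the two denominators described in the follow-up answer can be illustrated with a short Python sketch. The household records and weights below are invented (they are not HES values); each record pairs a household weight, standing in for WTHP, with an answer coded as yes, no, DK (don't know) or RF (refused).

```python
def weighted_percent(records, include_nonresponse):
    """Weighted percentage of 'yes' answers among in-scope households.

    records: list of (weight, answer) pairs.
    include_nonresponse: if True, DK/RF households stay in the denominator
    (the "all households" approach); if False, only yes/no answers count.
    """
    numerator = sum(w for w, answer in records if answer == "yes")
    if include_nonresponse:
        denominator = sum(w for w, _ in records)
    else:
        denominator = sum(w for w, answer in records if answer in ("yes", "no"))
    return 100 * numerator / denominator


# Invented in-scope households: (weight, answer).
households = [(120, "yes"), (80, "yes"), (100, "no"), (60, "DK"), (40, "RF")]
all_households_pct = weighted_percent(households, include_nonresponse=True)
valid_answers_pct = weighted_percent(households, include_nonresponse=False)
print(all_households_pct, round(valid_answers_pct, 1))
```

The numerator is identical in both cases; only the treatment of "don't know" and refusals in the denominator differs, which is enough to produce different published and PUMF-derived percentages.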

Thursday, May 3, 2018

Data on post-secondary students

I have a request for data aggregated to show where post-secondary students live by small area geographies in southern Ontario. The largest area that would be useful is the census subdivision, but census tracts would be ideal. Does anyone know if this data is available? Would it be a custom tabulation? I think the data may be collected in the Post-secondary Student Information System.

If it is possible to get this data we would be interested in having it broken down by institution type, Classification of Instructional Programs (Primary Grouping) and Student Status (international/ Canadian Students).

The province is the smallest geographic level that we can provide from the Postsecondary Student Information System. We can provide data about students whose permanent address upon admission is in Ontario.

Wednesday, May 2, 2018

WDS server

Question: With difficulty, I found the link to the WDS server yesterday, but have not managed to repeat the feat today.
It is described in the Survival Guide:

Why is this so hard to find? (I may be the only one with this problem, but I hope not.)

It's behind the "Restricted access ..." link. The way I remembered this is that I have to have the link proxied to add it to my own web pages. It's not an open product.

Tuesday, May 1, 2018

Tuition and Living Accommodation Costs 2017-18

An administrator is looking for the 2017-18 iteration of these files. I checked on Odesi and on the Statistics Canada B20/20 server and could not see this version. In theory the data were released in September 2017. When can we expect the files?

We located the files. The Tuition and Living Accommodation Costs (TLAC) files for 2017-18 (ver09) have been available since October 5, 2017 on the DLI EFT site.

Here is the file location: /MAD_DLI_IDD_DAM/root/other_autres/3123_TLAC_FSS/

The files will also be uploaded to WDS shortly.