Monday, January 28, 2019

DB Level Census Data

I have a grad student and faculty member who want to examine detailed census data at the DB level, from 2016, 2011 (NHS) and 2006, to analyze neighbourhood change over time. They have already looked at the DA level data available in the CHASS Census Analyser, which seems to stop at the DA level.

I know in theory that this data at the DB level would be available to a researcher at the RDC, but I also suspect that one needs a quite well defined research proposal to gain admission to RDC data. At this stage these researchers want to browse the data to see where interesting investigations may be revealed, so I’m not sure they would qualify at this stage for RDC.

I suspect that the available Census/NHS PUMFs would provide no greater granularity than the CHASS system, but I’m not sure of this, so *please* advise me on the best course of action at this point. We have, in the past, had researchers purchase a custom tab for a neighbourhood larger than the DB but smaller than the FSAs, but that might be expensive if a number of neighbourhoods are included in the request. I’m leaning toward sending them to consult at RDC, but am happy to learn from you what else I might do to enable these researchers.

I’ve consulted with subject matter to get their opinion on what might be the best course of action and they’ve given me the following:

“The CHASS Census Analyser is not a product supported by STATCAN so we cannot speak to the geography level which is available there. However, the PUMFs would not go beyond DA either.

We would recommend having a look at the standard profiles at the DA level available on our website. From there, if the client would like to narrow down an area it would have to be processed by one of our Data Service Centres.

I’m not familiar with how exactly the RDC would process the request, but I assume they too would need a more directed demand for DBs and reasoning.

Here are some links to the standard DA profiles for the years requested:
2006 []
2011 []
2016 []”

My suggestion would be to a) take a look at these standard profiles and see if they help at all, and if not, b) contact the RDC to see about the possibility of accessing the data through there. If time is an issue, the custom tabulation route will probably be faster, though more costly.

Friday, January 25, 2019

Ontario Election Data: 1995 & 1999

I’m seeking shapefiles (or other spatial format) for Ontario election precincts from the 1995 and 1999 provincial elections, as well as poll-by-poll results in any digital format.

I know that the Ontario election districts were the same as FED for the 1999 election, but I’m not sure if that extended to the poll level precincts as well.

A student here wants to map poll-by-poll data for those elections, and we haven’t been able to find digitized data for either boundaries or results from Elections Ontario, the Canada open data portal, Scholars GeoPortal, or UToronto’s data collection for 1995 and 1999. Elections Ontario told the student that “some university libraries might have that data”.

Subject matter has responded with the following:

“StatCan does not have any election geographies below the FED level, nor data for results. You could try the Ontario Archives, or maybe Elections Canada. Also, the provincial gazette of Ontario may cover elections, but I don’t know about that.”

2015 CCHS-Nutrition PUMFs and Master Files

I have a researcher who is interested in knowing the fruit and vegetable consumption of Canadians and wonders if either the PUMFs or Master Files from the 2015 Canadian Community Health Survey (CCHS)-Nutrition will provide such data. The researcher would ideally like to know the amount consumed (e.g. number of apples), the variety (e.g. what kind of apple?), and whether the food item was organic.

I have reviewed my notes from the Jan. 30, 2018 webinar on the 2015 CCHS-Nutrition and scanned the documentation for the eight files in ODESI. I have also had a quick look at the PUMFs in ODESI. It is clearly a very complex survey! As far as I can tell from the documentation, this type of data and level of detail is not available from the PUMFs. Is that correct?

Slide #23 from the webinar notes that questions about fruit and vegetable consumption were dropped from the Health Component module of the 2015 CCHS-Nutrition. I wasn't sure how to interpret the following information from the Reference Guide to Understanding and Using the Data:  2015 Canadian Community Health Survey -- Nutrition (June 2017):

"The PUMF for 2015 CCHS-Nutrition will include data on nutrients from foods, a summary of vitamin/mineral supplement use, and the health questionnaire data. Additional data at more detailed levels such as food, ingredient, and recipes along with Canada Food Guide tiers may be included depending upon the file structure and the results from a mandatory confidentiality review.” (p. 41)

If the information is not available from the PUMFs, would it be available through the Master Files, which are in the RDCs? Unfortunately, metadata for these Master Files is not yet posted on Statcan's NESSTAR server.

We’ve received the following response from subject matter:

“The CCHS-Annual continues to have the Fruit and Vegetable consumption question module (FVC) as part of their core content asked every year.  The module asks about the frequency of eating fruit in the past month, but doesn’t get into the detail of what fruit.

The 2015 CCHS-Nutrition asked detail about what specific foods are eaten, but only for one day, a 24-hour dietary recall.  The goal of this is to get data on what nutrients are consumed – not to get estimates of what exact foods provide those nutrients.  However, since apples are frequently consumed throughout the population, it is possible to use the survey data to estimate how many apples are eaten on any given day, and the characteristics of people who eat them.  This is not recommended for less frequently consumed foods, say lasagna.

The survey coded each food reported by respondents with Health Canada’s Canadian Nutrient File (CNF) codes, a subset of which are included in the survey documentation known as the FDC file.  This file contains the name description of a food, ingredient or recipe, a code specific to it, and a string of nutrient values for one gram of that food.  The CNF is also available on Health Canada’s website, so the researcher can look at that until the Nesstar metadata are available.  The data are limited to the level of detail available there.  The CNF/FDC file does not have separate entries for any organic versus non-organic foods because there is little difference in the nutrient content. 

For apples, the following information is relevant to the researcher’s questions:

  • There are data related to apples:
    • a general code for apples (summary Bureau of Nutritional Science food group variable FDC_FGR = 40B),
    • various detailed codes (variable FID_FID), but these only differentiate when the nutrient content is different (e.g. fresh versus canned, dehydrated, frozen).
    • There are no data related to specific varieties of apples (e.g. Spartan versus Gala, McIntosh, Pink Lady) because the differences in nutrients per gram of apple are not statistically important (i.e. the data for all fresh apples would be coded to one code).
    • The code variable FID_FID has to be used in conjunction with the labels attached to the codes, which are located in two other variables on the FDC file:  FDC_DEN (Food name – CNF – English) and FDC_DFR (Food name – CNF – French).  For example: FID_FID = 1487, FDC_DEN = “Apple, canned, sweetened, sliced, heated”.
  • All of these variables are available in both the master file and the PUMF.  The analysis using the PUMF might be a little easier because the PUMF HS file includes derived variables that aggregate the gram amount of food eaten that day for certain foods, and apple is one of those foods (the variable is BNSD40B – “Gram Weight – Apple”).”
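
As a rough illustration of the lookup described in the response above, the sketch below joins food-level recall records to their FDC labels on FID_FID and totals the grams of apple reported per respondent. The variable names FID_FID, FDC_DEN and FDC_FGR, the group code 40B and the example code 1487 come from the response; the respondent identifier, the gram-weight field name, the second food code and all of the values are invented for illustration and are not taken from the survey files. Survey weights, which a real analysis would presumably need, are left out.

```python
from collections import defaultdict

# Hypothetical extract of the FDC description file: food code -> label and food group.
# Code 1487 and its label are quoted in the response; code 9999 is invented.
fdc = {
    1487: {"FDC_DEN": "Apple, canned, sweetened, sliced, heated", "FDC_FGR": "40B"},
    9999: {"FDC_DEN": "Hypothetical non-apple food", "FDC_FGR": "00X"},
}

# Hypothetical food-level records, one row per food reported in the 24-hour recall.
# "respondent_id" and "grams" are placeholder field names, not survey variables.
recall = [
    {"respondent_id": "A", "FID_FID": 1487, "grams": 120.0},
    {"respondent_id": "A", "FID_FID": 9999, "grams": 250.0},
    {"respondent_id": "B", "FID_FID": 1487, "grams": 80.0},
    {"respondent_id": "B", "FID_FID": 1487, "grams": 40.0},
]


def apple_grams_by_respondent(records, descriptions, group="40B"):
    """Sum the grams of foods in the given food group for each respondent."""
    totals = defaultdict(float)
    for row in records:
        info = descriptions.get(row["FID_FID"])
        # Skip foods with no description or outside the requested food group.
        if info is not None and info["FDC_FGR"] == group:
            totals[row["respondent_id"]] += row["grams"]
    return dict(totals)


apple_totals = apple_grams_by_respondent(recall, fdc)
```

With the PUMF HS file, the derived variable BNSD40B mentioned in the response would likely make this aggregation step unnecessary.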

Data on Intentional Self-harm

I have a researcher who is using data on suicide/intentional self-harm to develop models for prediction and prevention. The CCHS and the DAD have taken them most of the way, but they’re also hoping for aggregate data on whether or not someone who has self-harmed and survived intended to die. Apparently they were able to find this information for Australia, and they’re hoping the same is collected and disseminated in Canada (at the country-wide level). I haven’t been able to find anything. Does anyone know if this data is available anywhere?

Subject matter has responded with the following:

“The CCHS 2015 – 2016 does include a block of suicide-related questions. This was a theme for this survey. I have attached the questionnaire for the convenience of the client. Kindly have them review the entire SUI block of questions. I think the variables SUI_035 and SUI_040 may speak to their line of questioning. However, the only variables that are available for free through the PUMF are DOSUI, SUI_005 and SUI_010. The client could access data for SUI_035 and SUI_040 through a cost-recovery custom tabulation.”

*Note: The original email contained the CCHS 2015-2016 Questionnaire as an attachment.

Thursday, January 24, 2019

1996 Geographic Attribute File in SPSS

Does anyone have an SPSS (or similar) version of the 1996 Geographic Attribute File (GAF) which they could please share as a zip file with us? The SPSS command file used to prepare this file would be preferred.  As the attached record layout attests, this file has a large number of variables, so this would really save us time.

We are working on a good SPSS syntax file for the 1996 Postal Code Conversion File.  Typically we use GeoSuite for good labels for Census geography names.  However, “GeoRef” 1996 is not 64-bit compatible.

There is a record layout in CSV format available in the download zip from the version of the 1996 GAF loaded in the GeoPortal.

Not sure if it’s helpful or not, but the spatial data is available in .shp file format (not SPSS).

Census 2016 Data Covering Sign Languages and Demographics

A researcher would like more detail on demographic characteristics for the use/knowledge of sign languages (mother tongue, language spoken at home, languages understood, and languages most used at work) than is available from the Census 2016 profile or Census 2016 TBTs/data tables, for Ottawa and Gatineau.

The Census 2016 profile gives us three categories of sign languages: American Sign Language, Quebec Sign Language, and Sign Languages, n.i.e. (not included elsewhere / non incluses ailleurs).

The researcher has a few questions…

1.       Are there more specifics about the sign languages covered under “n.i.e./n.i.a.”?  If more is known, is it possible to get a partial or complete list of these, and if so, could she get contact information for a custom tabulation?

2.       Are there more data tables (or a PUMF) to come on socio-demographics (i.e., occupation, education and income) for these three sign languages spoken/mother tongue/understood (at least three as below) for Ottawa and Gatineau?   I found a number of Census 2016 TBTs/data tables with three categories of sign languages for Ottawa and Gatineau (but only with the demographics of age and/or sex … or 100% data I think).

  • a. 98-400-X2016345  Language Spoken at Home (263), Single and Multiple Responses of Language Spoken at Home (3), Mother Tongue (269) and Age (7) for the Population Excluding Institutional Residents of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations, 2016 Census    

  • b. 98-400-X2016059  Mother Tongue (269), Knowledge of Official Languages (5), Age (15A) and Sex (3) for the Population Excluding Institutional Residents of Canada, Provinces and Territories and Federal Electoral Districts (2013 Representation Order), 2016 Census

  • c. 98-400-X2016057  Mother Tongue (269), Knowledge of Official Languages (5), Age (15A) and Sex (3) for the Population Excluding Institutional Residents of Census Metropolitan Areas, Tracted Census Agglomerations and Census Tracts, 2016 Census

  • d. 98-400-X2016058  Mother Tongue (269), Knowledge of Official Languages (5), Age (15A) and Sex (3) for the Population Excluding Institutional Residents of Canada and Forward Sortation Areas, 2016 Census

  • e. 98-400-X2016075 Language Spoken Most Often at Home (269), Other Language(s) Spoken Regularly at Home (270) and Age (15A) for the Population Excluding Institutional Residents, Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations

  • f. First Official Language Spoken (7), Language Spoken Most Often at Home (269), Age (15A) and Sex (3) for the Population Excluding Institutional Residents

                           i.      98-400-X2016069 Canada and Forward Sortation Areas

                          ii.      98-400-X2016070 Canada, Provinces and Territories, Census Divisions and Census Subdivisions

                         iii.      98-400-X2016071 Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations

                          iv.      98-400-X2016072 Census Metropolitan Areas, Tracted Census Agglomerations and Census Tracts

                           v.      98-400-X2016073 Canada, Provinces and Territories and Federal Electoral Districts (2013 Representation Order)

Thank you for any further suggestions for the researcher to pursue.  We note that she could go back into the 2006 PALS survey in RTRA.  If possible, she’d rather have more current sign language / demographic data tables though.

Subject matter has responded with the following:

1.       The Sign Languages, n.i.e. category consists of sign languages that are not Quebec Sign Language or American Sign Language. For example, the n.i.e. category includes broad responses of “Sign Language”, as well as more specific responses such as Polish Sign Language or Spanish Sign Language. It is not possible to have a custom tabulation with the breakdown of this category.

2.       Any table consisting of the full detailed version of a language variable also has the sign language categories in them. The detailed language variables are with demographic information only. There are also a few language of work tables with detailed variables which show the sign language categories. Here they are:


That said, if you can’t find a table with all the variables you need, you can always request a custom tabulation.

Regarding the PUMF, however, the individuals PUMF will be released this winter and it will include variables on language, but I do not believe they will be so detailed as to include the sign language categories.

Tuesday, January 22, 2019

Learning Style data request help

I received a data request and have looked around trying to find something publicly available but haven’t found anything.

The student is looking for:

"So my project is looking to determine the teaching style of a lesson, matched to the Felder-Silverman learning styles dimensions, although I could change learning style measurement systems if it is necessary, in order to supply students with the lesson in the manner most conducive to their learning.  So I am essentially looking for a data set which contains the student's learning style and some measure of how successful they were in a given learning environment. I was thinking grades in the class would be ideal, but if there exists some other measure in a data set that would work."

 A dataset with Learning Style and student grade/success score

Maybe someone on the listserv has seen something similar?

Answer A:
That is going to be really hard to find. The student may have to look to journals. Your institution, however, may have an EDC (Education Development Centre), or a similar unit that provides faculty with teaching and learning support, and they may be able to better point the student toward resources and potentially data.

Answer B:
I checked with our Instructional Design Librarian and she is of the opinion that this will be pretty much impossible to find. As [noted previously], she suggested that there might be some articles/studies, but that the data themselves would probably not be available due to confidentiality, vulnerable population, etc.

Thursday, January 17, 2019

CCHS and Colorectal Cancer Questions

I have a doctoral student who’s wondering whether colorectal cancer screening questions were asked of SK respondents. From the documentation, it appears that only NL, NB, PEI and AB are included, but I’d like to confirm that.

Also, she’d like to know if there’s a tentative release date for data from the 2017 survey. I see it’s available through the RDC programme; what about a PUMF?

You are correct, Saskatchewan has not selected the Colorectal screening module (CCS) for 2015-2016.

As for the CCHS 2017 PUMF, unfortunately subject matter is unable to give us a foreseeable date of release. Typically it would be released 2 years after collection, and because the 2018 data is currently in collection (and taking into account the fact that the 2017-2018 will be released together), it is safe to say that it shouldn’t be expected before late 2020.

Wednesday, January 16, 2019

Sexual Harassment Statistics / Data

Looking for sexual harassment statistics and/or data.  I am finding sexual assault info, and I can find lots of articles discussing the difference between the two.  Suggestions for sources of this harassment info?

One useful site: the Canadian Bar Association, BC Branch, has a webpage that is quite illuminating.

Angus Reid Institute did a survey in 2014 on Sexual Harassment in the Workplace.

Also, the GSS on victimization (Cycles 28 (2014), 23 (2009), 18 (2004), and 13 (1999)) may have some variables on sexual harassment.

Additionally, I found what appears to be an excellent survey from Australia which won't give us the Canadian data but does look very informative.  May be of interest to people.

Ontario Regional Training Materials

The Ontario Regional Training materials can be found at the following link:

We’ve been having some issues making the documents ‘searchable’ so I apologize that it has not been more easily accessible! Please let me know if you have any issues accessing the documents.

Monday, January 14, 2019

Population Centre - Defining Thresholds for Population Density

I have had a question from a researcher who is looking at population density and is confused by the definitions provided for the 2016 “Population Centre” Classification.

The definition on the website notes that a “population centre” is an area with a population of at least 1,000 and a density of 400 or more people per square kilometre.

There is also some discussion of using “a secondary population density threshold,” but the researcher is having a challenging time trying to determine what the upper threshold is. Would someone be able to help?

Subject matter has responded with the following:

“Please refer your client to the Census Dictionary discussion on POPCTR. Here they will find the 3 thresholds and a more comprehensive explanation on the concept. Please let us know if you have any follow up questions.”
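
For readers who want to apply the two criteria quoted in the question, here is a minimal sketch. It covers only the population and density minimums stated above (at least 1,000 people and 400 or more people per square kilometre); the secondary threshold is not reproduced in this thread, so it is not modelled here, and the function and parameter names are hypothetical.

```python
def meets_quoted_popctr_criteria(population, land_area_km2,
                                 min_population=1000, min_density=400.0):
    """Check an area against the two minimums quoted in the question.

    This is not the full Census Dictionary rule: any secondary density
    threshold described there is deliberately left out.
    """
    if land_area_km2 <= 0:
        raise ValueError("land area must be positive")
    density = population / land_area_km2  # people per square kilometre
    return population >= min_population and density >= min_density


# Illustrative values only, not real census areas.
dense_town = meets_quoted_popctr_criteria(1200, 2.5)    # density 480
sparse_town = meets_quoted_popctr_criteria(1200, 4.0)   # density 300
small_hamlet = meets_quoted_popctr_criteria(900, 1.0)   # population under 1,000
```

Both conditions must hold at once, which is why the third example fails despite its high density.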

Wednesday, January 9, 2019

Inclusion in the Canadian Community Health Survey (CCHS)

I have a researcher who is interested in the scope of 'loss of productivity' measures in CCHS 2015 and 2016 for British Columbia--what the depth of coverage is. I see LoP is listed in the guides but no idea as to the number/robustness of responses.

We’ve received the following response from subject matter:

“For the “loss of productivity” module that you are interested in, the universe for these questions is limited to respondents aged 15 to 75. In 2015-2016, LOP was only selected by BC, so if you look in the 2015-2016 PUMF data dictionary, the frequencies for the LOP questions will represent only respondents within BC.”

National Symmetric Input Output Tables - 2015

We are pleased to inform you that the following products are now available.

National Symmetric Input Output Tables - 2015

EFT: /MAD_DLI_IDD_DAM/Root/other_autres/1401_IO_ES/15-207-X – 2015 National Symmetric IO Tables

Canadian Community Health Survey (CCHS) 2015-2016 Annual Component

We are pleased to inform you that the following products are now available.

Canadian Community Health Survey (CCHS) 2015-2016 Annual Component

EFT: /MAD_PUMF_FMGD_DAM/Root/3226_CCHS-Ann_ESCC-Ann/2015-2016

Tuesday, January 8, 2019

Release Date For National Graduates Survey (NGS) – Class of 2015 PUMF

Is there anything more concrete than 2019 (as per the Tentative release dates page, item updated July 2018) [for the release date of the National Graduates Survey - Class of 2015]?

We’ve had the following response from subject matter:

“Based strictly on previous NGS release cycles, the best estimate for the time being would be in late fall/early winter 2019, but that can extend into early 2020 if circumstances necessitate.”

Labour Force Survey (LFS) – December 2018

We are pleased to inform you that the following product is now available.

Labour Force Survey (LFS) – December 2018

This public use microdata file contains non-aggregated data for a wide variety of variables collected from the Labour Force Survey (LFS). The LFS collects monthly information on the labour market activities of Canada's working age population. This product is for users who prefer to do their own analysis by focusing on specific subgroups in the population or by cross-classifying variables that are not in our catalogued products. The Labour Force Survey estimates are based on a sample, and are therefore subject to sampling variability. Estimates for smaller geographic areas, industries, occupations or cross tabulations will have more variability. For an explanation of sampling variability of estimates, and how to use standard errors to assess this variability, consult the Data Quality section in the Guide to the Labour Force Survey.

EFT: /MAD_PUMF_FMGD_DAM/Root/3701_LFS_EPA/1976-2018/data/
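
The product description above mentions using standard errors to assess sampling variability. As a hedged sketch of what that can look like in practice, the function below turns an estimate and its standard error into a coefficient of variation and an approximate 95% confidence interval. It assumes a normal approximation (z = 1.96), which is a common convention rather than something taken from the Guide to the Labour Force Survey; the function name and the input values are invented for illustration.

```python
def summarize_estimate(estimate, standard_error, z=1.96):
    """Return the CV (in percent) and an approximate confidence interval.

    Assumes a normal approximation; consult the survey guide for the
    method it actually recommends.
    """
    if estimate == 0:
        raise ValueError("coefficient of variation is undefined for a zero estimate")
    cv_percent = abs(standard_error / estimate) * 100.0
    margin = z * standard_error
    return {
        "cv_percent": cv_percent,
        "lower": estimate - margin,
        "upper": estimate + margin,
    }


# Made-up numbers: an estimate of 50,000 with a standard error of 2,500.
summary = summarize_estimate(50000.0, 2500.0)
```

A smaller subgroup with the same standard error would show a larger CV, which is the sense in which estimates for small areas or detailed cross-tabulations have more variability.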

Thursday, January 3, 2019

Social Policy Simulation Database and Model (SPSD/M) Release

We are pleased to inform you that the following products are now available.
Social Policy Simulation Database and Model (SPSD/M) versions 23.0, 24.0, 25.0 and 27.0