Monday, February 29, 2016

Canadian Community Health Survey questions


Is the addition of these questions permanent (i.e., will they be kept in later versions of the CCHS, or are they only being introduced for a testing phase)? Are these questions asked of everyone or only of a selected few in the population? Also, are there any references I can consult about the addition of these questions? Are there any documents I can consult about the latest qualitative cognitive testing of CCHS items?
Here are a few of these questions: 
During your lifetime, have you had sex with...?  
1: Males only  
2: Females only 
3: Both males and females 
8: RF 
9: DK 
In the past 12 months, have you had sex with a female? 
1: Yes 
2: No 
8: RF 
9: DK 
In the past 12 months, have you had sex with a male?
1: Yes 
2: No 
8: RF
9: DK


The SXB questions that changed in 2015 were part of a full content redesign. The entire survey was redesigned in 2015 in terms of content, collection, sampling and weighting. So the first point I want to make is that we are taking the stance that 2015 results should not be compared to 2014. As for the questions themselves, they are permanent unless we find some major issues with them in collection. Having been through a little more than a full year with these questions, the only changes that may come about for SXB will be in the categories of some of the mark-all questions. The rest will stay the same. These questions will be available as theme content in 2015-2016 and in 2019-2020. They may be selected as optional content in 2017-2018, but there is no guarantee (it depends on the analytical needs of the provinces).

As for SDC_Q7B / SDC_Q035, it has been in the survey for many years without being changed and has remained the same for 2015. It has never been flagged as an issue in QT, so there isn’t much documentation on it.

Open Access Data


I've been asked to identify open access data related to the prairie provinces to be used by groups who will be participating in a digital humanities hackathon. I don't know where to start. I think more is needed than population and education numbers, which were two variables suggested to me.


Statistics Canada Data:

For a hackathon, it might be good to provide a link to Statistics Canada's summary tables, because they give an indication of what kind of data might be available for a given level of geography and they usually link back to CANSIM tables. Sometimes it's good to provide a smaller list of varied data as a jumping-off point.

Here are the tables by province, and the tables by metropolitan area.

GIS and Other open data:

There is some data available under the open government license:

There are some demographic data sets if that’s what you’re looking for:


Geology – GIS – Alberta Geological Survey

Misc. Data – Various – Alberta Government

City of Banff – GIS and Data

City of Calgary – GIS and Data

Calgary Region – GIS, Data, and Imagery

Grande Prairie – GIS and Data

City of Edmonton – GIS and Data

City of Medicine Hat – Data and KML

City of Red Deer – GIS and Data

Misc. – GIS and Data – Open Alberta


Misc. – GIS – Government of Manitoba

City of Winnipeg – GIS and Data

Geology – GIS and Data – Manitoba Dept. of Mineral Resources


Misc. Data – GIS and Data

City of Regina – GIS and Data


Water Data – Table of Data – Environment Canada

Bathymetry – GIS and Images – Canadian Hydrographic Service

Soils Data – GIS – Agriculture and Agri-Food Canada

Pollution – Data and KMZ – Environment Canada

Wind – GIS – Canadian Wind Energy Wind Atlas

Misc. “Basemap” info (Roads, landcover, names, administrative boundaries, imagery) – Geobase (now closed but data still remains accessible)

Misc. Data – GIS and Map data – Natural Resources Canada

Census Boundaries – GIS – Statistics Canada

There are quite a few open GIS and data files at these links:

Also, there might be something of interest in GeoGratis.

Don’t forget non-Canadian sources. NASA has quite a few data sets which may be useful in a humanities context:

Returning to Canada: Environment Canada has historical climate data:

The Bank of Canada also has some statistics by province:

Thursday, February 25, 2016

Municipal finances data


I have been looking at CANSIM Table 385-0024. This table includes a breakdown of municipal expenditures according to different sectors (e.g. protection of persons, transportation, health, social services, education, recreation and culture, etc.).

However, this table comes from the older Financial Management System (FMS), which appears to have been discontinued in 2008 and replaced by the Canadian Government Financial Statistics (CGFS). Within the CGFS, I have found data on municipal expenditures and revenues in CANSIM Table 385-0037, but the municipal expenditures are not broken down by sector as they were in the previous Financial Management System. Furthermore, the revenue breakdowns don't seem to include intergovernmental transfers in the same way as they used to.

I emailed STC to ask whether there is another data table that contains a sectoral breakdown of municipal expenditures such as existed prior to 2009. They responded that no other data are available to the public, but that other data might be available via the DLI.


The most current source of local government data with a functional classification is our COFOG series, which can be found in table 385-0040. The data series runs from 2008-09 to 2013-14. It will be updated on March 30, adding one more year, and will be released on a consolidated basis.

Wednesday, February 24, 2016

Input Output - National Multipliers


It seems like there haven't been any national and provincial multipliers (15F0046X) released in a while. Is there a release date planned?


The last release of the Input-Output multipliers was for 2010. We were not successful in releasing them for 2011 due to a shortage of resources. We hope to release the 2012 version later this year.

Thursday, February 18, 2016

NHS and 2011 census -- common questions


I was asked a question about the differences between the 2011 census and previous censuses with respect to questions on language. The major difference between the 2011 and 2006 census on language was that the language of work question was relegated to the NHS.

1. How were the responses to questions that appeared in both the 2011 Census and the NHS handled?
a) Were the NHS responses ignored for common variables?
b) Why were they asked twice in any case, since the census was still compulsory?
c) If they were asked twice, what happened if the responses were different between the two surveys?
d) For NHS datasets that include variables that appeared in both surveys (e.g., age groups), were the NHS responses used, or the census responses?
2. Are NHS response rates for individual questions available, as opposed to the global response rates for particular geographies?


1. How were the responses to questions that appeared in both the 2011 Census and the NHS handled?

(a) Were the NHS responses ignored for common variables?

When responses for both questionnaires were processed, there was a harmonization procedure applied.

When the household was the same for both the Census and NHS, some rules were put in place to compare the number of fields responded for the common variables (i.e. demography and languages).

- First step: demography variables were compared. If the number of fields responded on the Census was better than the NHS, the Census responses were kept for both the Census and NHS (i.e. Census demography responses were copied to NHS, so the original NHS responses were not retained). Otherwise, original responses were kept for both questionnaires.

- Second step: if responses to demography variables were copied to NHS then the number of fields responded for language variables on the Census and NHS were compared. If the number of fields responded was higher on the Census than the NHS, the Census language responses were kept for both the Census and NHS (i.e. Census language responses were copied to NHS, so the original NHS responses were not retained). Otherwise, original responses for languages were kept for both questionnaires. If original demography responses were kept on both questionnaires, then original language responses were also kept.

Responses were copied from the Census to the NHS for very few questionnaires. About 5.9% of persons on the NHS database had their demography responses modified and 1.2% had their language responses modified.

When the households were not the same for both the Census and NHS, the original responses were kept for both questionnaires.
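
The rules in answer (a) can be sketched in Python as follows. This is only an illustration under stated assumptions, not Statistics Canada's actual processing code: the `Questionnaire` class, its field names and the `harmonize` function are hypothetical, "the household was the same" is modelled as matching household identifiers, and "better" in the first step is read as "a higher number of fields responded".

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical representation: field name -> answer, None when unanswered.
Responses = Dict[str, Optional[str]]


@dataclass
class Questionnaire:
    household_id: str      # assumption: used to decide "same household"
    demography: Responses
    language: Responses


def answered(fields: Responses) -> int:
    """Count the fields that received a response."""
    return sum(1 for value in fields.values() if value is not None)


def harmonize(census: Questionnaire, nhs: Questionnaire) -> Questionnaire:
    """Return the NHS record after harmonization.

    The Census record is never modified; only the NHS side can receive
    copies of Census responses.
    """
    # Different households: original responses are kept on both sides.
    if census.household_id != nhs.household_id:
        return nhs

    # First step: compare the number of demography fields responded.
    if answered(census.demography) <= answered(nhs.demography):
        # Original demography kept, so original language is kept as well.
        return nhs

    # Census demography responses are copied to the NHS.
    demography = dict(census.demography)

    # Second step: reached only when demography was copied.
    if answered(census.language) > answered(nhs.language):
        language = dict(census.language)
    else:
        language = dict(nhs.language)

    return Questionnaire(nhs.household_id, demography, language)
```

For example, if the Census record has more demography fields answered than the NHS record but fewer language fields, the sketch returns an NHS record with the Census demography responses and the original NHS language responses.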

(b) Why were they asked twice in any case, since the census was still compulsory?

The decision was made to conduct the voluntary NHS during the same period as the census in order to take advantage of census resources and infrastructure, such as collection management systems and employees. Because the decisions to change the 2011 program came late, the questions planned for the long form, including the language questions, remained unchanged for logistical and methodological reasons. Persons selected for the NHS who responded to the census on the Internet were given the opportunity to complete the NHS online immediately after finishing the census questionnaire. When they continued on to the NHS, responses to common variables were automatically copied over and were therefore the same for both surveys. When they did not continue on to the NHS right away, they were offered another chance to respond at a later time; if they agreed, a second set of responses was obtained for the common variables. Other persons selected for the NHS sample chose to respond to the census on paper. They were later contacted again and offered the chance to complete the NHS; if they agreed, a second set of responses was obtained for the common variables.

(c) If they were asked twice, what happened if the responses were
different between the two surveys?

Please see answer to question (a).

(d) For NHS datasets that include variables that appeared in both
surveys (e.g., age groups), were the NHS responses used, or the
census responses?

Please see answer to question (a).

Regarding response rates:

Response rates for individual questions are NOT available.
Imputation rates for individual questions are available in the various Census / NHS reference guides at the national, provincial and territorial levels. They are not available for lower levels of geography.
The imputation rate is the proportion of respondents who did not answer a given question or whose response is deemed invalid and for which a value was imputed. Imputation improves data quality by reducing the gaps caused by non-response.
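
As a small worked example of that definition, the sketch below computes an imputation rate for one question from a list of per-respondent statuses. The status labels and the `imputation_rate` function are hypothetical, and the sketch assumes that every missing or invalid response received an imputed value.

```python
from typing import Sequence

# Hypothetical status labels for one question, one entry per respondent.
VALID = "valid"
MISSING = "missing"
INVALID = "invalid"


def imputation_rate(statuses: Sequence[str]) -> float:
    """Share of respondents whose value for the question had to be imputed."""
    if not statuses:
        raise ValueError("at least one respondent is required")
    # Assumption: every missing or invalid response was imputed.
    imputed = sum(1 for status in statuses if status in (MISSING, INVALID))
    return imputed / len(statuses)


statuses = [VALID, MISSING, VALID, INVALID, VALID, VALID, VALID, VALID]
print(f"{imputation_rate(statuses):.1%}")  # 25.0%
```

Here two of the eight respondents have a missing or invalid answer, so the imputation rate for the question is 25%.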

2011 NHS Reference guide for Language questions

Cost to health care system for the geriatric population


I am looking for data on the costs to the Canadian health care system for the following conditions in the geriatric population:

-restraints (physical or chemical)
-responsive behaviours associated with dementia OR psychogeriatrics

I have found a few things on the CIHI website that address health care costs but was wondering if there is anything at StatCan that I have missed.


Thanks for the question about the continuing care data at CIHI. Unfortunately, it is not possible to attribute costs to those conditions, given that people can have several of them and, in long-term care, there is no most-responsible reason for being in home and continuing care (as there would be in acute care).

Tuesday, February 16, 2016

Historical aboriginal justice statistics


A researcher here is looking for statistics on Aboriginal offenders from 1900 to 1970. Specifically, she wants the number of Aboriginal offenders, particularly for the years when the numbers started to grow. She believes they were under-reported in earlier years, but she is not sure when they really started to increase. She looked at Juristat, but it only goes back to the mid-1970s. Would you happen to know of any sources that go back earlier than that?

This Juristat report refers to the lack of data on this topic.


The annual reports of the Superintendent of Penitentiaries and Commissioner of Penitentiaries contain some (minimal) stats on Aboriginal inmates.

The reports of the Superintendent from 1919 to 1945-46 and the reports of the Commissioner from 1947-48 to 1967-68 are available online in PDF via the Public Safety Canada Library Catalogue. Hopefully these are persistent links:

I looked at only 3 or 4 reports. In the 1920s and '30s the statistical groups were "White", "Coloured", "Indian" and "Mongolian".

In 1959-60 the Commissioner's report gave numbers for male (table 24) and female (table 57) "North American Indians".

Statistics Canada does not have that data. The best it has is counts of Aboriginal people accused of homicide, but only as of 2014.

Thursday, February 11, 2016

Municipal taxes by province?


Are the data in CANSIM Table 380-0080 available by province, rather than only at the national level of geography? The researchers want a breakdown of municipal tax measures for each province so they can see where the provinces differ.


CANSIM Table 385-0037 provides the statement of operations and balance sheet for municipalities and other local public administrations across provinces and territories. The client can click the Add/Remove tab in the table to manipulate the data as they wish.

Here is a link to the table:

Tuesday, February 9, 2016

General Social Survey Cycle 27


There are two different GSS Cycle 27 surveys. I am just wondering why.


One is Social Identity and one is Giving, Volunteering and Participating. They are in the same cycle because they were collected at the same time and both surveys share some of the same information. In the future, this may or may not happen.

Thursday, February 4, 2016

NHS data on age group X highest certificate at Census Tract level?


I’m looking for data from the 2011 National Household Survey which cross-tabulates age with education level at the Census Tract level.

I found the following data table (99-012-X2011055) on the Statistics Canada website, which provides the data at the census subdivision level but not at the CT level. The same table was on the DLI’s EFT site.
Highest Certificate, Diploma or Degree (7), Age Groups (8B), Major Field of Study - Classification of Instructional Programs (CIP) 2011 (14), Labour Force Status (8), Attendance at School (3) and Sex (3) for the Population Aged 15 Years and Over, in Private Households of Canada, Provinces, Territories, Census Divisions and Census Subdivisions, 2011 National Household Survey
Have I overlooked something, or is the data not available publicly or through the DLI?


There is the following table at the CT level for Highest Certificate, Diploma or Degree but it does not cross-tabulate with age.

Profile - Immigration and Ethnocultural Diversity, Aboriginal Peoples, Labour and Education, and Mobility and Migration for Census Metropolitan Areas, Tracted Census Agglomerations and Census Tracts, National Household Survey, 2011

There is also the NHS Profile, but again, the only age breakdown it offers for highest certificate, diploma or degree is the total population aged 15 years and over and the total population aged 25 to 64 years.

Downloadable at the CT level:

A cross-tabulation of age by highest certificate at the CT level would need to be requested as a custom tabulation through the closest regional office.

Wednesday, February 3, 2016

Breast cancer surgeries in Canada


I have a student who is looking for detailed data on re-operation rates in patients who have undergone lumpectomies across Canada.


Regarding your question, data on breast cancer surgeries are most likely available in the Discharge Abstract Database. If the student wants 100% coverage (rather than the 10% randomized sample in the DLI), they should submit a data request directly to CIHI. If the data will be used towards a graduate degree, they may be eligible to receive the data under the Graduate Student Data Access Program (GSDAP).

Link to data request form:

Information regarding the GSDAP:

Tuesday, February 2, 2016

University and College Academic Staff System (UCASS)


Can anyone suggest where university statistics (all kinds) might show up now that UCASS is history?


Postsecondary enrolments by institution type, registration status, province and sex
(Both sexes)

Postsecondary Student Information System (PSIS)

Monday, February 1, 2016

License required for the Human Dimensions Open Data Challenge?


SSHRC, Compute Canada and a few other organizations are partnering to encourage researchers in the social sciences to participate in the Human Dimensions Open Data Challenge. Their definition of open data is: "Open data is the practice of making machine-readable data freely available to anyone to develop all kinds of new and useful solutions, products, and applications that surpass the initial value of the original data."

In terms of encouraging and enabling participation in this challenge, what (if any) restrictions or conditions would apply to participants using DLI data? I guess I'm really looking for clarification on the Open License.

Please see: DLI licensing and Statistics Canada's Open Data Licence.

“
- They can use the PUMFs for statistical and research purposes but they cannot share the data files with non-DLI members.
- Postal code information may not be used for contractual or income-generating activity, and cannot be redistributed outside the DLI.
- The data from CIHI cannot be shared with anyone outside DLI institutions nor can it be used for commercial purposes.
- Browse through examples of prior licensing questions.
”
May I suggest recommending CANSIM, census profiles and summary tables for the Human Dimensions Open Data Challenge.