Tuesday, October 30, 2018

Postsecondary Student Information System - Information on Early Childhood Education

Question:
Background: I have had a request for the Postsecondary Student Information System (PSIS).

Question: Does the Postsecondary Student Information System (PSIS) include national data on ECE (Early Childhood Education) programs and student numbers in such programs?

I have found information relating to early childhood educators and assistants in Statistics Canada Data Table 37-10-0128-01, Number and percentage distribution of certificates granted by registered apprentices in apprenticeship program by major trade group and sex for Early childhood educators and assistants. However, the data is only available for Ontario. The source of this data was the Registered Apprenticeship Information System. See Table 37-10-0118-01, Number and percentage distribution of apprenticeship program registrations by major trade group and sex.

As a result, I’m starting to question whether the PSIS even includes ECE data.

The access issue relates to a problem that I encountered trying to access the PSIS via WDS https://dli-idd.statcan.gc.ca/wds/ReportFolders/ReportFolders.aspx?IF_ActivePath=P,17351,20239&IF_Language=eng

When I clicked on the above link which I found on https://www.statcan.gc.ca/eng/dli/dli-products it generated [an] error message.

Answer:
I will follow up with subject matter regarding the ECE numbers in the PSIS data. As for the link through the WDS, unfortunately the Education Division asked us a few months ago to remove all of the PSIS information from the WDS due to some inaccuracies in the data. This would account for your access issues.

Here is the response from subject matter:

"PSIS data covers all college and university programs and, therefore, would include any program that could result in college or university credentials that would include ECE material.  For example, Classification of Instructional Programs (CIP) 13.1210 covers "Early childhood education and teaching" and cites Bachelor and Masters degrees as potential credentials associated with it.  CIP is one of the key variables included in PSIS.  (Unfortunately, the CIP-6 (digit) variable is assigned to a program based on the institution's program description and, therefore, may not be as accurately tied to the CIP-6 value as researchers might expect.  Also, CIP-6 enrolment and graduate numbers are often very small at the individual institution level, resulting in inadequate data for detailed analysis.  In general, we highly recommend that researchers use CIP-2 or CIP-4 values instead.)"

The PSIS file can be found in the EFT at /MAD_DLI_IDD_DAM/Root/other_autres/5017_PSIS_SIEP/data
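
To illustrate the recommendation above to work at the CIP-2 or CIP-4 level rather than CIP-6, here is a minimal Python sketch that rolls CIP-6 counts up to a coarser level. It assumes codes are written as in the response (for example "13.1210"); the function names and the enrolment counts are made up for illustration and do not come from PSIS.

```python
from collections import defaultdict

def cip_prefix(cip6, digits):
    """Truncate a CIP-6 code such as '13.1210' to its 2- or 4-digit level."""
    series, _, detail = cip6.partition(".")
    if digits == 2:
        return series
    if digits == 4:
        return f"{series}.{detail[:2]}"
    raise ValueError("digits must be 2 or 4")

def roll_up(counts, digits):
    """Sum counts keyed by CIP-6 code into CIP-2 or CIP-4 totals."""
    totals = defaultdict(int)
    for cip6, n in counts.items():
        totals[cip_prefix(cip6, digits)] += n
    return dict(totals)

# Hypothetical counts, for illustration only.
example = {"13.1210": 12, "13.1209": 7, "13.0101": 30}
print(roll_up(example, 4))  # {'13.12': 19, '13.01': 30}
print(roll_up(example, 2))  # {'13': 49}
```

Note that rolling up trades specificity for larger cell sizes: a CIP-4 total such as 13.12 will include programs other than ECE.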

Follow Up Question:
The PSIS on the EFT doesn't appear to have any information on Early Childhood Education CIP Classification 13.1210 (more specifically CIP 13.1015 ECE). I have identified Table 37-10-0012-01 (formerly CANSIM 477-0030) as a potential data source. When I tried to customize the table using the Add/Remove data tab and narrow the results by credential type, i.e., Certificate and Diploma, I do get a number; however, many institutions are offering degree programs in ECE. Once I included degree programs, I got all types of education degree programs, not just the ECE programs.

Would it require a custom tab to identify only the ECE programs (certificate, diploma and degree)? Am I missing a data source that provides this information?

Answer:

Subject matter has confirmed that there would unfortunately be a cost associated with identifying only the ECE programs (i.e., a custom tab), at a minimum of $150.

Looking for "self-employed" variable for the General Social Survey, 2013 [Canada]: Cycle 27, Social Identity [version 2]

Question:
A researcher is looking for a self-employed variable in the PUMF for the GSS 27 Social Identity survey.  It’s there for the previous cycle, cycle 22, but we can’t find it for cycle 27.  Are we missing something …?

We also wondered …

  • When will there be a new cycle on Social Identity?  Is this documented anywhere?  The researcher would like to follow this.  We noticed the IMDB page still indicates 2013 as the latest year.
  • Would it be possible to see a release date for the next GSS on https://www.statcan.gc.ca/eng/dli/prod_date?

Answer:

Here is the response I received from subject matter:

“You are correct, cycle 27 does not have a self-employed variable in the PUMF.  Subject matter will get back to us if they can find why it was left out.

The next iteration of Social Identity will go in the field in May 2020 and the first release (analytical) is expected in the Fall of 2021, with the PUMF following likely in the Spring of 2022.”

Historical Census population data by Census Tracts for 1941

Question:
A geography researcher is looking for historical Census population data by Census Tracts for 1941.

We have previously located:

1951                    https://archive.org/details/1951981951M5NO51953engfra
1956 &1961        https://archive.org/details/1961955281963engfra
1961                    http://publications.gc.ca/collections/collection_2017/statcan/CS95-541-1961.pdf

The researcher believes there must be Census population data by Census Tracts for 1941 based on the “Census Years” documentation about Census Tracts  at http://www12.statcan.gc.ca/census-recensement/2011/ref/dict/geo013-eng.cfm which indicates there were Census Tracts in 1941. 

Would it be possible to please request some help finding 1941 population data by Census Tracts, or to ask to what this documentation about Census Years is referring? 

Answer:
Here is the response from subject matter:

“The census tract program did start in 1941, however census tracts were called “social areas” when it started, and it seems that only Vancouver and Winnipeg were tracted at that time. Here is the link to that publication: https://archive.org/details/1941981941M32A161941ef

Are you looking for data from a specific area? I noticed the two first links are for Ottawa. This city was only first tracted in 1951 according to our publication “Census tract programme: A review 1941-1981”. I did find data by municipal ward for 1941, if it can be helpful at all: https://archive.org/details/1941981941P7NOA151941ef”

Monday, October 29, 2018

Canadian Work, Stress, and Health Study (CANWSH)

Question:
I am looking for  The Canadian Work, Stress, and Health Study (CANWSH).

I was told that this was a Statistics Canada survey but I haven’t been able to find it.

Any ideas?

Answer:
I don’t think it is [a Stats Canada Survey]. According to the article notes for “Control in the face of uncertainty” [http://journals.sagepub.com/doi/10.1177/0190272514546698], the “Canadian Work, Stress and Health Study [is] a national panel survey of Canadian labor force participants funded by the Canadian Institutes of Health Research to study work stress and its long-term consequences for workers and their families.”

It appears they ran their own survey. I would contact the researchers directly.

Thursday, October 25, 2018

Individual Annual Income Data

Question:
I have a researcher who is seeking to update data for a study that he has already completed. In the past he used data from the Statistics Canada Census. The data was broken down by the following age groups:


0 to 4 years
5 to 9 years
10 to 14
15 to 19
20 to 24
25 to 29
30 to 34
35 to 39
40 to 44
45 to 49
50 to 54
55 to 59
60 to 64
65 to 69
70 to 74
75 to 79
80 to 84
85+

He is looking for a similar age breakdown to make his previous data comparable to the current data.

He needs individual annual income data, which he refers to as total income reported on income tax forms. I believe he is referring to Tax Filer data from CRA Income Tax Returns available from Statistics Canada.

The geography for which he is seeking income data is B.C. and, if possible, the Okanagan region. He indicated that in the past he was able to access the data by CMA.

He has looked at Table 98-400-X2016110 https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/dt-td/Rp-eng.cfm?LANG=E&APATH=3&DETAIL=0&DIM=0&FL=A&FREE=0&GC=0&GID=0&GK=0&GRP=1&PID=110242&PRID=10&PTYPE=109445&S=0&SHOWALL=0&SUB=0&Temporal=2016&THEME=119&VID=0&VNAMEE=&VNAMEF=

He didn't find it useful because it only provided 7 age groups, which is not enough for the comparative purposes of his research.

15 to 24 years
25 to 34 years
35 to 44 years
45 to 54 years
55 to 64 years
65 to 74 years
75 years and over
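
For reference, going the other way is straightforward: counts held in five-year age groups can be collapsed into the seven broader groups listed above, though the reverse (splitting ten-year groups into five-year groups) is not possible from the published table. The Python sketch below shows the collapse under two assumptions: the input is keyed by the lower bound of each five-year group, and the values are counts. It does not apply to medians or averages, which cannot simply be summed. The function names and the sample counts are hypothetical.

```python
def seven_group_label(lower):
    """Map the lower bound of a five-year age group to one of the seven
    broader groups. Returns None for groups under 15, which the
    seven-group table does not cover."""
    if lower < 15:
        return None
    if lower >= 75:
        return "75 years and over"
    start = 15 + ((lower - 15) // 10) * 10
    return f"{start} to {start + 9} years"

def collapse(counts_by_lower_bound):
    """Sum five-year counts into the seven broader age groups."""
    out = {}
    for lower, n in counts_by_lower_bound.items():
        label = seven_group_label(lower)
        if label is not None:
            out[label] = out.get(label, 0) + n
    return out

# Hypothetical counts, for illustration only.
sample = {0: 5, 15: 10, 20: 20, 25: 8, 85: 3}
print(collapse(sample))
```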

He indicated I could also use annual income data for the age groups of the primary household maintainer, but I can’t find this data either. Age groups from the 2011 census were:

Under 25 years
25 to 29 years
30 to 34 years
35 to 39 years
40 to 44 years
45 to 49 years
50 to 54 years
55 to 59 years
60 to 64 years
65 to 69 years
70 to 74 years
75 years and over

I sent him to Canada Table Tax filers and dependents 15 years of age and over with labour income https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=1110002301

I have told him that the best method to get exactly what he is looking for is to pay for a custom report on Income of Individuals (13C0015) from Statistics Canada. See https://www150.statcan.gc.ca/n1/en/catalogue/13C0015 for details on how to order the report. There is a similar report for Households (13C0016); see https://www150.statcan.gc.ca/n1/en/catalogue/13C0016

I indicated that the report characterizes the Canadian population by income and demographics. Data may be requested by gender for marital status, age groups, counts by single year of age, sources of income, income distribution by age group, taxes paid, selected deductions and benefits, median employment income, median total income and median after-tax income, plus national and provincial indices of median total income. The statistics are derived primarily from the annual tax file provided by the Canada Revenue Agency.

Data for some geographic areas are available starting from 1986. The latest data (2016) can be requested for Canada, provinces and territories, federal electoral districts, economic regions, census divisions, census metropolitan areas, census agglomerations, census tracts and certain postal geographies.

The cost of each custom product is based on the time required to produce it according to the client's requirements. The hourly rate is $75.52. 

He didn’t seem too keen on the idea of paying for a custom report.

Do you have any additional ideas where he can access the needed data?

Answer from Subject Matter:
Did the researcher mention for which Census year he previously used the Statistics Canada Census data? Also, in the past did the user receive a custom table or did he use one of the standard products from the Census Program page? Finally, we received the following response back from subject matter: 

“I checked Census 2006 and 2011 and Okanagan was not a CMA. We had ‘Okanagan-Similkameen’, ‘Central Okanagan’ and ‘North Okanagan’ as Census Divisions.

Also for Census 2016 Okanagan is not a CMA but ‘Okanagan-Similkameen’, ‘Central Okanagan’ and ‘North Okanagan’ are the Census Divisions.


Here are some reasons why he can’t access the data he is looking for:

1)      For the 2016 Census with the ‘total income groups’ variable, all standard tables we have are for the geographic hierarchy ‘Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations’. So he won’t find the geography he is looking for.

2)      Individual income information was compiled for the population aged 15 years and over (Income Reference Guide, Census of Population, 2016), so there is no age group below 15 years.

3)      I checked all the standard tables for Census years 2011, 2006 and 2001 with ‘total income’ and ‘Age groups’. Please see the maximum grouping in cross tabulation below:

  • For 2001: Total Income Groups (22) in Constant (2000) Dollars, Sex (3), Age Groups (9A) and Marital Status (6) for Population 15 Years and Over, for Canada, Provinces and Territories, 1995 and 2000 - 20% Sample Data - Cat. No. 97F0020XCB2001040.

          2001 Census Data Products

  • For 2006: Total Income Groups (23) in Constant (2005) Dollars, Age Groups (7A), Highest Certificate, Diploma or Degree (5) and Sex (3) for the Population 15 Years and Over of Canada, Provinces, Territories, Census Metropolitan Areas and Census Agglomerations, 2000 and 2005 - 20% Sample Data
         2006 Census of Canada: Topic-based tabulations

  • For 2011: Selected Demographic, Sociocultural and Labour Characteristics (168), Income Statistics in 2010 (3B) and Total Income Groups (7) for the Population Aged 15 Years and Over in Private Households of Canada, Provinces, Territories and Census Metropolitan Areas, 2011 National Household Survey (table contains Age group (9))

          2011 National Household Survey: NHS Data tables

Based on his requirements a custom table is required.

The cost of each custom product is based on the time required to produce it according to the client's requirements. The hourly rate is $77.03 (http://icn-rci.statcan.ca/30/30a/30a_007-eng.html).

There is a base consultation fee of $519.95 for a minimal effort custom data request. The price is adjusted based on the specifications and degree of difficulty requested by the client.”
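
As a rough aid for budgeting, the figures quoted above can be turned into a small estimate. The Python sketch below assumes that the charge is hours multiplied by the hourly rate and is never less than the base fee. That combination rule is an assumption made for illustration; the response only states the two figures and that the price is adjusted to the request, so an actual quote from Statistics Canada may differ.

```python
HOURLY_RATE = 77.03  # hourly rate quoted in the response
BASE_FEE = 519.95    # base consultation fee quoted in the response

def estimate_custom_table_cost(hours):
    """Rough estimate only. Assumption: hours x rate, with the base fee
    acting as a floor. Not an official pricing formula."""
    return round(max(BASE_FEE, hours * HOURLY_RATE), 2)

print(estimate_custom_table_cost(2))   # falls back to the base fee
print(estimate_custom_table_cost(10))  # ten hours at the hourly rate
```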

Follow-up Question:
Further to this email correspondence, the researcher I’m working with used the following tables in his previous research:

Statistics Canada Data

2006 Census Tables
97-564-XCB2006006 (Income by industry)
97-559-XCB2006009 (Employment by industry)
94-579-XCB2006004 (Population by region)

2011 Census Tables
99-014-X2011044 (Income by industry)
99-014-X2011028 (household income)
99-012-X2011052 (Employment by industry)

Can’t seem to find the tables for income by age group.


Are there equivalent Tables from the 2016 Census?

Follow-up Answer from Subject Matter:
Statistics Canada Data

2006 Census Tables

97-564-XCB2006006 (Income by industry) AND 97-559-XCB2006009 (Employment by industry)

Industry - North American Industry Classification System (NAICS) 2012 (425), Employment Income Statistics (3), Highest Certificate, Diploma or Degree (7), Aboriginal Identity (9), Work Activity During the Reference Year (4), Age (5A) and Sex (3) for the Population Aged 15 Years and Over Who Worked in 2015 and Reported Employment Income in 2015, in Private Households of Canada, Provinces and Territories and Census Metropolitan Areas, 2016 Census - 25% Sample Data
[https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/dt-td/Rp-eng.cfm?LANG=E&APATH=7&DETAIL=0&DIM=0&FL=I&FREE=0&GC=0&GID=0&GK=0&GRP=1&PID=112128&PRID=10&PTYPE=109445&S=0&SHOWALL=0&SUB=0&Temporal=2016,2017&THEME=0&VID=0&VNAMEE=Industry%20-%20North%20American%20Industry%20Classification%20System%20%28NAICS%29%202012%20%28425%29&VNAMEF=Industrie%20-%20Syst%C3%A8me%20de%20classification%20des%20industries%20de%20l%27Am%C3%A9rique%20du%20Nord%20%28SCIAN%29%202012%20%28425%29]

94-579-XCB2006004 (Population by region)

2016 Census topic: Population and dwelling counts [https://www12.statcan.gc.ca/census-recensement/2016/rt-td/population-eng.cfm]

2011 Census Tables

99-014-X2011044 (Income by industry) AND 99-012-X2011052 (Employment by industry) 
            See first link above

99-014-X2011028 (household income) 
[https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/dt-td/Rp-eng.cfm?LANG=E&APATH=3&DETAIL=0&DIM=0&FL=A&FREE=0&GC=0&GID=0&GK=0&GRP=1&PID=110505&PRID=10&PTYPE=109445&S=0&SHOWALL=0&SUB=0&Temporal=2017&THEME=131&VID=0&VNAMEE=&VNAMEF=]

Household Income Statistics (3A), Structural Type of Dwelling (10) and Household Type Including Census Family Structure (11) for Private Households of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations, 2016 Census - 25% Sample Data

Can’t seem to find the tables for income by age group.

Mother Tongue (10), Income Statistics (17), Highest Certificate, Diploma or Degree (15), Immigrant Status and Period of Immigration (10), Work Activity During the Reference Year (4A) and Sex and Age (15) for the Population Aged 15 Years and Over in Private Households of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations, 2016 Census - 25% Sample Data
[https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/dt-td/Rp-eng.cfm?LANG=E&APATH=7&DETAIL=0&DIM=0&FL=I&FREE=0&GC=0&GID=0&GK=0&GRP=1&PID=111821&PRID=10&PTYPE=109445&S=0&SHOWALL=0&SUB=0&Temporal=2016,2017&THEME=0&VID=0&VNAMEE=Income%20statistics%20%2817%29&VNAMEF=Statistiques%20du%20revenu%20%2817%29]

Wednesday, October 24, 2018

Households and the Environment Data Dictionary

Question:
I have a researcher using the 2015 Households and the Environment Survey (HES), and she’s wondering if there are any more fulsome explanations of some of the “other” responses in certain categories.

The examples she’s given are:

1)    For GDWELCOD on page 12, it states 2 options for dwelling i.e. other and apartment. There is no explanation of "other".

2)    On page 143 of data dictionary: NN_Q08BY: states other for an outdoor activity with no explanation of what other means. Is there anyone I can call to inquire?

Are there any further classifications of “other” available for these variables?

Answer:
GDWELCOD
The “other” category includes the following dwelling types:

  • Single-detached
  • Double
  • Row or Terrace
  • Duplex
  • Mobile home
  • “Other”

NN_Q08BY

The outdoor activities question was asked as an open-ended question and responses coded post-collection. The “other” category for the outdoor activities is a catch-all for any infrequently-reported outdoor activities that do not fall into one of the other categories of activities. Unfortunately, there isn’t any additional detail available for the activities in this category.

Disability Benefits

Question:
A PhD student would like to get microdata that would show, in addition to demographic and socio-economic characteristics, individual income received from all forms of disability insurances (either private insurance, employer insurance  or government programs). I could not locate a dataset, at least not a PUMF. The Canadian Income Survey has a variable that shows federal and Quebec pension plans revenues including disability benefits, but it is one aggregate figure, there is not a separate amount for disability-related income. In any case that would not show  private or employer disability insurance benefits.

Am I missing a data source? If not, would there be a Masterfile at the RDC (maybe the Longitudinal Administrative Databank) that would have that type of information?

Answer from Subject Matter:
“The LAD does not have any information on private or employer disability insurance benefits.

We simply report the total earnings on employees’ T4, some of which may be related to disability.

The LAD does have three disability-related income variables:

  • CPP/QPP disability benefits included in income (DSBCQ)
  • Registered disability savings plan (RDSP_)
  • Workers’ compensation payments (WKCPY)

Another factor to keep in mind is that lump-sum insurance settlements for disability claims would not be conceptually treated as income but rather a wealth transfer.

There is no tax microdata available in RDCs; since T1FF is not available in RDCs, it is not possible to use it for this project.”

Follow-up Question:
Would it be possible to get a complete description of the 3 variables listed? I do not believe that there is a data dictionary publicly available for the LAD. I want to make sure that the variables actually provide an income figure and that it is specific to disability. For instance, I want to confirm that the DSBCQ variable only lists the disability-related part of the CPP/QPP benefits. And as for the RDSP, are we talking about income or contributions to the plan?

Answer from DLI List:
Take a look at the following link for the LAD, 2015 variable definitions (data dictionary).

https://www150.statcan.gc.ca/n1/pub/12-585-x/2017000/def/def_a-eng.htm

You can find the LAD three disability-related income variables:

  • CPP/QPP disability benefits included in income (DSBCQ)  - under C, almost at the bottom of the page
  • Registered disability savings plan (RDSP_) - under R
  • Workers’ compensation payments (WKCPY) - W, at the top of the page

Answer from Subject Matter:

“Here is a link to the LAD documentation in pdf format in order to obtain a description of CPP/QPP disability benefits included in income (DSBCQ), Registered disability savings plan (RDSP), and Workers’ compensation payments (WKCPY): https://www150.statcan.gc.ca/n1/en/pub/12-585-x/12-585-x2017000-eng.pdf?st=ZSSjNqKF

The above variables actually provide an income figure and are specific to disability; we confirm that the DSBCQ variable only lists the disability-related part of the CPP/QPP benefits; and that RDSP refers to income or contributions to the plan (see documentation).

Also:

The DSBCQ variable is taken from line 152 of the T1 and only shows the disability-related part of the CPP/QPP benefits.

The RDSP variable is taken from line 125 of the T1 and refers to income from an RDSP.”

Follow-up Questions:
I had a look at the user guide for the Longitudinal and International Study of Adults (LISA). It looks like this survey would be more suited to my PhD student, as it has richer demographic, labour and educational information as well as a module on disability.

According to the user guide, the income information comes from the T1FF and T4 files, but there is no detailed description of the income (or other) variables and the data dictionary does not seem to be readily available.

I would like to know if some income variables include disability-related figures, either variables similar to those available in the LAD or other ones, as long as they are specific to disability benefits. Also, it would be great if the data dictionary was available.

Follow-up Answer:
“As per our LISA Section:

LISA would be a great data source for the client’s student’s needs. In addition to the module on self-reported disability, LISA links to administrative data, including the T1FF. As a result, we would have respondent-level data on disability-related benefits (including DSBCQ, RDSP, and WKCPY that were mentioned in the email below). Unfortunately, we do not have information on private or employer disability insurance benefits.

The client should note that LISA is a longitudinal study, and therefore, the data are most useful if users are looking to perform longitudinal analyses. The weighted output represents the original LISA population of 2012 (Canadians aged 15 years and older living in the 10 provinces).

The client should also be aware that the latest LISA data (Wave 3; 2016) is set to be released before the end of the calendar year. Currently Wave 1 (2012) and Wave 2 (2014) are available to data users.

Finally, the nofreq codebooks (data dictionaries) that the client may find useful are attached. One is for the main LISA survey (Wave 2; 2014), and the other is our T1FF codebook, which outlines the variables that are provided through data linkage.”

Monday, October 15, 2018

Individual income by province for 1925 and 1930

Question:
A researcher is looking for average  individual income by province for 1925 and 1930.

I have consulted Historical Statistics of Canada but the only table that seemed to give that information (table E49: Average weekly wages and salaries, industrial composite, by province, 1939 to 1975) only goes back to 1939.  

The main source of income data, according to Historical Statistics, seems to be  the Survey of employment, payrolls and man-hours but that survey only started in 1941.

 I found other tables (still in Historical Statistics of Canada) that show the aggregate amount of salary and related income by province (E1: Wages, salaries and supplementary labour income, by province, 1926 to 1975) but that does not help with average individual income.

So other than census data (that would only cover 1921 and 1931) , is there any source of annual salary data for those years?

Answer from DLI:
This is an interesting question. The monthly microfiche series 72-002a Employment, Earnings and Hours = Emploi, gains et durée du travail (Dec. 1922- ) may be useful.   We have these in microfiches here and I have a weird feeling that I used these a long time ago.  I can’t find the digitized version at publications.gc.ca for some reason.

While I found tables in the Canada Year Book of total amount of individual income assessed by occupations (including number of individual taxpayers), e.g., page 877, Canada Year Book 1933 …there was nothing comparable by provinces (only a total of corporate and individual income by province).

Answer from Subject Matter:
Subject matter has let us know that they do not have any other data sources for individual income for 1925 and 1930.  The only other possible source might be the Canada Year Book.

If you’re interested in the Canada Year Book, the collection from 1867 to 1967 is available on the Alberta Mirror EFT site. Files exist in PDF in both English and French. 

Thursday, October 11, 2018

Discussion on Cataloging

Question:
Just checked our library catalogue record for Canadian Business Patterns which now needs to reflect the new title Canadian Business Counts.  So far so good.

However, this reminds me of a long standing problem for me which relates to my getting DLI data records into our catalogue. This is an internal matter but advice from others on how they connect with their technical services area may be helpful.

The U of R catalogue record takes the user to .  This is all good, but it is not the whole story.  It should also direct the user to CHASS (another outdated name?) or Nesstar, and it seems to me that it should also indicate that mediated access is available for obtaining the data from the EFT site. I would like to see all of this spelled out, including a statement as to what the differences are between the data on the EFT and the extraction sites.  Unfortunately, I can't remember.  Does it pertain to the level of geography provided?

Additionally, I know some libraries also point to this:  https://www150.statcan.gc.ca/n1/en/catalogue/61C0025 --  Customized extractions from the Canadian Business Patterns

I think (a little late for ACCOLEDS this year) that we should perhaps have a technical services session covering this topic at a future training event.

Answer from DLI Admin:
I can speak to the difference between the data access methods on the DLI side.

The data on the EFT is the whole DLI collection. Each safe (depending on the license signed by the subscribing institution) gives access to the following products:

1.       MAD_PUMF_FMGD_DAM - Survey PUMFs and metadata
2.       MAD_DLI_IDD_DAM - DLI annual reports, DLI training materials, CD-ROM data products, Geography files, Census files and more
3.       MAD_PCCF_FCCP_DAM - Postal Code Conversion Files (PCCF), Postal Code Conversion Files Plus (PCCF+) and Postal Code Federal Riding Files (PCFRF)
4.       MAD_CIHI_ICIS_DAM - Discharge Abstract Database (DAD) from the Canadian Institute for Health Information (CIHI)
5.       MAD_SPSDM_BDMSPS_DAM - Social Policy Simulation Database and Model files (SPSDM)

More information about the contents of each safe can be found in Section 7: Accessing and Citing DLI Data of the DLI Survival Guide.

The DLI Nesstar site is a web-based data portal where users can search and identify variables of interest on the microdata files and determine whether a PUMF or a master file would best fit their research needs. The DLI’s version of Nesstar houses public use microdata files and the metadata from Research Data Centre (RDC) master files. The server itself can house different forms of data. I believe that ODESI houses more of a variety, such as public-opinion polls.

Everything that is available on the DLI Nesstar site is on the EFT. The DLI Nesstar site is just another means to explore and analyze data through a web browser.


I’ll pass on the idea of a future session on the different technical services available through the community to the rest of the DLI team. Thank you for bringing this to our attention!

Answer from DLI Member:
Perhaps the challenge also stems from the data being available or not available in multiple places, some of which are mediated, others are not, and, THE LACK OF AUTHORITY METADATA (machine-readable) for these data products. This is also compounded by the inconsistency between the public facing content on the STC website, and what is available through the DLI. In some cases, if you do a comparison, data found in the DLI do not match the same product records on the website. There are a whole bunch more problems here which have likely led each institution to do their own thing.

I wholly agree that the metadata problems we encounter for StatCan data are not unique to you. It would be interesting to explore this further, but perhaps, as you say, a whole session could be dedicated to this. At SP and in Ontario we have explored providing MARC records, OAI endpoint connections, and APIs for searching data available in ODESI, but again that is only for ODESI copies of DLI data. API integration into your catalogue is probably a good approach at this point, and with new discovery platforms offering custom API integration this is getting more and more sophisticated.
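
For anyone unfamiliar with what an OAI endpoint connection involves, the sketch below builds a standard OAI-PMH ListRecords request in Python. The verb and the metadataPrefix and set parameters come from the OAI-PMH protocol itself; the base URL and the set name are placeholders invented for illustration and do not refer to any actual ODESI or DLI endpoint.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for harvesting metadata."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"

# Placeholder endpoint and set name, not a real service.
print(list_records_url("https://example.org/oai", set_spec="dli"))
```

A catalogue or discovery layer would fetch that URL and map the returned Dublin Core records into its own index.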

Perhaps even the national research data discovery platform FRDR can play a role in all of this?  Maybe not right now for DLI data, given the duplication….? But regardless we should look to some kind of solution as a group.

Consumer Price Index (CPI) PUMF files

Question:
We have a researcher who is looking for CPI PUMF files. Do they exist? At how small a geographic level can you get CPI values?

More specifically:
"CPI Values: if at all possible, I am looking for microdata on individual foodstuffs used in the construction of CPI measures by Stats Can. If that is not possible, I would be interested in CPI “values” specifically as they pertain to foodstuffs (I suppose for whatever “basket” of foodstuffs Stats Can uses). If foodstuff CPI values are all that is available, I would like to know the steps used in constructing them, as well as which “baskets” of goods were used.
  
Geographies/Geographic “Resolution”: I am looking for individualized values for all upper tier municipalities in Ontario, Canada (I believe Stats Can refers to them as Census Agglomeration Part / Metropolitan Area / Area Part?). 
  
Time: widest possible"

Answer: 
Subject matter has suggested that the researcher access the data from CDER. Data are available for some cities, and the information for food is available there.

For your reference, here is some information about CDER:

Proposal requirements: http://www.statcan.gc.ca/eng/cder/prop and http://www.statcan.gc.ca/eng/cder/price.

Any proposal for submission must have sufficient details, so as to determine the feasibility of the project. This will include a data perspective, costs, and professional merit. Note that CDER only considers research projects that are not descriptive in nature.

CDER is run on a cost-recovery basis; as such, all researchers must be able to cover all project costs, including the cost of having their project proposals peer and committee-reviewed.

The costs of a project depend on a variety of factors such as

1. length of time required to access the CDER facilities
2. whether the data set exists or needs to be developed
3. how much output needs to be reviewed for confidentiality.

This means that there is no standard project cost. Each project will be reviewed by CDER to determine the individual cost. Here are some elements to consider for a simple project:

• whether a project makes use of an existing database
• whether the project can be completed in 3 months (66 days)
• if the expected output of the project would take no more than 2 days for a Statistics Canada analyst to review for confidentiality issues
• if there are project submission fees (see below)
• whether there is a need for housing the researcher(s) and providing them with access to CDER facilities (office workspace, workstation, server, data storage, etc.), which is the largest element affecting the cost

Based on current costs, a simple project such as the one described above would cost approximately $7,200.00.

Wednesday, October 10, 2018

Stats Can Postal Code Conversion and Federal Riding data Files

Question:
Our data team has discovered a rather large discrepancy between the data sets that we received in 2017 and 2018.

In 2017 (25-Aug-2017) we have received the following files:

  * pccfNat_AUG15_fccpNat.txt
  * pcfrfNatFED2013_AUG15_fcpcefNatCEF2013.txt

In 2018 (25-Jun-2018) we have received the following files:

  * pccfNat_fccpNat_062017.txt
  * pcfrf_NatFED2013_062017_fcpcefNatCÉF2013.txt

We have discovered that an unusually large number of postal codes are not getting a match in our system, which prompted us to compare the 2017 and 2018 data files. We found that 16,147 postal codes exist in the 2017 data file but not in the 2018 data file. We spot-checked 74 postal codes on the Canada Post website and received positive confirmation for 50 of them, meaning that in 50 cases the Canada Post website returned a list of addresses linked to the given postal code.

Attached to this message is an Excel file that lists all the postal codes that exist in the 2017 file but not in the 2018 file.

Answer:
We received the following response from subject matter:

“After doing a little analysis with the provided list of postal codes, we found that 10,419 postal codes from your list are retired.  It is important to note that the PCCF based on 2016 geographies does not contain postal codes that were retired before 2016.

Postal codes are intended for the distribution of mail by Canada Post. The files undergo changes on a regular basis. In some instances, postal codes are retired and re-birthed with a different delivery mode type, or the address information for postal codes is changed, which can break the linkage for our geocoding process.

Many postal codes did not go through our geocoding process and therefore did not make it into the PCCF file. With every release, we are working on getting these postal codes back into the product file. The vast majority of these linkages are created in an automated fashion at the dissemination area, dissemination block or block face level geographies.  Records that do not link are output for manual geocoding.  These records, which were previously manually geocoded to the three principal geographies, will now be linked to the census subdivision geography only. They will include only postal codes that do not already appear (already have records) on the main processing table, PCINFO.  These records (postal codes), once linked to the CSD level geography, will then be appended to the PCCF and PCFRF product files.  They will not be written back to PCINFO, as the processing system (PCUS) will attempt to geocode these postal codes with each subsequent month of processing CPC data.

We are working diligently to improve the product with each quarterly release as we continue to improve our production tool.”
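
For readers who want to reproduce the comparison described in the question, the sketch below shows one possible way to list postal codes that appear in an older file but not in a newer one. It assumes that each record is a line of fixed-width text with the six-character postal code in the first six positions; the function names and the sample records are hypothetical, and the record layout should be checked against the reference guide for the release in hand.

```python
def postal_codes(lines):
    """Collect the postal codes found at the start of each record.

    Assumption: the postal code occupies the first six characters of a line.
    """
    codes = set()
    for line in lines:
        code = line[:6].strip().upper()
        if len(code) == 6:  # skip blank or truncated lines
            codes.add(code)
    return codes


def missing_from_newer(older_lines, newer_lines):
    """Return, in sorted order, the codes in the older file that the newer file lacks."""
    return sorted(postal_codes(older_lines) - postal_codes(newer_lines))


# Made-up records standing in for the two releases.
older = ["K1A0B1 rest of record", "M5V2T6 rest of record", "H3Z2Y7 rest of record"]
newer = ["K1A0B1 rest of record", "H3Z2Y7 rest of record"]
print(missing_from_newer(older, newer))
```

With real files, the two lists can be replaced by open file handles, since the functions only iterate over lines. A code that shows up in the result is not necessarily an error: as the response above explains, it may be retired or may not have been geocoded for that release.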

Research & Development and Money Spent in Energy Efficiency Programs

Question:
I have a client who is looking for the following information: 

“I was trying to find at StatsCan if they have the numbers of how much of R&D is being spent in Energy Efficiency programs and what programs, if possible, segregated by province. I found a general expenditures table (https://www150.statcan.gc.ca/n1/daily-quotidien/180828/cg-b001-eng.htm), but I wanted to find something more specific on the provinces and programs as well. Do you think this kind of information is available?”


Answer from DLI:

Good question.  I hope StatCan is gathering this from different sources.  I noticed that the federal government reports on this, but this excludes non-federal government R & D investments.

In case you run into roadblocks, Natural Resources Canada’s Office of Energy Research and Development (OERD) is the Government of Canada's co-ordinator of energy research and development (R&D) activities. [https://www.nrcan.gc.ca/energy/offices-labs/oerd/5711] Please see the reports / publications to the right.  This will exclude other funders. 


I don’t know if this next source is overly narrow, since I think it includes only NRCan as a funder, but it breaks down into technology areas [https://www.nrcan.gc.ca/energy/funding/21146] and by location (province).  On the other hand, I’m not sure there are other targeted energy efficiency R & D programs.


Answer from Subject Matter:

“No – we do not provide any detail on energy-related R&D expenditures at the provincial/territorial level.

Table 27-10-0347-01 does provide a breakdown by area of technology.  This is the lowest level of detail we provide for energy-related R&D expenditure data.”

Friday, October 5, 2018

Commercial Real Estate

Question:
I have a researcher looking for a dataset which gives the value or rent of commercial properties by CMA or Census Subdivision from 2006 to present. An example would be annual average rent of office space per square foot. From Statistics Canada I think that the Commercial Rents Services Price Index may work, but I can only find data for Canada and not for any smaller geographies (e.g. Table 18-10-0065-02). Would it be possible to get this data by CMA or Census Subdivision? Would anyone from the community be able to suggest any alternative data sources?

Answers from DLI List:
I hope the Commercial Rents Services Price Index works for your data question.  If not, we might look at CREA or BOMA [http://bomacanada.ca/] for this.

You might have the researcher contact the Toronto Real Estate Board to see if they can provide some data: http://www.trebhome.com/

I occasionally wander over to this site, to try some of the non-government sources within, for questions such as these:

"Global Property Guide:  Financial Information for Residential Property Buyers"
<https://www.globalpropertyguide.com/North-America/Canada/Useful-Links/Economics,-Statistics,-Property-Price-History;1>

** Can't vouch for the veracity of the hosted sites and/or the comprehensiveness of the list, but it does give lots of places to explore!  **

Answer from Subject Matter:
“The Commercial Rents Services Price Index (CRSPI) measures price changes of commercial rents over time.

It is published at the national level for all buildings combined (Office buildings, Retail buildings, and Industrial buildings and warehouses).

Prices per square foot are not published at the CMA level.

However, to meet your data needs (CMA level price per square foot for office space), please refer to CBRE (www.CBRE.ca, under research centre) or Colliers International (www.collierscanada.com, under research section).”
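
Because the CRSPI is an index, it describes how rents have changed over time rather than what they cost per square foot. The sketch below illustrates, under that reading, how a percentage change is derived from two index readings; the function name and the example index values are made up for illustration and are not CRSPI figures.

```python
def index_percent_change(index_start, index_end):
    """Percentage change in prices implied by two readings of a price index."""
    if index_start <= 0:
        raise ValueError("index_start must be positive")
    return (index_end / index_start - 1.0) * 100.0


# Hypothetical index readings for two periods.
change = index_percent_change(100.0, 104.5)
print(f"{change:.1f}% change between the two periods")
```

A result of this kind says that rents rose or fell by a given percentage, but it cannot be converted into a dollar figure without a base-period rent from another source such as those suggested above.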

2016 Census Availability

Question:
I’m looking for Census data but I’m completely confused by the EFT, and I have no clue what’s actually available at this time. Here is my researcher’s request:

I also need the Census Data (short + long form) for 2016, if possible for all four levels (i.e. Forward Sortation Area (FSA), Census Tract (CT), Census Subdivision (CSD) and Dissemination Area (DA)).

Can anyone point me in the right direction?

Answer:
Assuming that the researcher isn't looking for a PUMF or to create cross-tabulations, you can download this data fairly easily from Stat Can's website:

  • Go to the census profile for 2016.
  • Click on "Download Census Profile data".
  • You can download all the data for a given level of geography in CSV (Excel compatible) format, or TAB or XML (anyone here ever use these?), or IVT (Beyond 20/20).

If it's a researcher who will do a lot of work with census data, I recommend they use Beyond 20/20 to shape the data the way they need, and then export right back into an Excel compatible format. At the top of my tutorials page I have a slightly outdated set of slides with a screen recording of Beyond 20/20 (I use a CANSIM table to illustrate Beyond 20/20 functions, which is no longer possible), a longer slide deck (also using a CANSIM table as an example), and a fairly up to date exercise using a census example.
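
For a researcher who prefers scripting to Beyond 20/20, a downloaded profile CSV can also be filtered with a few lines of Python. The sketch below is only an illustration: the column names (GEO_CODE, CHARACTERISTIC, TOTAL), the geography codes and the values are all made up, and should be replaced with the headers found in the actual download.

```python
import csv
import io


def rows_for_geography(csv_file, geo_column, geo_code):
    """Return the rows of a profile CSV that belong to one geography code."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader if row[geo_column] == geo_code]


# Made-up sample standing in for a downloaded profile file.
SAMPLE = (
    "GEO_CODE,CHARACTERISTIC,TOTAL\n"
    "A001,Population,100\n"
    "A002,Population,250\n"
    "A002,Private dwellings,90\n"
)

for row in rows_for_geography(io.StringIO(SAMPLE), "GEO_CODE", "A002"):
    print(row["CHARACTERISTIC"], row["TOTAL"])
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle, for example `open(path, newline="")`, where `path` points to the unzipped CSV.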

Thursday, October 4, 2018

CCHS definition of "institutionalized resident"

Question:
I'm strangely stumped: I'm unable to determine what the CCHS means by "institutionalized population" (exclusion criterion). Is it the same as "Institutional resident" as defined by the Census (https://www12.statcan.gc.ca/census-recensement/2016/ref/dict/pop053-eng.cfm)?

Answer:
We received the following response from subject matter:

“Yes it is. Persons living in institutions (for example, inmates of penal institutions and patients in hospitals or nursing homes).”

Follow Up Question:
Could I get a more precise definition of institutional as it pertains to nursing homes? I'm under the impression there's a spectrum of assisted living facilities for older adults. The researcher on whose behalf I'm asking is particularly interested in older adults and long-term care so I imagine she'll be looking for more detail.

Follow Up Answer:
We received the following response from subject matter:

“We don’t have a definition for that. We don’t currently have a survey collecting data on nursing homes. The Canadian Institute for Health Information (CIHI) collects nursing home information, they may have a definition for them.” CIHI’s general inquiry email address is: help@cihi.ca.

Wednesday, October 3, 2018

Commercial Real Estate

Question:
I have a researcher looking for a dataset which gives the value or rent of commercial properties by CMA or Census Subdivision from 2006 to present. An example would be annual average rent of office space per square foot. From Statistics Canada I think that the Commercial Rents Services Price Index may work, but I can only find data for Canada and not for any smaller geographies (e.g. Table 18-10-0065-02). Would it be possible to get this data by CMA or Census Subdivision? Would anyone from the community be able to suggest any alternative data sources?

Answers from the DLI List:
- Try BOMA Canada (http://bomacanada.ca/). A quick check of the website didn’t reveal any actual numbers, but they do say that they collect statistics!  

- I hope the Commercial Rents Services Price Index works for your data question. If not, we might look at CREA or BOMA for this.

- You might have the researcher contact the Toronto Real Estate Board to see if they can provide some data: http://www.trebhome.com/

Answer from Subject Matter:
“The Commercial Rents Services Price Index (CRSPI) measures price changes of commercial rents over time.

It is published at the national level for all buildings combined (Office buildings, Retail buildings, and Industrial buildings and warehouses).

Prices per square foot are not published at the CMA level.

However, to meet your data needs (CMA level price per square foot for office space), please refer to CBRE (www.CBRE.ca, under research centre) or Colliers International (www.collierscanada.com, under research section).”

Unsuppressed BERD data by province

Question:
I have a researcher looking for unsuppressed data from the Annual Survey of Research and Development in Canadian Industry, and the Survey of Innovation and Business Strategy. Specifically, they are looking at business expenditure on R&D by industry for Canada and the provinces (similar to table 27-10-0341-01 / CANSIM table 358-0518), in hopes of comparing BERD within each industry across provinces and territories. Ideally they are hoping for the most recent 5 years of data, but if necessary they would sacrifice recent-ness for complete/unsuppressed data. I’m wondering if access to this data is possible, and via what channels. If I can provide any more details please let me know.

Answer from Subject Matter:
“The most recent year of data available for table 27-10-0341-01 / CANSIM table 358-0518 is for reference year 2016.  This data has been published and is currently available in the referenced table.  If the client requires 5 years of data, they can reference the archived CANSIM table 3580161 which contains similar information for years 2013 and prior.  It should be noted that the industry groupings were changed from 46 unique groups to 57 unique groups in 2014, so there may be some mapping required.  However, all the information required for that mapping is available in the disseminated tables. We are unable to provide estimates that have not been run through our confidentiality system.”
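
The mapping mentioned in the response can be thought of as a concordance between the newer and older industry groupings. The sketch below shows one way to roll newer groups up into older ones so that the two series can be compared. It assumes that every newer group falls entirely within a single older group, which may not hold for every industry; the group names and values are hypothetical and are not taken from the tables.

```python
from collections import defaultdict


def regroup(values_by_new_group, new_to_old):
    """Sum values reported for newer groups into the older grouping.

    Assumption: each newer group maps to exactly one older group.
    """
    totals = defaultdict(float)
    for group, value in values_by_new_group.items():
        totals[new_to_old[group]] += value
    return dict(totals)


# Hypothetical concordance and expenditure values.
new_to_old = {"Group A1": "Group A", "Group A2": "Group A", "Group B": "Group B"}
values_new = {"Group A1": 10.0, "Group A2": 5.0, "Group B": 7.0}
print(regroup(values_new, new_to_old))
```

Suppressed cells cannot be summed this way; any older group that depends on a suppressed newer group would have to be treated as unavailable for that year.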

Historical data on cigarette consumption

Question:
I have a researcher looking for data on cigarette consumption in Canada from 1900-1965. They would prefer individual consumption, but sales numbers would also work if that’s all that’s available.

We’ve already found some information in the Canada Year Book series, but does anyone have ideas about where else this information might be?

Answer from DLI:
I came across this source:

Forey, B., Hamling, J., Hamling, J., Thornton, A., & Lee, P. (2006, rev. 2012). Canada. International smoking statistics: A collection of worldwide historical data (Web ed.). Retrieved from http://www.pnlee.co.uk/Downloads/ISS/ISS-Canada_120111.pdf

  • see tables 1 and 2, which have stats from 1924 - 2010

  • seems extremely authoritative and well-documented

Not sure about earlier years.

Answer from Subject Matter:
CTADS seems to have relevant information, however the reference periods only date back to the year 2000. SHS also seems to have relevant information, however, the reference periods only date back to the year 1999. I inquired with the subject matter areas responsible and received the following responses:

“Unfortunately, none of the surveys we have conducted go back that far… I took a look at Health Canada’s site as I thought that perhaps CADUMS would have a few more years of data but I couldn’t find anything that would clearly support this researcher’s endeavours.  I would suggest that the researcher contact HC to see if perhaps they have data more readily available.  The contact listed on the CADUMS site is:

For more information about the survey and its results, please write to the Office of Drugs and Alcohol Research and Surveillance, Controlled Substances and Tobacco Directorate, Health Canada, 150 Tunney's Pasture Driveway, Address Locator 0301A, Ottawa, ON, K1A 0K9, or send an e-mail request to CADUMS-ESCCAD@hc-sc.gc.ca.”

And

“Regrettably I have no recommendation - the SHS is limited in its timeframe scope such that it doesn’t touch upon 1900-1965; and as far as I am aware, the only available information in this regard is sourced from the Canada Year Book series.”

We also received the following recommendation:

“Would CAT 11-512 be more detailed than the Canada Year Book?

If you are interested in FMX sources prior to 1999, a list of publications is provided below.
Note: Our library has scanned the STATCAN publications.


FMX HISTORICAL YEARS

CAT xx-xxx: Family Income and Expenditure in Canada, 1937-1938 (12 cities)

CAT 62-513: Canadian Non-Farm Expenditure, 1947-48 (National)

With supplementary Food Data, 1948-1949

CAT 62-509: City Family Expenditure, 1953 (5 cities)

CAT 62-510: City Family Expenditure, 1955 (7 cities)

CAT 62-517: City Family Expenditure, 1957 (9 cities)

CAT 62-523: Farm Living Expenditure, 1958 (Farm, National)

CAT 62-521: Urban Family Expenditure, 1959 (60 urban centres)

CAT 62-525: Urban Family Expenditure, 1962 (7 cities)

CAT 62-527: Urban Family Expenditure, 1964 (11 cities)

CAT 62-530: Urban Family Expenditure, 1967 (11 cities)

CAT 62-535: Family Expenditure in Canada, 1969, VOL. I (National)

CAT 62-536: Family Expenditure in Canada, 1969, VOL. II (National)

CAT 62-537: Family Expenditure in Canada, 1969, VOL. III (National)

CAT 62-541: Urban Family Expenditure, 1972 (10 cities)

CAT 62-544: Urban Family Expenditure, 1974 (14 cities)

CAT 62-547: Urban Family Expenditure, 1976 (8 cities)

CAT 62-549: Family Expenditure in Canada, 1978, VOL. I (National)

CAT 62-550: Family Expenditure in Canada, 1978, VOL. II (National)

CAT 62-551: Family Expenditure in Canada, 1978, VOL. II (National)

CAT 62-555: Family Expenditure in Canada, 1982, (National)

CAT 62-555: Family Expenditure in Canada, 1984 (17 cities)

CAT 62-555: Family Expenditure in Canada, 1990 (17 cities)

CAT 62-555: Family Expenditure in Canada, 1992 (National)

CAT 62-555: Family Expenditure in Canada, 1996 (National)

CAT 62-203: Spending Patterns in Canada, 1997, (National, Provincial)

HISTORICAL FAMILY FOOD EXPENDITURE

CAT 62-513: Canadian Non-Farm Expenditure, 1947-48 (National)

With supplementary Food Data, 1948-1949

CAT. 62-511: Urban Family Food Expenditure, 1953 (5 cities)

CAT. 62-512: Urban Family Food Expenditure, 1955 (5 cities)

CAT. 62-524: Urban Family Food Expenditure, 1962 (7 cities)

CAT. 62-531: Family Food Expenditure in Canada, 1969, (National VOL. I)

CAT. 62-532: Family Food Expenditure in Canada, 1969, (National VOL. II)

CAT. 62-542: Urban Family Food Expenditure, 1974 (14 cities)

CAT. 62-545: Urban Family Food Expenditure, 1976 (8 cities)

CAT. 62-548: Urban Family Food Expenditure, 1978 (16 cities)

CAT. 62-554: Family Food Expenditure in Canada, 1982 (National)

CAT. 62-554: Family Food Expenditure in Canada, 1984 (17 cities)

CAT. 62-554: Family Food Expenditure in Canada, 1986 (National)

CAT. 62-554: Family Food Expenditure in Canada, 1990 (17 cities)

CAT. 62-554: Family Food Expenditure in Canada, 1992 (National)

CAT. 62-554: Family Food Expenditure in Canada, 1996 (National)

CAT. 62-554: Family Food Expenditure in Canada, 2001 (National)

Tuesday, October 2, 2018

Historical unemployment rates by UI/EI region

Question:
I am helping a researcher access historical unemployment rates by UI/EI region from 1991 to July 1996. There’s a great page on the StatCan website showing this information from 1996 onwards: http://srv129.services.gc.ca/regions_ae/eng/rates_hist.aspx. I am wondering if this data is from the LFS, as there’s no source information anywhere on the page.

I am wondering if anyone has tips for accessing unemployment rates based upon the employment insurance (EI) economic regions for the period 1991 to July 1996. Prior to 1996 the Economic Region was referred to as a subprovincial region https://www150.statcan.gc.ca/n1/pub/92-195-x/2011001/geo/er-re/er-re-eng.htm (although I don’t know if they are equivalent). I couldn’t find any documentation online for the 1991 Census and will look in the paper copy tomorrow.

Some questions:

1. Are there tables prior to 1996? If so, would they be available at the subprovincial region level?

2. I would be interested in getting a quote for a custom tab for 1991 to July 1996 based on economic region, if possible.

3. What is the source of the above table http://srv129.services.gc.ca/regions_ae/eng/rates_hist.aspx? (I believe the user received the link from the StatCan help desk; it’s an odd URL with no breadcrumbs.)

Answer:
We received the following response from subject matter:

“The unemployment rates in Canada by EI Economic Region are based on the LFS data and produced for the EI program (ESDC).

We do not produce custom data by EI Economic Region. The only available data pertaining to the EI Economic Regions for which the Labour Statistics Division is responsible can be found online in table 14-10-0354-01.

I recommend contacting Employment and Social Development Canada as they are responsible for the EI program as well as the referenced webpages containing the historical rates.”

Monday, October 1, 2018

CTs for NHS

Question:
Does anyone know where I might hunt down the NHS for Ontario that has all variables at the CT level? I found the Provincial/CD/CSD/DA file, but that’s it.


Any guidance? My researcher wants CT!

Answer:
The complete NHS Profile at CT level is available from Stat Can Census Program website at https://www12.statcan.gc.ca/nhs-enm/2011/dp-pd/prof/details/download-telecharger/comprehensive/comp-csv-tab-nhs-enm.cfm?Lang=E. You’ll find the Ontario data file when you unzip the downloaded CSV or TAB file. The IVT (Beyond 20/20) files are available at https://www12.statcan.gc.ca/nhs-enm/2011/dp-pd/prof/details/download-telecharger/comprehensive/comp-ivt-xml-nhs-enm.cfm?Lang=E.