Tuesday, June 30, 2015

GSS 27 Release to DLI

Question

A researcher got confirmation from Stat Can that the GSS cycle 27 was released yesterday. Is there an ETA for the file to be available for download from the DLI EFT? I assume that will be SAS format?

Answer

We received the raw data files from subject matter yesterday. I will verify and stage the files in SAS, SPSS and Stata today, and will try to have them loaded and released to the DLI community by the end of the day.

Forestry Employment

Question

I have a grad student looking for annual employment and wage figures for the forestry industry broken down by gender.

The LFS tables have the gender breakdown she wants, but they lump together forestry and related activities, or include forestry with other resource-based industries. I found several tables based on SEPH which provide more detail about specific forestry activities (logging, reforestation, etc.) but aren’t broken down by gender.

Are there any other sources she might look at?

Answer

There's one recent project that involved pulling employment figures at the 4 digit NAICS level from the LFS. The results are here: http://hdl.handle.net/10864/10949

Have a look at the LFS PUMF. It has 43 NAICS groups (e.g., Forestry and Logging) and gender variables.

I also looked at the Canadian Income Survey (CIS) and the Survey of Labour and Income Dynamics (SLID), in addition to the already mentioned SEPH, and could not locate public standard tables. A custom tabulation could potentially provide the information required. If the researcher is interested in a cost estimate, please advise.

Monday, June 29, 2015

New updates to the DLI training Repository: DLI webinar on CHMS

New content has been added: Canadian Health Measures Survey (CHMS) [2015]

The Canadian Health Measures Survey (CHMS) aims to collect important health information through a household interview and direct physical measures at a mobile examination centre (MEC), sometimes referred to as a mobile clinic. During the webinar, analysts presented a general overview of the CHMS, followed by information related to the CHMS biobank.

The DLI Training Repository is a valuable resource, containing training sessions and workshop presentations from the DLI, and from other national and international conferences over the years.

Thursday, June 25, 2015

Data Analytics

Question

Can someone help me determine what the NAICS code for data analytics is? The closest I've come up with is 518210, but I don't think it is quite right.

Answer

I’ve looked up the NAICS codes for a few data analytics companies on Mergent quickly.

Data Analytics is one of those industries, I think, where no single NAICS code fits. Most of the big data analytics companies are classified under 518210, but I think this code refers more to companies that provide data hosting related services, such as data storage or data processing, not necessarily the analysis itself.

Most major data analytics companies are also classified under NAICS 511210 – Software Publishers (if it’s proprietary software that they sell or use), often with a secondary NAICS of 541511 – Custom Computer Programming Services or 541512 – Computer Systems Design Services.

Data Analytics is also often used interchangeably with Business Analytics. For Business Analytics, a qualifier is sometimes added to the Software Publishers NAICS, giving 51121c. This is how IBIS World classifies their Business Analytics & Enterprise Software Publishing industry, for example: <https://www.ibisworld.com/industry/default.aspx?indid=1989>

But I think the big division to be aware of regarding Data Analytics is whether the company creates or sells software related to data analysis, in which case it would fall more towards the Software Publishers NAICS of 511210, or uses existing software to perform data analysis, in which case it would fall more towards the Custom Computer Programming Services NAICS of 541511.
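Since NAICS codes are hierarchical (2 digits for the sector, 3 for the subsector, 4 for the industry group, 5 for the industry, 6 for the national industry), it can help to see how closely the codes mentioned above are related. The following is a minimal sketch; the function names are made up for illustration, and it only handles plain numeric codes, so a vendor-specific qualifier such as 51121c is rejected.

```python
from typing import List, Optional, Tuple

# NAICS level names keyed by the number of digits in the code.
NAICS_LEVELS = {
    2: "sector",
    3: "subsector",
    4: "industry group",
    5: "industry",
    6: "national industry",
}


def naics_hierarchy(code: str) -> List[Tuple[str, str]]:
    """Return (level name, code prefix) pairs for a 2- to 6-digit NAICS code."""
    if not code.isdigit() or not 2 <= len(code) <= 6:
        raise ValueError(f"not a 2- to 6-digit NAICS code: {code!r}")
    return [(NAICS_LEVELS[n], code[:n]) for n in range(2, len(code) + 1)]


def shared_level(a: str, b: str) -> Optional[str]:
    """Name the most detailed NAICS level at which two codes share a prefix."""
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    # Fewer than two shared digits means the codes are in different sectors.
    return NAICS_LEVELS.get(min(common, 6)) if common >= 2 else None


if __name__ == "__main__":
    for code in ("518210", "511210", "541511", "541512"):
        print(code, naics_hierarchy(code))
    print(shared_level("541511", "541512"))  # same 5-digit industry
    print(shared_level("518210", "511210"))  # same 2-digit sector only
```

Read this way, 541511 and 541512 differ only at the most detailed level, while 518210 and 511210 share nothing below the sector, which matches the hosting versus software publishing distinction described above.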

2006 Census Family Average Income



Question

A researcher needs 2006 Census family "average" income at CT level for Toronto. I have only been able to find 2006 Census family "median" income through the Census tract (CT) profiles webpage at http://www12.statcan.ca/census-recensement/2006/dp-pd/prof/92-597/index.cfm?Lang=E. This income category is not included in any 2006 Census Area Profiles tables, as other income categories are. Is there any reason for that? Please let me know where I can find 2006 Census family average income data at CT level.

Answer

I found this in separate profiles for census tracts: http://search1.odesi.ca/#/details?uri=%2Fodesi%2Fcen-95F0495-E-2006-profb20.xml . I'm not sure why the data were released separately under “Addendum: Profile of ethnic origin and census family income”; maybe others will have more info.

Monday, June 22, 2015

PCCF and PCCF+

Question

I have two questions about the PCCF and the PCCF+.

The most recent version I could find of the PCCF is using postal codes from November 2014 and the most recent version of the PCCF+ is using postal codes from June 2013. I’m not sure if either product eventually shows up in the list of tentative release dates for DLI products.

Also, the most recent version of the PCCF+ on the DLI FTP is version “5k” but the RDCs are already at version “6a”.

Question 1: when can we expect an update for either product?
Question 2: could we get the most recent version of the PCCF+ loaded in the DLI FTP?

Answer

The most recently released products are:

PCCF: /MAD_DLI_PCCF/Root/2015/data/pccfNat_nov2014_fccpNat

Released in April 2015. See attached. I will inquire when the next release will be available.

For PCCF+: /MAD_DLI_PCCF/Root/Health-PCCF-plus-Sante-FCCP-plus/pccf6A1-fccp6A1-can.zip

I apologize, there was a 6a version. It has been reinstated.

Here's an older note about the 6A vs. 6A1.

PCCF+ Version 6A1

Please note that an error has been identified in the 6-digit postal code weighting file (WC6DUPS) in PCCF+ Version 6A. This resulted in approximately 46,000 urban postal codes being inadvertently included in the WC6DUPS file (which is primarily used to assign population weights for rural and retired postal codes linked to two or more DAs). As a result, some of these urban postal codes may be assigned to an incorrect dissemination area. A new version of PCCF+ (version 6A1) has been created to rectify this error. The new version contains an updated WC6DUPS file where postal codes with a delivery mode type equal to A or B that were linked to 1, 2 or 3 DAs (according to the PCCFDUPS file) have been removed. The associated pointer file (WC6POINT) has also been updated. All other files and content remain unchanged.
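The removal rule in the note can be sketched roughly as follows. This is only an illustration of the stated logic, not the program used to build version 6A1: the record layout, the field names, the postal codes and the sample delivery mode type values are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class WeightRecord(NamedTuple):
    """Hypothetical stand-in for one row of the weighting file."""
    postal_code: str
    delivery_mode_type: str


def drop_urban_records(
    records: List[WeightRecord],
    da_link_counts: Dict[str, int],
) -> List[WeightRecord]:
    """Drop records with delivery mode type A or B that link to 1, 2 or 3 DAs.

    da_link_counts maps a postal code to the number of DAs it is linked to
    (in the note, this comes from the PCCFDUPS file).
    """
    urban_types = {"A", "B"}
    kept = []
    for rec in records:
        n_das = da_link_counts.get(rec.postal_code, 0)
        if rec.delivery_mode_type in urban_types and n_das in (1, 2, 3):
            continue  # this is the kind of record removed in 6A1
        kept.append(rec)
    return kept


if __name__ == "__main__":
    # Made-up sample rows for illustration only.
    sample = [
        WeightRecord("A1A1A1", "A"),
        WeightRecord("B0B0B0", "W"),
        WeightRecord("C2C2C2", "B"),
    ]
    links = {"A1A1A1": 2, "B0B0B0": 2, "C2C2C2": 5}
    for rec in drop_urban_records(sample, links):
        print(rec)
```

In this made-up sample only the first row is dropped: it is type A with two linked DAs. The type B row stays because it links to five DAs, which is outside the 1 to 3 range named in the note.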

Comparison of PCCF+ versions 6A and 6A1:

File Name    6A        6A1
WC6DUPS      202592    104728
WC6POINT     69568     23865
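As a quick check on the comparison above, the drop in each file can be computed directly. This sketch assumes the figures are record counts, which the note does not state explicitly, and uses the pointer file name as it appears in the note's text (WC6POINT).

```python
# Figures copied from the comparison table: (version 6A, version 6A1).
# Assumption: these are record counts.
counts = {
    "WC6DUPS": (202592, 104728),
    "WC6POINT": (69568, 23865),
}


def records_removed(before: int, after: int) -> int:
    """Number of records dropped between the two versions."""
    return before - after


if __name__ == "__main__":
    for name, (v6a, v6a1) in counts.items():
        removed = records_removed(v6a, v6a1)
        share = removed / v6a
        print(f"{name}: {removed} fewer records ({share:.1%} of the 6A file)")
```

The pointer file shrinks by 45,703 entries, which is in line with the "approximately 46,000" urban postal codes mentioned in the note, although the note itself does not draw that connection.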

EFT: /MAD_DLI_PCCF/Root/Health-PCCF-plus-Sante-FCCP-plus

Census Data for Ontario & Nova Scotia

Question

I have 2 separate requests:

1. Census data for Ontario (DA) with all possible variables, for 2011 (if available)

2. Census data for Nova Scotia (towns and counties) with all possible variables for 1981 onwards

For the second request, he’s perfectly willing to “take whatever you can find”.

I’ve managed to hunt down the folders in the FTP that I think I’ll need, but in terms of file names (and deciphering them) I’m still quite the amateur!

Answer

1. Not available for the Census, but it is available from the National Household Survey (NHS). DLI EFT file path: /MAD_DLI/Root/NHS_ENM/NHS-Profile_Profil-ENM/da_ad

NHS Survey User guide: http://www12.statcan.gc.ca/nhs-enm/2011/ref/nhs-enm_guide/index-eng.cfm

Comparability of the NHS estimates and the 2006 Census http://www12.statcan.gc.ca/nhs-enm/2011/ref/nhs-enm_guide/guide_4-eng.cfm#A_5_4


2. To help decipher file names, review the readme files

Eg: /MAD_DLI/Root/census_pop_recens/1981/bst—1981/1981readme.xls

Provides the file name and description:

For example: c1981-bst-csd-a10.cldat.gz Population by mother tongue (9) and sex (x)
/MAD_DLI/Root/census_pop_recens/1981/bst—1981/census-subdivision/data/c1981bst-csd-a10.cldat.gz
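For anyone working through many of these files, the parts of a file name can be pulled apart programmatically and then matched against the readme. The sketch below assumes the pattern is census year, series, geography and table code, as the example above suggests. That pattern is a guess from this single example and should be checked against the readme; the group names are made up.

```python
import re
from typing import Dict, Optional, Union

# Assumed pattern: c<year>[-]<series>-<geography>-<table>.cldat.gz
# The hyphen after the year is optional because both spellings appear above.
FILE_NAME = re.compile(
    r"^c(?P<year>\d{4})-?(?P<series>[a-z]+)-(?P<geo>[a-z]+)-(?P<table>[a-z]\d+)"
    r"\.cldat\.gz$"
)


def parse_census_file_name(name: str) -> Optional[Dict[str, Union[int, str]]]:
    """Split a file name into its assumed parts, or return None if no match."""
    match = FILE_NAME.match(name)
    if match is None:
        return None
    parts: Dict[str, Union[int, str]] = dict(match.groupdict())
    parts["year"] = int(match.group("year"))
    return parts


if __name__ == "__main__":
    print(parse_census_file_name("c1981bst-csd-a10.cldat.gz"))
    print(parse_census_file_name("c1981-bst-csd-a10.cldat.gz"))
```

Here "csd" lines up with the census-subdivision folder in the path, and "a10" is the table whose description the readme provides. The .gz files themselves can be read with Python's standard gzip module once downloaded.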

Friday, June 19, 2015

CCHS PUMF

Question

A researcher here heard that the CCHS PUMF is set for a June 28 release. However, the DLI page indicates a Fall 2015 release. Can you please confirm when we can expect it through DLI?

Answer

The CCHS product page lists a June release date. That is the official release of the data products, but it does not include the PUMF.

Detailed information for 2014

Data release - November 24, 2014 (first in a series of Rapid Response releases); June 17, 2015 (2014 data); scheduled for June 24, 2015 (2013-2014 data).

<http://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&SDDS=3226>

Regarding the release of the annual 2014 and two-year 2013-2014 CCHS, subject matter confirmed that the PUMF files are usually released in the fall, following the release of the Master and Share files in June.

Updated Products: General Social Survey (PUMF) 2010

Please note that the GSS PUMF Cycle 24 was updated.

The updated products are listed below, along with the path to access them via the EFT site.

Updates:
(1) Main and Episode Data Files
(2) Main and Episode Data Dictionaries with Frequencies
(3) User Guide

A new series of weights is available for Cycle 24. The weights have been adjusted
to add calibration by reference day. Since analyses of activities for a given day
could be significantly affected, it is suggested to redo the analyses using the
new weighting.

There were no updates to:
No Frequency Data Dictionaries
SPSS / SAS Cards
Variable_List

Note: 'ACTCODE = 2' is reserved for episodes where the respondent refused or
did not remember what activity they were doing in EPI_Q100_X.
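To illustrate why the new weights can change results, here is a minimal sketch of a weighted average episode duration computed under two sets of weights. Only ACTCODE comes from the note above. The other field names and all of the numbers are hypothetical, and excluding 'ACTCODE = 2' episodes is just one possible analytic choice, not a recommendation from the user guide.

```python
from typing import Dict, List, Union

Episode = Dict[str, Union[int, float]]

REFUSED_OR_UNKNOWN = 2  # ACTCODE value reserved for refused / not remembered


def weighted_mean_duration(episodes: List[Episode], weight_field: str) -> float:
    """Weighted mean of 'duration', skipping episodes with ACTCODE = 2."""
    total = 0.0
    weight_sum = 0.0
    for ep in episodes:
        if ep["ACTCODE"] == REFUSED_OR_UNKNOWN:
            continue
        total += ep["duration"] * ep[weight_field]
        weight_sum += ep[weight_field]
    if weight_sum == 0:
        raise ValueError("no usable episodes")
    return total / weight_sum


if __name__ == "__main__":
    # Made-up episodes with hypothetical old and new weights.
    episodes = [
        {"ACTCODE": 1, "duration": 60, "old_weight": 2.0, "new_weight": 1.0},
        {"ACTCODE": 2, "duration": 30, "old_weight": 5.0, "new_weight": 5.0},
        {"ACTCODE": 1, "duration": 120, "old_weight": 1.0, "new_weight": 3.0},
    ]
    print(weighted_mean_duration(episodes, "old_weight"))
    print(weighted_mean_duration(episodes, "new_weight"))
```

With these made-up values the estimate moves from 80 to 105 minutes when the weights change, which is the kind of shift that makes rerunning earlier analyses worthwhile.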

EFT

/MAD_DLI/Root/other-products/General Social Survey - gss/cycle24-2010

Wednesday, June 17, 2015

CCHS Annual Components and Household Weights

Question

A researcher needs access to annual components of CCHS for years 2005 to present.

The following is a link to the reference periods the researcher is hoping to receive: <http://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getInstanceList&SurvId=3226&SurvVer=1&InstaId=15282&SDDS=3226&lang=en&db=imdb&adm=8&dis=2>

The specific weight variable the researcher is interested in is called “Household Weight (WTS_HH)”, which is part of a separate file (HS_HHWT.txt) and this file is not available through DLI. Any chance these files can be made available through DLI?

Answer

The CCHS Annual Component PUMF is disseminated every two years.

In 2010 and 2012 we had both:

· a one-year reference period, and

· a combined two-year reference period

Before that, we have combined reference periods only.

For annual reference periods, these are available through the RDCs:

The master files contain all variables and all records from the survey collected during a collection period. These files are accessible at Statistics Canada for internal use and in Statistics Canada’s Research Data Centres (RDC), and are also subject to custom tabulation requests.

See our Nesstar holdings for a comparison of PUMF and Masterfiles cycles available through the DLI and RDCs: <http://www62.statcan.ca/webview/>

You can view the weight variable available in the masterfile vs the PUMF.

I checked and didn’t see WTS_HH in the PUMF or masterfile. What reference period is the researcher looking at?

I also followed up with subject matter to make sure that I didn’t miss anything:

There aren’t HH weights for PUMFs, but there are for master files. The answer to the question further down in Vivek’s email would be no, these cannot be made available through the DLI. I hope this helps clarify.

Again, access would be granted through the RDCs.

Friday, June 12, 2015

Citizenship Statistics 1991-2001

Question
I have a researcher looking for naturalization/citizenship statistics for the years 1985-2015. We can find the information for all of the years except 1991-2001. I managed to track down the 1991 statistics from the 1991 Citizenship and Immigration Canada Annual Report but am having trouble with the other years. Immigration statistics dealing with temporary and permanent residents can be readily found but the citizenship statistics are proving more difficult.

Answer

I have found recent citizenship information from CIC only. Try contacting them directly. To submit a request for data, please email Statistics Canada with a detailed description of your requested report.

Other sources of potential interest:

- National Household Survey: Immigration and Ethnocultural Diversity – Obtaining Canadian citizenship: <http://www5.statcan.gc.ca/olc-cel/olc.action?lang=en&ObjId=99-010-X201100311791&ObjType=47>

- Other historical data from the Census of Canada

Wednesday, June 10, 2015

Households with Televisions in Canada since 1950

Question

I have a researcher trying to find the number of households in Canada with television since 1950. Does this information exist? If so, where should I look?

Answer

There are tables on the broadcasting industry and broadcasting: <http://www5.statcan.gc.ca/subject-sujet/subtheme-soustheme.action?pid=2256&id=2259&lang=fra&more=0>

Also try the office of Canadian Broadcasting: <www.tvb.ca/pages/home>

Here's another link. Unfortunately, it covers only colour TVs and only some years:

- <http://www5.statcan.gc.ca/cansim/a05?lang=eng&id=2030020> (households with colour televisions, 1997-2009)

Tuesday, June 9, 2015

Updated Products - Labour Force Survey (LFS) May 2015

Labour Force Survey (LFS) – May 2015

LFS data for May 2015 are now available on the EFT site.

The Labour Force Survey estimates are based on a sample, and are therefore subject to sampling variability. Estimates for smaller geographic areas, industries, occupations or cross tabulations will have more variability. For an explanation of sampling variability of estimates, and how to use standard errors to assess this variability, consult the Data Quality section in the Guide to the Labour Force Survey.

The LFS guide: http://www5.statcan.gc.ca/olc-cel/olc.action?ObjId=71-543-G&ObjType=2&lang=en&limit=0
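As a rough illustration of how a standard error is used to assess variability, the sketch below builds an approximate 95% confidence interval and a coefficient of variation from an estimate and its standard error. It assumes a normal approximation with a critical value of 1.96, and the figures are invented. The Guide remains the reference for how LFS standard errors should actually be obtained and interpreted.

```python
from typing import Tuple

Z_95 = 1.96  # normal critical value for an approximate 95% interval


def confidence_interval(
    estimate: float, standard_error: float, z: float = Z_95
) -> Tuple[float, float]:
    """Return (lower, upper) bounds: estimate plus or minus z standard errors."""
    margin = z * standard_error
    return estimate - margin, estimate + margin


def coefficient_of_variation(estimate: float, standard_error: float) -> float:
    """Standard error relative to the size of the estimate."""
    if estimate == 0:
        raise ValueError("coefficient of variation is undefined for a zero estimate")
    return standard_error / abs(estimate)


if __name__ == "__main__":
    # Invented figures: an employment estimate and its standard error.
    estimate, se = 50_000.0, 2_500.0
    low, high = confidence_interval(estimate, se)
    print(f"approx. 95% interval: {low:.0f} to {high:.0f}")
    print(f"coefficient of variation: {coefficient_of_variation(estimate, se):.1%}")
```

The same standard error applied to a smaller estimate gives a larger coefficient of variation, which is why estimates for small areas or detailed cross tabulations are less reliable.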

EFT: /MAD_DLI/Root/other-products/Labour Force Survey - lfs/1976-2015/data/micro2015-05.zip

Corporations Return Act

Question

I have a researcher who is looking for data concerning family relationships among executives/BoD members.

The Corporations Return Act includes a question (5.2) on ‘related groups’ among corporate directors and officers.


Would it be possible to order a custom tabulation from this product?

Answer


Subject matter division has confirmed that:

First off, we would require additional information from the researcher to fully understand the data that is being sought.

However, even without knowing all of the details, I can confirm that this would be an extensive cost recovery initiative on our part. The information being requested is not available in any of the standard products we publish.

Please advise if the researcher is interested in pursuing this.

Thursday, June 4, 2015

Most Recent PCCF+

Question
Which files are the most recent PCCF+ to download? I am assuming that pccfa1-fccp6a1 is the most recent, but there seem to be three choices:

* pccfa1-fccp6a1 — A directory
* pccfa1-fccp6a1.zip — Is this the same directory, but zipped?
* pccfa1-fccp6a1-ver2.zip — Is this actually the most recent?

Answer

The most recent version is pccfa1-fccp6a1-ver2.zip. I have removed the previous versions, as is our practice, to reduce confusion.

Unemployment, Incarceration, Income and Poverty Rates by Race

Question

I have a faculty member looking for a breakdown of unemployment rates, incarceration rates, median household income, and poverty rates by race, including Aboriginals, in Canada (and, if I’m lucky, rates within those groups). I’ve managed to find a few tables here and there from Public Health, Corrections Canada, and Stats Can that this faculty member felt were fine, but nothing that really hit the nail on the head. Is there somewhere else I should be looking?

Answer

Have a look at these tables:

Unemployment rates:

Unemployment rate, Canada, provinces, health regions (2014 boundaries) and peer groups, annual (Percent), 2006 to 2014 (109-5334)

Labour force survey estimates (LFS), supplementary unemployment rates by sex and age group, unadjusted for seasonality, monthly (Rate), Jan 1976 to Apr 2015 (282-0085)

Labour force survey estimates (LFS), supplementary unemployment rates by sex and age group, annual (Rate), 1976 to 2014 (282-0086)

Incarceration rates:

Adult correctional services, average counts of offenders in provincial and territorial programs, annual, 1978/1979 to 2013/2014 (251-0005)

Adult correctional services, average counts of offenders in federal programs, annual, 1978/1979 to 2013/2014 (251-0006)

Youth correctional services, average counts of young persons in provincial and territorial correctional services, annual (Persons), 1997/1998 to 2013/2014 (251-0008)

Median household income:

NHS tables - Income and Housing

CANSIM - Household, family and personal income

Poverty rates by race (including Aboriginals):

Low income and inequality

Wednesday, June 3, 2015

Release of intercensal 2015 Geography products

The Interim List of Changes to Municipal Boundaries, Status, and Names (catalogue number 92F0009X<http://www.statcan.gc.ca/cgi-bin/IPS/display?cat_num=92F0009X>) is now available online (HTML only). The reference period of the document is January 2, 2014 to January 1, 2015.

The intercensal 2015 Census Subdivision Boundary File (catalogue number 92-162-X<http://www.statcan.gc.ca/cgi-bin/IPS/display?cat_num=92-162-X>) is now available online and on the EFT. The Census Subdivision Boundary File, Reference Guide (catalogue number 92-162-G<http://www.statcan.gc.ca/cgi-bin/IPS/display?cat_num=92-162-G>) is also available (HTML only).
DLI EFT: /MAD_DLI/Root/geo/2015/CSD_SDR

The intercensal 2015 Census Subdivision Boundary File does not replace the 2011 Census Subdivision Boundary File (92-160-X), which is a similar product available as part of the 2011 Census suite of geography products, and used in conjunction with products and services from the 2011 Census.

The 2015 Road Network File (catalogue number 92-500-X<http://www.statcan.gc.ca/cgi-bin/IPS/display?cat_num=92-500-X2015001>) is now available online and on the EFT. The Road Network File, Reference Guide (catalogue number 92-500-G<http://www.statcan.gc.ca/cgi-bin/IPS/display?cat_num=92-500-G>) is also available (HTML only).
DLI EFT: /MAD_DLI/Root/geo/2015/RNF_FRR

The intercensal 2015 Road Network File does not replace the 2011 Road Network File, which is a similar product available as part of the 2011 Census suite of geography products, and used in conjunction with products and services from the 2011 Census.

New ICS Study

Changes in wealth across the income distribution, 1999 to 2012

by Sharanjit Uppal and Sébastien LaRochelle-Côté

Insights on Canadian Society

This article examines changes in the wealth of Canadian families (i.e. total family assets minus total family debt) over the period from 1999 to 2012, with a particular focus on changes across income quintiles. The paper also examines changes in the concentration of wealth across income quintiles, as well as the characteristics of families with low income and no wealth.

To access the study released today:

Monday, June 1, 2015

Enhanced and/or Reformatted Discharge Abstract Database Files

We have completed work on the Discharge Abstract Database files that were released on May 20. The basic program used to enhance the clinical file required a small change to handle changes introduced with SPSS 23.

As with other cycles, the clinical file has been enhanced with flag variables to let users identify records of interest without looking at each of the 25 ICD-10 or 20 CCI codes separately, and the ICD-10 and CCI codes have been reformatted to match the documentation (e.g., punctuation inserted). Additionally, variables in both the clinical and geographic files that were stored as strings have been converted to numeric codes so that Stata users can have value labels attached, and to make them more useable (e.g., allowing age to be an ordinal variable rather than a nominal string).
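For readers curious about what these enhancements amount to, here is a small sketch of the three ideas: inserting punctuation into an ICD-10 code, flagging a record when any diagnosis field matches, and mapping string categories to ordered numeric codes. It is not the SPSS program used for the release. The field names (dx1 to dx25), the age group labels and the sample record are hypothetical, and the punctuation rule assumes the usual ICD-10 convention of a decimal point after the third character.

```python
from typing import Dict, List, Optional


def format_icd10(raw: str) -> str:
    """Insert a decimal point after the third character, e.g. "I219" -> "I21.9"."""
    code = raw.strip().upper()
    if len(code) <= 3 or "." in code:
        return code
    return f"{code[:3]}.{code[3:]}"


def has_diagnosis(record: Dict[str, Optional[str]], prefix: str, n_fields: int = 25) -> int:
    """Return 1 if any of dx1..dx<n_fields> starts with the given code prefix, else 0."""
    wanted = prefix.upper().replace(".", "")
    for i in range(1, n_fields + 1):
        value = (record.get(f"dx{i}") or "").upper().replace(".", "")
        if value.startswith(wanted):
            return 1
    return 0


def encode_ordered(label: str, ordered_labels: List[str]) -> int:
    """Map a string category to a 1-based numeric code that keeps its order."""
    return ordered_labels.index(label) + 1


if __name__ == "__main__":
    record = {"dx1": "J45", "dx7": "I219"}  # made-up record
    age_groups = ["0-19", "20-39", "40-59", "60+"]  # made-up categories
    print(format_icd10(record["dx7"]))
    print(has_diagnosis(record, "I21"))
    print(encode_ordered("40-59", age_groups))
```

A single flag per condition of interest means a user can select records with one test instead of scanning all 25 diagnosis fields, and the numeric codes can carry value labels while still sorting in a meaningful order.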

I’ve noted a concern / question though – the number of records in the clinical file is WAY down this year. Was this once again a 2-year sample (as the read-me file indicates), or a 1-year sample this time?

Clinical file


2009-11     393,625 cases

2011-13     404,650 cases

2013-14     204,529 cases ???

It also appears to not be a 2-year sample because of the change in naming convention for this release.
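A quick ratio check on the counts above points the same way. This is only arithmetic on the three figures listed; it cannot confirm the sample design by itself.

```python
# Case counts copied from the list above.
clinical_cases = {
    "2009-11": 393_625,
    "2011-13": 404_650,
    "2013-14": 204_529,
}


def ratio_to(previous: str, current: str = "2013-14") -> float:
    """Size of the current release relative to an earlier one."""
    return clinical_cases[current] / clinical_cases[previous]


if __name__ == "__main__":
    for earlier in ("2009-11", "2011-13"):
        print(f"2013-14 is {ratio_to(earlier):.1%} the size of {earlier}")
```

The latest file is roughly half the size of each of the two earlier releases, which is what one would expect if it covers one year rather than two.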