Thursday, August 27, 2015

Updated Products: HES 2013 V2

Please note the updated products listed below and the path to access them via the EFT site.

Households and the Environment Survey 2013

The file provides data for Canada, the provinces and census metropolitan areas and includes information on a wide range of topics, including water quality concerns; consumption and conservation of water; energy use and home heating and cooling; pesticide and fertilizer use on lawns and gardens; recycling, composting and waste disposal practices. It also provides information on the socio-demographic, income and labour force characteristics of the population.
An update to the 2013 Households and the Environment Survey Public Use Microdata File (PUMF) (catalogue number 16M0001XCD) was released on June 9th, 2013. The only change was the removal of the variable HW_Q10CC due to an inconsistency between the English and French versions of the corresponding response category in the questionnaire. As well, the variable HW_Q10CE was replaced with HWD10CE (“other”), which is based on all responses of “yes” to the former HW_Q10CC and HW_Q10CE variables.
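As a rough illustration of how a combined "other" variable of this kind can be derived, the sketch below flags "yes" whenever either of the two former variables was "yes". The response codes (1 = yes, 2 = no) and the function name are assumptions made for the example, not taken from the PUMF documentation; check the codebook for the actual codes and for how missing values are handled.

```python
# Assumed response codes for illustration only; confirm against the PUMF codebook.
YES, NO = 1, 2


def derive_hwd10ce(hw_q10cc, hw_q10ce):
    """Return YES if either former variable was answered 'yes', otherwise NO.

    Hypothetical helper: it mirrors the description of HWD10CE as being based
    on all 'yes' responses to the former HW_Q10CC and HW_Q10CE variables.
    """
    return YES if YES in (hw_q10cc, hw_q10ce) else NO


# Example: a 'yes' on either input yields a 'yes' on the derived variable.
for cc, ce in [(YES, NO), (NO, YES), (NO, NO)]:
    print(cc, ce, derive_hwd10ce(cc, ce))
```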

EFT: /MAD_DLI/Root/other-products/ Households and the Environment Survey-hes/2013

Thank you.

Wednesday, August 26, 2015

CCHS 2009-2010 and 2010 Respondents

Question

A grad student here is working with the CCHS 2009-2010 and the CCHS 2010 files. The documentation appears to indicate that the respondents from CCHS 2010 make up about half of the respondents in the CCHS 2009-2010, i.e. that the same 2010 respondents are in both files and that these are not two totally distinct groups. However, the student has asked me to confirm this with Statistics Canada, as it is “critical” to his dissertation.

Answer


I consulted the subject matter division responsible and they provided the following response.

“The same respondent cases that appear in the annual CCHS (2010) will also appear in the biannual CCHS file (2009-2010)”

Wednesday, August 19, 2015

GSS 2014 - Victimization

Question

A user is wondering when this will be available... "Am wondering if you knew when information from the 2014 GSS (Victimization) would be released. The Website shows a data release of 2015 - was wondering if it had been released yet?"

Answer

I checked with the subject matter division on your question, and they do not have a firm release date beyond fall 2015.

Monday, August 17, 2015

After-tax Income, Quarterly

Question

I have a student who is looking for after-tax income, per capita, quarterly. We found CANSIM table 2002-0603, which gives him basically what he’s looking for, but it is annual. Is there something like this available quarterly?

Answer

I checked with the subject matter division responsible for this table, and they confirmed that they only have annual data; monthly and quarterly data are not available.

Updated Products: International Travel Survey (ITS), 2013

International Travel Survey (ITS), 2013

The ITS 2013 PUMF is now available on the EFT.

The International Travel Survey (ITS) provides statistics on travellers to and from Canada. The Frontier Counts component provides a full range of statistics on the number of international travellers by selected category and by type of transportation, as well as the number of automobiles, trucks and other vehicles (motorcycles, snowmobiles, bicycles) entering Canada.

For more information: Record number 5005

EFT: /MAD_DLI/Root/other-products/International Travel Survey - its/2013

Friday, August 14, 2015

Updated Products: Travel Survey of Residents of Canada (TSRC), 2014

Travel Survey of Residents of Canada (TSRC), 2014
The TSRC 2014 PUMF is now available on the EFT.

The Travel Survey of Residents of Canada (TSRC) is a major source of data used to measure the size and status of Canada's tourism industry. It was developed to measure the volume, the characteristics and the economic impact of domestic travel. Since the beginning of 2005, this survey has replaced the Canadian Travel Survey (CTS).

For more information: Record number 3810

EFT: /MAD_DLI/Root/other-products/Travel Survey of Residents of Canada - tsrc/2014

Wednesday, August 12, 2015

Postal Code Conversion File Plus (PCCF+) version 6B (V2)

An error in the pointer file for duplicate postal code records was identified which resulted in some postal codes being processed improperly. To correct this error, a new version of PCCF+ (version 6B V2) has been released. 

The following files have been updated:

- pointer file for duplicate postal codes (pccf.1411.pccf.pointdup.txt), along with the associated duplicate postal codes file (pccf.1411.pccf.dups.txt)
- SAS code that reads the CSD type variable found in the geographic attributes file (input_georef.sas)
- a small revision to the residential flag (cpcref.egmres.v1411.txt)
- air stage delivery (cpcref.airstage.v1411.txt)

EFT: /MAD_DLI_PCCF/Root/Health-PCCF-plus-Sante-FCCP-plus

Monday, August 10, 2015

New files on Statistics Canada Nesstar

We are pleased to inform you that the following are now available on the Statistics Canada Nesstar Webview site (http://www62.statcan.ca/webview/).

Public Master Files (metadata only):
- Canadian Tobacco, Alcohol and Drugs Survey (CTADS), Cycle 2013

PUMF:

- Canadian Tobacco, Alcohol and Drugs Survey (CTADS), Cycle 2013 – Household File
- Canadian Tobacco, Alcohol and Drugs Survey (CTADS), Cycle 2013 – Person File
- The General Social Survey (GSS), Cycle 26, 2012 - Caregiving and Care Receiving File

And more to come!...

Anyone can view the metadata for these files on the Statistics Canada Nesstar site. Only DLI Contacts however will be able to access the data files on the site (PUMFs). If you are a DLI Contact and would like to access and manipulate data files on the Statistics Canada Nesstar site, please send us an email at dli-idd@statcan.gc.ca to request a password. Please note that the PUMF data files available through Nesstar are subject to the PUMF Licence agreement.

To access the microdata housed in the Research Data Centres (RDCs), researchers must submit a project proposal to the Social Sciences and Humanities Research Council (SSHRC) and Statistics Canada.

Onion Production

Question
I have a request from a researcher looking for CCS-level stats for Quebec:

I am interested in data for dry onion production in Quebec. I am interested in finding out the following:

1. farm size, type, tenure, ownership, soil type, revenues, irrigation practices/technologies
2. farm operator age, education, years of experience

3. marketed production, yields, farm gate value, production area, production regions, exports and imports, % of provincial and national GDP

She’s extracted what she can from the Census of Agriculture, but she’d like to find stats that are more recent than 2011. If necessary, she’ll consider a custom tab.

Answer

I have confirmed with subject matter regarding your questions and they have provided the following information:

As a standard data product, the Census of Agriculture would only have the number of farms reporting and the total area reported for onions by CCS. A custom request for data with Leon Laborde could address some of the additional data the researcher is asking for, on a cost-recovery basis, but there would likely be suppression even at the provincial level.

With respect to imports/exports, this information is available free through the Canadian International Trade Database (CIMT).

----------------------------------------------------------------------------------------------------------------------------------------

Below is the link to our Canadian International Trade Database (CIMT) to retrieve free HS06 trade data:

http://www5.statcan.gc.ca/cimt-cicm/home-accueil?lang=eng

1) Retrieve your data with one of the following options:

· Option 1: Select trading partner - Select a trading partner and specific variables (e.g., country, province, state, year, month, or frequency). Click on the appropriate button to either "Retrieve" the data or "Save as spreadsheet (CSV)".

· Option 2: Search by commodity or Harmonized System code - Click on "Search" and then on "Domestic Exports", "Re-exports" or "Imports" next to the commodity of your choice.

Use the multiple drop-down menus to change variables.

In some instances, hyperlinks are available and enable you to drill down to lower levels of detail. For example, clicking I-Live Animals and Animal Products shows a breakdown of all the commodity chapters found in that section, offering access to more detailed information. Use the "Retrieve" button to retrieve a different variable selection from the drop-down menus, "Save as spreadsheet (CSV)" to save the results in a tabulation-compatible document, and/or "Start Over" to begin again.

To obtain more information on international trade concepts, commodity classification, releases and revisions, please click on the related buttons in the left-hand side menu.

Friday, August 7, 2015

Updated Products - Labour Force Survey (LFS) July 2015

Labour Force Survey (LFS) – July 2015

LFS data for July 2015 are now available on the EFT site.

The Labour Force Survey estimates are based on a sample, and are therefore subject to sampling variability. Estimates for smaller geographic areas, industries, occupations or cross tabulations will have more variability. For an explanation of sampling variability of estimates, and how to use standard errors to assess this variability, consult the Data Quality section in the Guide to the Labour Force Survey.
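To make the idea of sampling variability concrete, here is a minimal sketch of two common ways a standard error is used: the coefficient of variation (CV) and an approximate 95% confidence interval. The function names and the example figures are hypothetical and are not taken from the LFS; the Guide to the Labour Force Survey remains the reference for how standard errors are actually published and interpreted.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation: the standard error as a percentage of the estimate."""
    return 100.0 * standard_error / estimate


def confidence_interval(estimate, standard_error, z=1.96):
    """Approximate confidence interval (z=1.96 corresponds to roughly 95%)."""
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical numbers for illustration only (not LFS figures).
estimate = 50_000        # e.g. estimated employment in a small area
standard_error = 2_500   # assumed standard error of that estimate

low, high = confidence_interval(estimate, standard_error)
print(f"CV: {cv_percent(estimate, standard_error):.1f}%")
print(f"Approx. 95% CI: {low:,.0f} to {high:,.0f}")
```

A smaller domain (a single industry within one region, say) typically has a larger standard error relative to its estimate, so its CV rises and its interval widens.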

The LFS guide:
http://www5.statcan.gc.ca/olc-cel/olc.action?ObjId=71-543-G&ObjType=2&lang=en&limit=0

EFT:
/MAD_DLI/Root/other-products/Labour Force Survey - lfs/1976-2015/data/micro2015-07.zip

CANSIM Population Estimate Tables

Question

A graduate student researcher wants to merge two CANSIM population estimate tables, 109-5335 (http://www5.statcan.gc.ca/cansim/a26?lang=eng&id=1095335) and 109-5325 (http://www5.statcan.gc.ca/cansim/a26?lang=eng&id=1095325) together. His initial question was a little confusing but, after speaking with him, this is his clearer rephrasing of what he sees as a problem with the tables:

My thoughts in terms of accuracy are that the more recent 109-5335 table is more accurate than the 109-5325 in the estimates after 2006 (especially those further from that year) given that census data should have been available to benchmark the 2011 year in the former table, where the population estimates for those years would be extrapolated (with help from admin data) in the latter table. In comparing the tables, it would seem that it is the post-censal estimates which show the greatest levels of disparity between the tables. Although intercensal populations are also estimates, I anticipate these will be more accurate vs projections given solid benchmarks as to where the true population values were and where they should lead to. Therefore, I see this as an accuracy concern and not a methodological concern.

If you could raise the question to Stats Canada, there may be issues of methodologies that provide a better understanding of the tables. I am also wondering why multiple census benchmarks have (seemingly) not been used in making the tables. I think these should have included 1996, 2001, 2006, and 2011. By the description of intercensal periods, it looks like 109-5325 uses 1996 estimates (skipping 2001) and 109-5335 uses 2001 estimates (skipping 2006) but I might be wrong.

I do not think that merging these two tables is a good idea, and that is what I told him. But if anyone has any additional information on the methodologies used for these two separate population estimate calculations, or the census benchmarks used in the two tables, other than what is provided on the CANSIM webpage, it would be much appreciated.

Answer


The subject matter division provided the following response:

First of all, the best recommendation would probably be to use CANSIM table 109-5345, which replaces CANSIM table 109-5335, as indicated in Footnote #1. This table has the most up to date and accurate population estimates for Health Regions over the 2001 to 2014 period.

Concerning the user’s concerns about accuracy, you can tell him that intercensal population estimates are generally more accurate (i.e. closer to reality) than postcensal estimates. This is because intercensal estimates are based on the two censuses preceding and following the year in question, whereas postcensal estimates are based on the most recent available census (to which is added the estimated demographic growth, based on various administrative data sources). Contrary to the user’s assumptions, we always use all census counts (adjusted for census net undercoverage and incompletely enumerated Indian reserves) that are available at the time the data are produced. I’m not sure what exactly led him to his assumptions; if he still has questions regarding the methodology for producing population estimates, you could provide him my contact info and I’ll gladly help him.
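The difference between the two kinds of estimates can be sketched in a few lines. This is a simplified illustration, not Statistics Canada's production method: it assumes the gap between the postcensal estimate and the next census count (the error of closure) is spread linearly over the period, and all names and numbers are hypothetical.

```python
def postcensal_series(base_population, annual_growth):
    """Build postcensal estimates: census base plus accumulated estimated growth."""
    series = [base_population]
    for growth in annual_growth:
        series.append(series[-1] + growth)
    return series


def intercensal_series(postcensal, next_census_population):
    """Revise postcensal estimates once the next census count is available.

    Assumption for this sketch: the error of closure is distributed
    linearly according to the time elapsed since the base census.
    """
    years = len(postcensal) - 1
    error_of_closure = next_census_population - postcensal[-1]
    return [p + error_of_closure * t / years for t, p in enumerate(postcensal)]


# Hypothetical region: base census of 100,000 and estimated growth of 1,000 a year.
postcensal = postcensal_series(100_000, [1_000] * 5)
# Suppose the next (adjusted) census count comes in at 105,500.
intercensal = intercensal_series(postcensal, 105_500)

for year, (post, inter) in enumerate(zip(postcensal, intercensal)):
    print(year, post, round(inter))
```

Because the intercensal series is anchored at both ends, its values agree with both census benchmarks, whereas the postcensal series can drift the further it gets from its base census.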

Thursday, August 6, 2015

CIHR

Question

I have a researcher who wants CIHR funded data that is supposedly stored at the University of Toronto. Has anyone had experience with obtaining such data?

Answer

That sounds like it would be original research data that was funded by the CIHR. Could be one of many funded proposals. My guess is that either it will be within the control of a faculty member, research group or department (especially if the data are sensitive), or perhaps a deposit made to the UoT Libraries. While there, I didn’t receive any CIHR-funded deposits but you might try contacting the RDM Working Group, who might be able to put out a message to ask internally.

Wednesday, August 5, 2015

Error in Postal Code Conversion File Plus (PCCF+) version 6B input file

An error in the pointer file for duplicate postal code records was discovered in version 6B of the Postal Code Conversion File Plus (PCCF+). This error will result in the improper processing of some postal codes. A replacement file will be made available soon. Until then, please refrain from using the PCCF+ 6B.

Tuesday, August 4, 2015

Coding urban / rural for FSAs from PCCF

Question

A researcher is doing analysis on the FSA level. She is using the PCCF and wants to create a dummy variable for 0=rural and 1=urban.

Would these methods be accurate, and if so, which would be more accurate? If not, would you have any suggestions for her to recode this dummy variable?

(A) recode SACtype (6-8 = 0) and (1-5 = 1)

(B) recode POP_CNTR_RA_type (0=0) and (1-4, 6 = 1)

From the PCCF reference guide

SACtype

1 Census subdivision within census metropolitan area
2 Census subdivision within census agglomeration with at least one census tract
3 Census subdivision within census agglomeration having no census tracts
4 Census subdivision outside of census metropolitan area and census agglomeration having strong metropolitan influence
5 Census subdivision outside of census metropolitan area and census agglomeration having moderate metropolitan influence
6 Census subdivision outside of census metropolitan area and census agglomeration having weak metropolitan influence
7 Census subdivision outside of census metropolitan area and census agglomeration having no metropolitan influence
8 Census subdivision within the territories, outside of census agglomeration

POP_CNTR_RA_type

0 Rural area
1 Core
2 Fringe
4 Population centre outside CMAs and CAs
6 Secondary core
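The two recodes proposed in the question can be written out as a short sketch, which makes it easier to apply both to the same records and see where they disagree. The function names are hypothetical, and the code assumes the PCCF values have already been read as integers.

```python
def urban_from_sactype(sactype):
    """Method (A): SACtype 1-5 -> 1 (urban), 6-8 -> 0 (rural)."""
    if 1 <= sactype <= 5:
        return 1
    if 6 <= sactype <= 8:
        return 0
    raise ValueError(f"unexpected SACtype: {sactype!r}")


def urban_from_pop_cntr_ra_type(code):
    """Method (B): POP_CNTR_RA_type 0 -> 0 (rural), 1-4 and 6 -> 1 (urban)."""
    if code == 0:
        return 0
    if code in (1, 2, 3, 4, 6):
        return 1
    raise ValueError(f"unexpected POP_CNTR_RA_type: {code!r}")


# Hypothetical record: a rural area (type 0) inside a census metropolitan area (SACtype 1).
record = {"SACtype": 1, "POP_CNTR_RA_type": 0}
a = urban_from_sactype(record["SACtype"])
b = urban_from_pop_cntr_ra_type(record["POP_CNTR_RA_type"])
print("Method A:", a, "Method B:", b, "agree:", a == b)
```

Cross-tabulating the two flags over the researcher's actual records would show how many postal codes are classified differently by each method.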

Answers

If you want to use the Canada Post definition:

For customers who have a rural address [e.g., Postal Code with a “0” (zero)] as the second character […] from <https://www.canadapost.ca/tools/pg/customerguides/Advance_CGbrm-e.pdf>

Read each character of the FSA as a single-byte string (e.g., Data list FSA1 to FSA3 1-3 (A) FSA 1-3 (A) … Value labels rururb 0 “Rural” 1 “Urban”. compute rururb=1. If (FSA2 = “0”) rururb=0.)
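The same Canada Post rule can be sketched in Python: an FSA whose second character is "0" is flagged rural, and anything else urban. The function name is hypothetical, and the sketch assumes each FSA is available as a three-character string.

```python
def rururb_from_fsa(fsa):
    """Return 0 (rural) if the second character of the FSA is '0', else 1 (urban)."""
    fsa = fsa.strip().upper()
    if len(fsa) != 3:
        raise ValueError(f"expected a three-character FSA, got {fsa!r}")
    return 0 if fsa[1] == "0" else 1


# Example FSAs for illustration.
for fsa in ["A0A", "M5V", "k0k"]:
    label = "Rural" if rururb_from_fsa(fsa) == 0 else "Urban"
    print(fsa.upper(), label)
```

Note that this flag reflects Canada Post's delivery-based definition, which is not the same as the census-based classifications in SACtype or POP_CNTR_RA_type.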

The subject matter has also provided the following feedback:

It is somewhat challenging to provide meaningful feedback (deciding on A or B) without knowing and understanding the context and details for creating a binary urban-rural classification by re-classifying the SACType or POP_CNTR_RA_TYPE variables. However, I can provide some feedback in the form of a small set of cursory assumptions (or possible cautionary notes) that may help the DLI client further her choice/research/application(s).

Based on the logic provided, the DLI client is re-classifying some of the 2011 Census geography variables. The following points are my assumptions.

- The latest vintage of the PCCF is the source of the SACType and POP_CNTR_RA_type variables/data.

- Feedback has been (or will be) provided on the nature and limitations of the PCCF with regards to postal geographies (FSA and FSA-LDU). This would include details regarding the Canada Post functional urban/rural indicator expressed in the FSA itself.

For Option A:

- The client is not interested in distinguishing between POP_CNTRs inside CMAs and CAs and POP_CNTRs outside CMAs and CAs (where POP_CNTR_RA_type is recoded (1-4, 6 = 1)). The residual area inside and outside CMAs and CAs is defined as “rural”.

For option B:

- Areas outside CMAs and CAs that are classified as strong and moderate metropolitan influence (MIZ) are aggregated/grouped with CMAs and CAs to form a single “urban” class (where SACType is recoded (1-5 = 1)).

Note: there is no concern that rural areas (StatCan’s rural) exist in most CMAs, CAs, and areas classified as strong and moderate MIZ.

- Areas classified within the Territories and weak and no metropolitan influence (MIZ) are aggregated/grouped to form a single “rural” class (where SACType is recoded (6-8 = 0)).

Note: there is no concern that the StatCan defined POP_CNTRs exist in most areas classified as moderate and no MIZ, and the Territories.

I can sense that option A may be more appropriate – again, without some context it is challenging to provide meaningful advice.

Please let me know if more information is needed or if there are any details regarding the context of the research/application that have not been shared with me.