Thursday, July 30, 2015

Comparing income inequality by FSAs between 2006 and 2011

Question

A researcher is having difficulty comparing income inequality (for economic families or households) by FSA between 2006 and 2011. We would like to ask for help finding this information, please.

1. Is it possible to get standard error measures for 2011, as there are for 2006 (as shown in purple below)? How would we go about getting this?

2. If that proves difficult, the researcher would like to calculate a Gini coefficient, but for that she needs total income (by FSA) for all economic families and/or all households. The values highlighted in yellow seem too low to represent total income. Do they represent the total number of economic families/households?

3. If so, how would we go about getting the sum total of all income (of all economic families/households) by FSA (used to calculate medians and averages)?

Illustrations:

In 2006, the Census profile 94-581-XCB2006003 by FSA shows these measures:
Profile of Forward Sortation Areas
E.g. Canada (01)   20000
Family income in 2005 of economic families - 20% sample data: 8,782,350
  Median family income $: 66343
  Average family income $: 82325
  Standard error of average family income $: 83
Household income in 2005 of private households - 20% sample data: 12,437,470
  Median household income $: 53634
  Average household income $: 69548
  Standard error of average household income $: 63

a. Personal income (not wanted)

In 2011, the NHS Profile by FSA shows these measures - EO1976_ID579862

2011 NHS
Canada (26.1%)
Household total income in 2010 of private households: 13,319,250
Household income in 2010 of private households: 13319255
  Median household total income $: 61072
  Average household total income $: 79102
Family income in 2010 of economic families: 9,254,165
  Median family income $: 76511
  Average family income $: 94125


Answer

I consulted the responsible subject matter division and they have advised on your questions in red below:

1) The voluntary 2011 NHS replaced the mandatory census long form. As with most voluntary surveys, there are more risks associated with non-response in the NHS. In order to limit the effects of non-response, the systems and methodology normally used for the census long form were changed to introduce new methods at the collection, sampling and estimation stages.

The NHS now uses a two-phase sample design, compared to the single-phase sample design of the census long form. This makes the estimation methodology more complex for the 2011 NHS. A two-phase sample design involves selecting a first-phase sample, with data mostly collected via Internet and mail returns. The second phase consists of selecting and following up on a sub-sample of the non-respondents from the first phase via more effective modes, i.e., telephone and in-person interviews.

Standard errors (SE) released in previous censuses were derived assuming a simple random sample without replacement. The same assumption cannot be made for the 2011 NHS, because of its more complex design.
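For reference, the simple-random-sample assumption mentioned above corresponds to the textbook standard error of a mean with a finite population correction. The following is a minimal sketch of that generic formula only; the function name and inputs are illustrative assumptions, and it is not necessarily the exact procedure used for the published 2006 figures.

```python
import math


def srswor_standard_error(sample_values, population_size):
    """Standard error of a sample mean under simple random sampling
    without replacement, including the finite population correction."""
    n = len(sample_values)
    if n < 2:
        raise ValueError("need at least two observations")
    if population_size < n:
        raise ValueError("population cannot be smaller than the sample")
    mean = sum(sample_values) / n
    # Unbiased sample variance (divides by n - 1).
    variance = sum((x - mean) ** 2 for x in sample_values) / (n - 1)
    # Finite population correction: shrinks the variance as n approaches N.
    fpc = 1 - n / population_size
    return math.sqrt(fpc * variance / n)
```

Under a two-phase design like the NHS, this formula no longer applies as is, which is the point subject matter makes above.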

2) It is the number of economic families or the number of households, depending on which total line you are looking at. For definitions of economic families and households, the researcher can refer to the Census Dictionaries for the respective years:

<http://www12.statcan.gc.ca/census-recensement/2006/ref/dict/index-eng.cfm>
<http://www12.statcan.gc.ca/census-recensement/2011/ref/dict/index-eng.cfm>

3) We do not have this available via our standard products. Your best bet would be to contact the closest regional office for a possible custom data request.
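In the meantime, a rough aggregate can be derived from the published figures, since an average is the total divided by the count. The sketch below is only an approximation under that assumption: the published counts and averages are rounded, so the result will not match an official aggregate, and it does not replace a custom tabulation. The function name is illustrative.

```python
def approximate_aggregate_income(average_income, unit_count):
    """Approximate total income as average income times the number of units."""
    return average_income * unit_count


# Canada-level figures quoted in the question (2006 Census profile, 2005 incomes).
family_total_2005 = approximate_aggregate_income(82325, 8782350)
household_total_2005 = approximate_aggregate_income(69548, 12437470)
print(family_total_2005, household_total_2005)
```

The same arithmetic could be applied FSA by FSA, keeping in mind that rounding error matters more for small areas.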

Tuesday, July 28, 2015

Land Area for CDs and CSDs

Question

I've got a researcher looking for land area for CDs and CSDs in Ontario for the 1961 and 1966 censuses. I've been looking through our hardcopy holdings and archive.org, and am not finding anything yet.

Does anyone have suggestions for where to find these? Alternatively, are there boundary files available anywhere?

Answer

Here is the response from subject matter:

We have no boundary files for the 1961 nor the 1966 Censuses. As for land area, I did find some information in our old hardcopy publications, for 1961, at least. This publication (hard cover, red with gold lettering): Census of Canada — 1961 — 1.1 (Vol. 1) Population — Geographical Distributions has two tables that may be useful:

Table 2: Area and density of population for counties and census divisions, 1961

Table 3: Area and density of population for incorporated cities, towns and villages of 2,500 and over, 1961.

They contain a column measuring land area in square miles. If you don’t have that publication, we can scan the tables for you. Please note that the land area was calculated expressly for the purpose of calculating population density only. Also, at that time, the calculation was very complex. People should use these numbers with caution. The mindset we have today of quantifying and measuring everything did not exist then.

For 1966, all I could find were two publications similar to today’s interim lists:

Changes in Municipal Boundaries — Annual Report, December 1968, No. 6

Changes in Municipal Boundaries — Annual Report, December 1969, No. 7.

So, these documents give changes following the 1966 Census that occurred before the 1971 Census. I’m not sure that timeline is what you’re looking for. However, within each are these tables:

Table 1: Municipalities newly created, by province, November 1968 — October 1969

Which includes an Area in square miles value, although not for every municipality listed. There is also:

Table 3: Amalgamations and annexations of municipalities, by province, November 1968 — October 1969

Which includes an Area in square miles, but it measures only the annexed area, not the total area.

I went back and found similar interim lists between 1961 and 1966, but those had no land area information at all.

CCHS Question

Question

I have a student who’d like to know if it’s possible to separate the data from the two-year PUMFs (e.g., 2007-2008) into individual years. My hunch is no, but I told her I’d get a more definitive answer.

Answer

I confirmed with subject matter, and you are correct: it is not possible to separate the years in the two-year PUMF for CCHS 2007-2008. However, individual years are available in the research data centres (see the list of datasets available in the RDCs).

Monday, July 27, 2015

CSGVP 2010 Variables

Question

We are working with the 2010 CSGVP data at the moment, and I understand there is a corresponding data file in which the missing values have been imputed for the same data. This is not a publicly released file. Is there a way to gain access to it? The variables we are interested in have many missing values.

Answer


When the researcher says they are working on the 2010 CSGVP data, are they working with the PUMF or the Masterfile (in the RDCs)? The files staged for the RDCs do not include declared missing values.

The missing values for the 2010 CSGVP for the PUMF are declared in the syntax files.

Friday, July 24, 2015

1991 and 1996 CSD Income for the 65 and Over

Question

One of our students is looking for the income for age 65+ at the CSD level. He needs stats from 1991 – 2006. I found the Topic Based Tabulations for the 2006 & 2001 Census including income, age and geography he is looking for. Unfortunately, I’m not able to find anything similar for the 1991 & 1996. Do you have any ideas where to look for it?

Answer

The following are available on the DLI Beyond 20/20 Web Data Server (WDS) <https://dli-idd.statcan.gc.ca/wds/>

For 1996
Public reports > Demographics and Population > Census of Population - cp > 1996 > Basic Summary Tabulations-B2020 > Census Subdivisions
Total Population 15 Years and Over by Presence of Income (5) and Sex (3), Showing Age Groups (5A), for Canada, Provinces, Territories, Census Divisions and Census Subdivisions, 1996 Census (20% Sample Data)
And others...

For 1991
Public reports > Demographics and Population > Census of Population - cp > 1991 > Basic Summary Tabulations - Census Subdivisions

Number, aggregate and average 1990 total income and employment income (7) of population 15 years and over by sex (3) (20%)

Population 15 years and over by sex (3) and 1990 income groups (9) (20%)

1971 CSD Boundary Files

Question

Does anyone know where I might find CSD boundaries for the 1971 census? There’s an attribute table on the EFT, but the researcher who’s asking doesn’t think he can convert it to a shapefile. 
If CSD boundary files aren’t available, the researcher says CD boundaries will suffice, if anyone has those.

Answer

I consulted subject matter and they don’t have any other files available for this.

There is an SPSS program that converts files from ASCII to SPSS. If you run it, you’ll have an SPSS file that you can save as a dBase file that includes UTM and Lambert coordinates, which should be mappable within a GIS. However, it’s not a boundary file. I surmised that since there is only one record per Enumeration Area, the file records the centroid of the enumeration area (rather than the boundaries). Looking at geog-attrib-file-71-guide.pdf confirmed that (page 13). So I’m not sure it will do what the researcher wants.

Unfortunately, there’s not enough information in the attributes file to produce polygons, which is what the researcher really wants. Still, I’ll let him know that this is available. It may be as close as he’s going to get.

A few months ago I converted the 1971 gtf71.txt file and the embedded UTM coordinates to a .shp file with lat/long. While it is not a boundary file, each point represents the population weighted centre of each EA along with a population count in the attribute table.

Or as stated in the related PDF guide: for each EA the geocoding programme has chosen a point (centroid) situated at approximately the centre of gravity of the enumeration areas population (page 13). 

We can send these over if they interest the researcher.

Thursday, July 23, 2015

Updated Products - ICO Q2 2015

Please note the products listed below and the path to access them via the DLI EFT.

Inter-corporate ownership (ICO) – Q2 2015

This product is a directory of corporate ownership in Canada that provides information on every individual corporation that is part of a group of commonly controlled corporations with combined assets exceeding $600 million or combined revenue exceeding $200 million. Individual corporations with debt obligations or equity owing to non-residents exceeding a net book value of $1 million are covered as well.

Ultimate corporate control is determined through a careful study of holdings by corporations, the effects of options, insider holdings, convertible shares and interlocking directorships.

The information presented is based on non-confidential returns filed by Canadian corporations under the Corporations Returns Act and on research using public sources such as Internet sites. Entries for each corporation provide both the country of control and the country of residence.

/MAD_DLI/Root/other-products/Inter-corporate ownership – ico

2013 National Graduates Survey

Question

Would it be possible to get the questionnaire for the 2013 National Graduates Survey added to both the FTP site and to the Nesstar record?

Answer

The questionnaire has been added to the EFT and the Nesstar study.

Tuesday, July 21, 2015

Dairy Factory Production and Stocks

Question

A researcher here would like to know if we have access to data associated with a survey relating to Dairy Factory Production and Stocks. More specifically, he is looking for the monthly production of skim milk powder (tonnes) in Ontario from July 1998 to July 2014. In CANSIM table 003-0029 most of these cells are x'ed out. He would settle for yearly production numbers.

He has consulted the Data on dairy stocks in CANSIM table 003-0033 <http://www5.statcan.gc.ca/cansim/a26?lang=eng&retrLang=eng&id=0030033&paSer=&pattern=&stByVal=1&p1=1&p2=31&tabMode=dataTable&csid=>, but it doesn't have the data he's looking for.

Answer


I'm sorry to report that subject matter has confirmed that this data is suppressed for confidentiality within the CANSIM tables and is also not available through a custom tabulation.

Data from CIHI

Question

One of our students is seeking data on the prevalence of fatigue in Canada [Billing code from the ICD-10 (R53)].

Is this something she could access through DLI's pilot? If so, where do we start?

Or is this something she needs to request through the Graduate Student Data Access Program? (I read through the conditions and she's eligible to use the service.)
https://www.cihi.ca/en/data-and-standards/access-data/the-graduate-student-data-access-program-gsdap

Answer

The Discharge Abstract Database (DAD) captures administrative, clinical and demographic information on hospital discharges (including deaths, sign-outs and transfers). CIHI is running a pilot project in which registered DLI users can access two research analytical files that contain de-identified samples from the Discharge Abstract Database (DAD) for several sample years. Each file includes record-level data; one file focuses on clinical data, the other on geographic information. The DAD data files include information on key demographic, clinical and case mix variables.

Because of licensing restrictions, these files are available through the DLI’s Electronic File Transfer (EFT) service, which the local DLI contact can access. Please contact your local DLI contact, whose information is below, for a copy of those files.

She should also be able to estimate from the DAD - bearing in mind that the populations of BC and Quebec are excluded from the samples.

For example, in the 2013 sample, 595 records reported R53 in "ICD10 Diagnostic code 1: most responsible for patient during hospitalization"; it's recorded in 17 of the 25 ICD variables. I didn't create any flags for individual ICD10 codes - R53 is collapsed into the ICDF214 flag variable, which perhaps should have fatigue added to the variable label.
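A count like the one above could be reproduced along these lines. This is a minimal sketch only: the field names (DIAG_1 through DIAG_25) are hypothetical placeholders, not the actual DAD variable names, and the records are assumed to have already been loaded as dictionaries.

```python
def count_code(records, code_prefix, diagnosis_fields):
    """Count records where a code appears as the first-listed diagnosis,
    and records where it appears in any of the listed diagnosis fields."""
    primary = 0
    any_position = 0
    for record in records:
        # Normalize each diagnosis value; missing values become empty strings.
        codes = [(record.get(field) or "").strip().upper()
                 for field in diagnosis_fields]
        if codes and codes[0].startswith(code_prefix):
            primary += 1
        if any(code.startswith(code_prefix) for code in codes):
            any_position += 1
    return primary, any_position


# Hypothetical field names; substitute the real ICD-10 variable names.
fields = [f"DIAG_{i}" for i in range(1, 26)]
```

Any prevalence estimate built this way would still need to account for the sampling and for the exclusion of BC and Quebec noted above.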

Updated Products: General Social Survey (PUMF) 2013

Please note that the GSS PUMF Cycle 27 was updated.

The updates are listed below, along with the path to access the files via the EFT site.

Updates:

- Truncated response category labels: Full text added to the variable notes;
- Spouse/partner variables: Response category for "no spouse/partner in the household" and universe statements revised.

EFT

/MAD_DLI/Root/other-products/General Social Survey - gss/cycle27-2013

Question regarding CCHS

Question

I have a question from a student. He is requesting “information on how the specific question was selected/developed/tested/validated for the survey.” I think he is interested to see what processes, if any, are used in developing these types of survey questions.

Appreciate any information you might be able to acquire.
This is the Statistics Canada measure that we talked about:

SDC_Q7B
SDC_7AA

Do you consider yourself to be:

1 ... heterosexual? (sexual relations with people of the opposite sex)
2 ... homosexual, that is lesbian or gay? (sexual relations with people of your own sex)
3 ... bisexual? (sexual relations with people of both sexes) DK, RF

From: Canadian Community Health Survey (CCHS) Annual Component - 2014 Questionnaire http://www23.statcan.gc.ca/imdb-bmdi/instrument/3226_Q1_V11-eng.pdf

Answer

Please see below for response from subject matter,

The SDC module questions originate from the National Population Health Survey. All questions in the Canadian Community Health Survey are qualitatively tested prior to being asked of Canadians. The most recent qualitative testing of these SDC questions for the CCHS took place from February to March 2014. During the qualitative testing sessions, all questions in this module, including the sexual orientation question, were answered without any additional questions or reservations on the part of the respondents.

Question SDC_7AA on self-identified sexual orientation has been part of the CCHS since Cycle 2.1, and apart from age requirements has remained the same since its introduction into the survey.

Some relevant information on asking similar questions in social surveys can be found from the following academic sources:

Michaels, S. and Lhomond B. 2006. Conceptualization and measurement of homosexuality in sex surveys: a critical review. Cad. Saúde Pública, Rio de Janeiro, 22(7): 1365-1374.

Smith et al. 2003. Sex in Australia: Sexual identity, sexual attraction and sexual experience among a representative sample of adults. Australian and New Zealand Journal of Public Health. 27(2): 138-145.

GSS -- Time use

Question

I see that there was a Cycle 29 pilot for which the data is not being released: http://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&SurvId=152781&InstaId=152570&SDDS=4503

Does "pilot" imply that there will be a full survey coming soon for which the data will be released — perhaps in 2015? Could this be confirmed, and could we get a tentative date of release? I don't see any mention of this at http://www.statcan.gc.ca/eng/dli/prod_date.

Answer

Here is the response from subject matter

Yes, Cycle 29 is in the pilot phase right now and will be in collection next year. Whenever we run a GSS cycle, we always do a pilot a year before. The release dates are not yet finalized at this point, but I would say late 2016 as a rough estimate. We're still in collection of Cycle 28 right now.

Questions about 1986 geographic (DCF) files

Question

What is the difference between

· Agec86cd.e00 and crd86.e00 (agriculture ecumene files)
· Popin86.e00, popline86.e00, and popout86.e00 (population ecumene files)

I have converted these (and all the other 1986 files) to shape file format, but I’m not sure why there are the multiple files, or what they hold. Neither the b01.pdf file nor the inventories explain this.

Answer


Most likely these are a combination of the spatial boundary files that make up the ecumene boundaries.

For example the population ecumene usually consists of 3 separate layers including the ecumene boundary layer, the census divisions boundary layer, and the province boundary layer. However those names don’t really line up.

What do the converted shp files show you? Can you derive from the attribute table what they may cover?

After consulting with subject matter I hope this helps clear up the confusion:

AG
Agec86cd.e00:

This file seems to follow CDs in agricultural areas of Canada. Though in sparser areas the file takes into account the ecumene surrounding specific CSDs.

So in areas like populated southern Ontario and the prairies (the southern halves of Manitoba and Saskatchewan and eastern Alberta), the file follows the CDs. In sparser areas, a buffer appears to have been created that follows the smaller, more populated areas.

An example of this would be CSD 5939007 where a buffer has been created around it and added to Agec86cd.e00. See the below screenshot (CSD = orange, Agec86cd.e00 = blue)



The data associated with Agec86cd.e00 (note there is no CSD data)

OBJECTID 410
SHAPE Polygon
AREA 650083300
PERIMETER 93379.83
AGEC86_CD2_ 53
AGEC86_CD2_ID 317
CD86 5939
PR_ID 59
CD 39
e00_centroid_x 4509025.5
e00_centroid_y 2022825.6

The field CD86 references the CD around it. Other polygons may overlap several CDs (if they are in the less populated areas) and have been given the value of a CD that they seem to overlap, so there are duplicates in this field.

There is no apparent UID – field AGEC86_CD2_ seems to be it – and it is just sequential numbers.

crd86.e00:

This file covers only Alberta and southern regions of Saskatchewan and Alberta. The file seems to group CSDs or CCSs but not CDs. No buffering or adjustments have been made to this file. It does contain data names that may correspond with agricultural regions (CRDNAME examples include 1A and 3AS (Saskatchewan), and numbers 1 to N (Alberta, Manitoba)).

OBJECTID 76
SHAPE Polygon
AREA 26058500000
PERIMETER 734625
CRD86_ 36
CRD86_ID 36
GLABEL CRD4781
TILE SAS8A
TAF S08A
CRDNAME 8A
e00_centroid_x 5464762.5
e00_centroid_y 1933594.3


The two bolded fields hold the name mentioned above (CRDNAME) and a unique ID – the label (GLABEL).

POP
popin86.e00:

Ecumene of populated areas – follows the CD boundaries in densely populated areas but in more northern areas or less densely populated areas it may be a buffer of a CSD (like the Ag file above) OR it may be what appears to be a manually created polygon that takes two CSDs and joins a region in between or on the edge of one where it may be more densely populated. Has a CDUID associated with it:

OBJECTID 435
SHAPE Polygon
AREA 3013117000
PERIMETER 275614.5
POPIN86_ 89
POPIN86_ID 116
CD86 4718
e00_centroid_x 5552235.5
e00_centroid_y 2057639.6
popline86.e00:

The above file but no divisions into CDs (all of the prairies is one polygon; all of the western side of the St. Lawrence/Southern Ontario region is one polygon).

popout86.e00:

The same as popin86 but the inverse. Follows CD boundaries but has no populated regions included. It does contain more northern areas (and non-coastal) that are in popin86. But these can be excluded using the CD86 field where the value is “OUT” – this will also exclude Great Bear and Great Slave lakes.

Here is a screenshot of the three together:
Blue lines: popin
Pink: popout
Grey line: popline




EXPORTING FILES:

If you export to a shapefile from the “umbrella” of the E00 file it will export all the data (including the “empty” files).

If you export only the polygon file with attributes you should be okay and only get the polygon file.

Monday, July 20, 2015

Quality of Socio-Cultural CT-level Data from the National Household Survey

Question

A francophone student has been asked to give a conference presentation on changes to urban aboriginal women’s housing (a fair bit of migration within the city). He wants to know if he should make the presentation or not, as he is worried about the validity of his results re: data quality and CT-level non-response rates.

As his research was at the CT level, I gave him two sources to help assess his data quality. Did I miss any technical documentation? The first source is a StatCan web page, and I compiled the second source from the NHS profile (for which there may already be a list, but I wasn’t able to find it today).

1) Statistique Canada (2013), ENM Liste des secteurs de recensement (SR) non diffusés.
http://www12.statcan.gc.ca/nhs-enm/2011/ref/sup_CT-SR-fra.cfm

2) Global Response Rate – Census Tracts – NHS 2011 / Taux global de réponse - Secteurs de recensement - ENM 2011

The attached table lists the non-response rates of non-suppressed Census Tracts from the National Household Survey, in other words, those with a non-response rate of less than 50 %.

Answer

I consulted the subject matter division regarding your questions, please find their responses below:

As far as information about Census Tracts and the 2011 NHS goes, you have provided all of the correct links. I would just add to that the Aboriginal Peoples Technical Report and the NHS User Guide.

French version of the Technical Report: http://www12.statcan.gc.ca/nhs-enm/2011/ref/reports-rapports/ap-pa/index-fra.cfm

And the NHS User Guide: http://www12.statcan.gc.ca/nhs-enm/2011/ref/nhs-enm_guide/index-fra.cfm

In particular this note from the Aboriginal Peoples Technical Report would be of use:

Cross-classification of housing variables
Housing variables are often crossed with other variables in a table to allow more in-depth analysis of a given topic. Data users should note that estimates will be subject to greater variability due to sampling error when examining small populations, whether by selecting small geographic areas or by crossing several variables.

Friday, July 17, 2015

PCCF+ version 6B official release

Postal Code(OM) Conversion File Plus version 6B (official release)

This is the official release of the Postal Code(OM) Conversion File Plus (PCCF+) version 6B based on the 2011 Census. This file reflects postal code data from the Canada Post Corporation up to and including November 2014.
The Postal Code(OM) Conversion File Plus (PCCF+) is a SAS© control program and set of associated datasets derived from the Postal Code(OM) Conversion File (PCCF), a postal code population weight file, the Geographic Attribute File, Health Region boundary files, and other supplementary data. PCCF+ automatically assigns a range of Statistics Canada’s standard geographic areas and other geographic identifiers based on postal codes. The PCCF+ differs from the PCCF in that it uses population-weighted random allocation for postal codes that link to more than one geographic area.
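To illustrate the idea of population-weighted random allocation described above, here is a minimal sketch. It is not the PCCF+ SAS program's actual logic; the function name, the area identifiers and the weights are all hypothetical and serve only to show the general technique.

```python
import random


def allocate_postal_code(candidates, rng=random):
    """Pick one geographic area for a postal code that links to several,
    with probability proportional to each area's population weight.

    candidates: list of (area_id, population_weight) pairs.
    """
    areas = [area for area, _ in candidates]
    weights = [weight for _, weight in candidates]
    # random.choices draws with replacement; k=1 gives a single allocation.
    return rng.choices(areas, weights=weights, k=1)[0]


# Hypothetical example: a postal code linked to three areas.
links = [("area_A", 120.0), ("area_B", 30.0), ("area_C", 50.0)]
chosen = allocate_postal_code(links, random.Random(2014))
```

Passing a seeded random.Random instance makes the allocation repeatable, which helps when the same input file is processed more than once.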
What’s new?

The postal code reference date for the Postal Code(OM) Conversion File (PCCF) and the Postal Code(OM) Conversion File Plus (PCCF+) is November 2014.

This release has been updated to include 2014 health region boundaries. Note that Ontario Public Health Units are now shown as ‘alternate health regions’, as in earlier versions of PCCF+.

Records with the same ID and postal code appearing more than once in the input dataset will now be assigned to the same geography (similar to PCCF+ Version 5K).

The weighted conversion file (WCF) includes revised weights for Indian reserves for 2011 (similar to PCCF+ Version 5K).

Where postal codes in the PCCF are not completely geocoded (missing DA), they will now be coded from the first five characters, using census population weights.

The residential flag (ResFlag) field has been updated to be more conservative with respect to the non-residential flag (-) and to be more inclusive with respect to the residential flag (+).

The institutional flag (InstFlag) field has been updated and as a result, the hospital flag (HOSP) has been removed as it is now redundant.

Users can now read in text files.

The final output datasets are now exported as .txt and .csv files.

The coding precision (Prec) field has been redefined to more meaningfully describe the precision of the geographic coding of each record by PCCF+.

EFT : /MAD_DLI_PCCF/Root/Health-PCCF-plus-Sante-FCCP-plus

Thursday, July 16, 2015

PCCF+ version 6B release date

Question

When will these be available through the DLI?

Postal Code(OM) Conversion File Plus version 6B (official release)

This is the official release of the Postal Code(OM) Conversion File Plus (PCCF+) version 6B based on the 2011 Census. This file reflects postal code data from the Canada Post Corporation up to and including November 2014.
The Postal Code(OM) Conversion File Plus (PCCF+) is a SAS© control program and set of associated datasets derived from the Postal Code(OM) Conversion File (PCCF), a postal code population weight file, the Geographic Attribute File, Health Region boundary files, and other supplementary data. PCCF+ automatically assigns a range of Statistics Canada’s standard geographic areas and other geographic identifiers based on postal codes. The PCCF+ differs from the PCCF in that it uses population-weighted random allocation for postal codes that link to more than one geographic area.
What’s new?

The postal code reference date for the Postal Code(OM) Conversion File (PCCF) and the Postal Code(OM) Conversion File Plus (PCCF+) is November 2014.

This release has been updated to include 2014 health region boundaries. Note that Ontario Public Health Units are now shown as ‘alternate health regions’, as in earlier versions of PCCF+.

Records with the same ID and postal code appearing more than once in the input dataset will now be assigned to the same geography (similar to PCCF+ Version 5K).

The weighted conversion file (WCF) includes revised weights for Indian reserves for 2011 (similar to PCCF+ Version 5K).

Where postal codes in the PCCF are not completely geocoded (missing DA), they will now be coded from the first five characters, using census population weights.

The residential flag (ResFlag) field has been updated to be more conservative with respect to the non-residential flag (-) and to be more inclusive with respect to the residential flag (+).

The institutional flag (InstFlag) field has been updated and as a result, the hospital flag (HOSP) has been removed as it is now redundant.

Users can now read in text files.

The final output datasets are now exported as .txt and .csv files.

The coding precision (Prec) field has been redefined to more meaningfully describe the precision of the geographic coding of each record by PCCF+.

For more information on this product:
http://www5.statcan.gc.ca/olc-cel/olc.action?lang=en&ObjId=82F0086X&ObjType=2

Answer

We are aiming to have this available to the DLI on the EFT for Monday, July 20th.

Wednesday, July 15, 2015

The one where the web site's change language button does not work

Question

I don’t think this has been reported. Many French and English links (from the change language button to the top right) are not working on the traditional STC web pages. Here are just a few of the pages (and their subpages) where the French and English change language links do not work. Is this because these pages are about to be phased out? We can still get to the correct pages by manually substituting the eng with fra for these pages.

https://www12.statcan.gc.ca/nhs-enm/2011/ref/index-eng.cfm
https://www12.statcan.gc.ca/nhs-enm/2011/ref/index-fra.cfm

https://www12.statcan.gc.ca/nhs-enm/2011/dp-pd/index-fra.cfm
https://www12.statcan.gc.ca/nhs-enm/2011/dp-pd/index-eng.cfm

https://www12.statcan.gc.ca/nhs-enm/2011/as-sa/index-fra.cfm
https://www12.statcan.gc.ca/nhs-enm/2011/as-sa/index-eng.cfm

https://www12.statcan.gc.ca/nhs-enm/2011/rt-td/index-fra.cfm
https://www12.statcan.gc.ca/nhs-enm/2011/rt-td/index-eng.cfm

https://www12.statcan.gc.ca/nhs-enm/news-nouvelles/index.cfm?Lang=ENG&NEWS_TYPE_ID=3
https://www12.statcan.gc.ca/nhs-enm/news-nouvelles/index.cfm?Lang=FRA&NEWS_TYPE_ID=3

https://www12.statcan.gc.ca/AAS-DAR/index-fra.cfm
https://www12.statcan.gc.ca/AAS-DAR/index-eng.cfm

https://www12.statcan.gc.ca/census-recensement/index-fra.cfm
https://www12.statcan.gc.ca/census-recensement/index-eng.cfm

https://www12.statcan.gc.ca/census-recensement/2011/geo/index-fra.cfm
https://www12.statcan.gc.ca/census-recensement/2011/geo/index-eng.cfm

Answer


I passed your observations on to Dissemination Division, and they provided me with the following response –

At first I could not reproduce the problem. The toggle works fine for me on both Net A and Net B. However, I then noticed that all the links that were provided are “https” whereas they should be “http.” If you use “https” the toggle does not work. You can ask the client to try using “http” and see if it resolves the problem. I am curious to know why they are using “https” though. Perhaps the issue was in copying and pasting the links.
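For anyone with a long list of such links, the two workarounds mentioned in this thread (switching "https" to "http", and manually swapping "eng" and "fra") can be scripted. This is a minimal sketch assuming the URLs follow the "index-eng.cfm" / "index-fra.cfm" pattern shown in the question; the function names are illustrative.

```python
def to_http(url):
    """Rewrite an https:// link to http://, leaving other links unchanged."""
    prefix = "https://"
    if url.startswith(prefix):
        return "http://" + url[len(prefix):]
    return url


def swap_language(url):
    """Swap the -eng. / -fra. language marker in a page name, if present."""
    if "-eng." in url:
        return url.replace("-eng.", "-fra.")
    if "-fra." in url:
        return url.replace("-fra.", "-eng.")
    return url


link = "https://www12.statcan.gc.ca/nhs-enm/2011/ref/index-eng.cfm"
print(to_http(link))
print(swap_language(link))
```

Links that carry the language in a query string (for example Lang=ENG) are not handled by this sketch.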

T1 Family File for census tracts

Question

1) A researcher here is interested in the “T1 Family File”, which appears to be Annual Income Estimates for Census Families and Individuals: http://www23.statcan.gc.ca/imdb/p2SV.pl?Function=getSurvey&SDDS=4105&lang=en&db=imdb&adm=8&dis=2

2) As indicated in an earlier DLI posting (October 28, 2010) there are a group of tables on this in CANSIM: http://www5.statcan.gc.ca/COR-COR/COR-COR/objList?lang=eng&srcObjType=SDDS&srcObjId=4105&tgtObjType=ARRAY which is great if you are looking for Canada, provinces, or CMA-CA and non CMA-CA.

3) However, there seems to be some indication that this data might be available for Census Tracts: http://www23.statcan.gc.ca/imdb-bmdi/document/4105_D5_T1_V11-eng.pdf on page 12 / 61, for example, the first paragraph mentions “census tracts”.

4) Can you please confirm if T1FF data is available to researchers at the Census tract level, via the RDC, custom tabulations, … ?

Answer

I contacted subject matter regarding your questions and they responded with the following:

“comments are made on each question separately:

1) This link provides information on T1FF data, which are released annually; the most current year is 2013. The researcher can read the details as required.

2) The enclosed file provides a list of all our standard tables with their corresponding CANSIM numbers, the year each table started being available, a description of each, etc. Canada, provinces and CMA geography is available from reference year 2000; CA geography is available from 2008.

3) Census tract data are not available on CANSIM; however, they can be purchased for all the standard products described in the enclosed file for a cost-recoverable fee. Cost depends on the number of tables and years requested. Census Tracts can be obtained via custom extraction. I won’t go into a lot of detail here, but custom requests are much more complex and more costly than standard products.”

4) If the researcher would like an estimate for the custom tabulations, please have them provide the specific tables and years. Alternatively, I believe it would be possible for the researcher to obtain Census Tract data through the RDC; however, the process to get access is quite extensive and may not meet the researcher's time requirements.

Tuesday, July 14, 2015

GSS on Odesi - 2004 Cycle 18 - Victimization

Question

I have a question about Odesi. I tried to download the dataset for the GSS 2004_cycle 18, Victimization, Main file. But the file I downloaded was labelled: gss-12M0018-E-2002-c-18-m.zip.

Every other dataset I downloaded from the site was labelled with the proper year. I was wondering if it is the dataset for the GSS 2004_cycle 18 and if not, how could I obtain it.

Answer

The file the user downloaded from <odesi> is indeed the correct 2004 file; however, it is mislabeled. I checked the Statistics Canada user guide to verify the frequency counts and they match the file. We will replace the file with the correct label.

Commuting flows table - interpretation issue or error?

Question

I have a researcher who was looking to use NHS commuting flows table, "Commuting Flow - Census Subdivisions: Sex (3) for the Employed Labour Force Aged 15 Years and Over Having a Usual Place of Work, for Census Subdivisions, Flows Greater than or Equal to 20" (99-012-X2011032) for one of his classes this year, but he's not sure he can trust the information he's seeing.

The CSD he brought to my attention is the village of Gagetown (1304005). First off I'll tell you that it has a population of 698 and is suppressed in the NHS profile site. Now to the juicy stuff. When my researcher uses the above mentioned commuting flow table and looks at Gagetown VL as POW (GNR 36.1) six place of residence CSDs appear. It is at this stage that I have the question: how do we interpret these numbers (see table below)?

POW (Gagetown VL)

POR                Total
Lincoln P             60
Burton P              95
Oromocto T           410
New Maryland VL       20
Fredericton CY        55
Surrey CY             20

The obvious interpretation is that there are roughly [660] people who commute to the village of Gagetown for their usual place of work (plus any numbers which didn't make the 20ppl/CSD threshold); however, that doesn't make a lot of sense when you think about the reality of the situation. First, as I mentioned above, there are about 700 people who live in the village and presumably some of them work there, too. Then when you look at the patterns above it seems even more suspicious. My researcher and I wonder if 5th Canadian Division Support Base Gagetown, formerly and still locally known as CFB Gagetown, got confused with the village of Gagetown. Are we just misinterpreting the numbers or does this seem possible?
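
The rough total of 660 can be checked by summing the six published flows. The snippet below is only an illustrative sketch: the figures are copied from the table and the population count quoted in this question, and the variable names are invented for the example.

```python
# Flows into Gagetown VL as place of work, copied from the table in the question.
flows = {
    "Lincoln P": 60,
    "Burton P": 95,
    "Oromocto T": 410,
    "New Maryland VL": 20,
    "Fredericton CY": 55,
    "Surrey CY": 20,
}
village_population = 698  # population of the village, as quoted in the question

total_inflow = sum(flows.values())
print(total_inflow)  # 660

# Inbound commuters relative to the resident population of the village.
inflow_to_population = total_inflow / village_population
print(f"{inflow_to_population:.0%} of the village population")

# The single largest origin of commuters in the table.
largest_origin = max(flows, key=flows.get)
print(largest_origin)
```

The total excludes any flows below the 20-person threshold, which the table does not publish.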

In the NHS questionnaire, question 46 asks "At what address did this person usually work most of the time?" [their emphasis]. There are instructions to use the city or town rather than the metropolitan area of which it is part, and the options were "City, town, village, township, municipality or Indian reserve." There is also a note indicating that if the place of work is different than the address of the employer (for example, a school teacher works at a school rather than at a school board), one is supposed to indicate the address where one actually works. However, there is some question as to whether or not CFB Gagetown is in Oromocto, where I think most of the administrative offices are, or in Burton Parish, where a good part of the training actually takes place. I would be willing to bet that a good number of people who answered the question could have easily entered 123 Sesame Street, Gagetown, NB. What I am not willing to bet is that there were between 15 and 25 people from Surrey BC who worked in the village of Gagetown, or even that there were [410] from Oromocto and [55] from Fredericton who commuted to Gagetown VL.

I know that random rounding can affect numbers, but this seems a little more complicated than that. Has anyone else noticed these kinds of issues? Hopefully we're just misunderstanding the table. What my researcher is really worried about is that he can't trust a number of the other NHS tables he's had occasion to question.

Luckily Gagetown P doesn't seem to have been mixed up in this schmozzle.

Answer

We reported your observation to subject matter for more information. Here is the response we received from Subject Matter.

We have examined further data related to individuals working in the village of Gagetown. Most respondents work on CFB Gagetown and not in the village of Gagetown. This error has been flagged for future processing, and coding tools such as reference files will be corrected.

On another note, data users must be aware that the Place of Residence is the location where respondents are enumerated and does not necessarily represent where respondents are located at the time of the survey. Therefore some commuting flows will be unusual, such as a Place of Residence in British Columbia and a Place of Work in New Brunswick.

Updated Products: Public Service Employee Survey (PSES), 2014

Public Service Employee Survey (PSES), 2014

A new table for PSES 2014 is now available on the EFT.

The primary objective of the survey is to obtain the views of federal public service employees about their workforce, workplace and leadership. The survey results highlight where organizations are doing well and identify areas for improvement to help organizations develop informed action plans to address people management issues.

For more information: Record number 4438

EFT: /MAD_DLI/Root/other-products/Public Service Employee Survey - pses/2014

Monday, July 13, 2015

Updated Products: Travel Survey of Residents of Canada (TSRC), 2011, 2012, 2013

Travel Survey of Residents of Canada (TSRC), 2011, 2012, 2013

Standard and Non Standard Tables for 2011, 2012 and 2013 are now available on the EFT.

The Travel Survey of Residents of Canada (TSRC) is a major source of data used to measure the size and status of Canada's tourism industry. It was developed to measure the volume, the characteristics and the economic impact of domestic travel. Since the beginning of 2005, this survey has replaced the Canadian Travel Survey (CTS).

For more information: Record number 3810

EFT:
/MAD_DLI/Root/other-products/Travel Survey of Residents of Canada - tsrc/2013/Standard Tables

/MAD_DLI/Root/other-products/Travel Survey of Residents of Canada - tsrc/2013/Non Standard Tables

/MAD_DLI/Root/other-products/Travel Survey of Residents of Canada - tsrc/2012/Standard Tables

/MAD_DLI/Root/other-products/Travel Survey of Residents of Canada - tsrc/2012/Non Standard Tables

/MAD_DLI/Root/other-products/Travel Survey of Residents of Canada - tsrc/2011/Standard Tables

/MAD_DLI/Root/other-products/Travel Survey of Residents of Canada - tsrc/2011/Non Standard Tables

DLI Webinar on Canadian Survey on Disability (CSD)

Question 

Will the DLI webinar on Canadian Survey on Disability (CSD) presented by Social and Aboriginal Statistics Division (SASD) be available through an RDC? Was a synthetic file ever made available?

Answer

The subject matter division confirmed that the CSD is available through the Research Data Centre (RDC) program as well as through RTRA. Unfortunately, a PUMF will not be produced, and there is no synthetic file available to researchers.

Friday, July 10, 2015

Labour Force Survey (LFS) – June 2015

Labour Force Survey (LFS) – June 2015

LFS data for June 2015 are now available on the EFT site.

The Labour Force Survey estimates are based on a sample, and are therefore subject to sampling variability. Estimates for smaller geographic areas, industries, occupations or cross tabulations will have more variability. For an explanation of sampling variability of estimates, and how to use standard errors to assess this variability, consult the Data Quality section in the Guide to the Labour Force Survey.

The LFS guide: http://www5.statcan.gc.ca/olc-cel/olc.action?ObjId=71-543-G&ObjType=2&lang=en&limit=0

EFT: /MAD_DLI/Root/other-products/Labour Force Survey - lfs/1976-2015/data/micro2015-06.zip
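
As a rough illustration of how a standard error can be used to assess sampling variability, the sketch below computes a coefficient of variation and an approximate 95% confidence interval. It assumes a normal approximation with a multiplier of 1.96, which may differ from the method described in the Guide to the Labour Force Survey; the estimate and standard error are hypothetical numbers, not LFS results, and the function name is invented for the example.

```python
import math


def summarize_estimate(estimate: float, standard_error: float, z: float = 1.96):
    """Return (CV in percent, (lower, upper)) using a normal approximation.

    z = 1.96 corresponds to an approximate 95% interval; this is an assumption
    of the sketch, not a rule taken from the LFS guide.
    """
    cv_percent = standard_error / estimate * 100
    margin = z * standard_error
    return cv_percent, (estimate - margin, estimate + margin)


# Hypothetical values for illustration only.
estimate = 25_000        # e.g. estimated number of employed persons in a small area
standard_error = 1_500   # hypothetical standard error of that estimate

cv, (lower, upper) = summarize_estimate(estimate, standard_error)
print(f"CV: {cv:.1f}%")
print(f"Approximate 95% interval: {lower:,.0f} to {upper:,.0f}")
```

A larger standard error relative to the estimate gives a larger CV and a wider interval, which is why estimates for small areas or detailed cross tabulations call for more caution.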

International Travel Survey (ITS) SAS and SPSS files

Question

I was just having a look at the International Travel Survey (ITS) files and I noticed that a fairly large proportion of them don’t have SAS or SPSS files to go along with the ASCII data.

Specifically:

SAS is only available from 1990-1996. SPSS is available for 1990-2000, 2007 and 2011-12, with partial coverage for 2001.

Are there any plans to fill the gaps?

Answer

Because of resource constraints, the subject matter division released the data files without any supporting syntax documentation. At the time in the DLI unit, we did not have the capacity to create these syntax files from scratch. We are reviewing resources and how we might be able to address these gaps in the near future.

Thursday, July 9, 2015

GSS27 - sexpr and prtypec variables

Question

Appendix D – Content Comparison of the GSS27 User Guide lists 2 variables – SEXPR and PRTYPEC – and notes that they are the same as GSS22 (2008). Both of these variables are in the PUMF for GSS22 but neither is in the PUMF for GSS27. It appears that this Content Comparison is listing the variables in the two PUMFs and not any master file variables.

Why were the SEXPR and PRTYPEC variables not included in the GSS27 PUMF? Since they are in GSS22 PUMF it seems odd that they are excluded from the GSS27 PUMF when the Content Comparison list says they are the same.

Answer

See response below from Subject Matter.

Thank you for your interest in Cycle 27 Social Identity (SI). Every effort is made to retain variables. However, it is not possible in all cases and we understand the difficulty users may experience because of this. Suppression of a given variable is undertaken on a per-survey, case-by-case basis; decisions are not necessarily limited to a given cycle. The General Social Survey (GSS) is a cross-sectional survey where respondents are randomly selected for each survey. In this instance, the variable suppression from the PUMF was undertaken for confidentiality purposes, namely low cell counts for given population(s). Custom tabulations are potentially available through the SASD client services section.

Tuesday, July 7, 2015

Transgender Demographics

Question

We’ve had a question from a student who’s looking for any demographic information pertaining to transgender / transsexual persons. We’ve found references to surveys conducted or ongoing in the US, but very little Canadian info.

Answer

I consulted several sources, and we do not collect that information at StatCan.

However, a few links that may be of potential interest:


2011 Census Consultations, Chapter 1 Demographic characteristics
https://www12.statcan.gc.ca/census-recensement/2011/consultation/ContentReport-RapportContenu/Chapters-Chapitres/ch1-eng.cfm

2016 Census Program Consultation
Requirements for new data
http://www12.statcan.gc.ca/census-recensement/2016/consultation/ContentReport-RapportContenu/data-donnees-eng.cfm

From Parliament of Canada - Diversity of sexual orientations and identities: How is Canada doing?
http://www.parl.gc.ca/Content/LOP/ResearchPublications/2013-90-e.htm

Monday, July 6, 2015

2006 APS Available in Stata?

Question

I see these are available via the EFT site in SPSS and SAS. Are they also available (elsewhere?) in Stata?

Answer

As far as I know, Statistics Canada doesn't provide files in a Stata format. There is a good site at UCLA’s IDRE (Institute for Digital Research and Education), http://www.ats.ucla.edu/stat/stata/faq/convert_pkg.htm, that provides information on working with SPSS, SAS and Stata files, including information on how to convert one format to another.

From the Nesstar platform (http://www62.statcan.ca/webview/) you can download the file in various formats. You can even download the file in Stata v.8 and v.7.
For more information on the DLI's Nesstar, see Accessing and Citing DLI Data.

Friday, July 3, 2015

LFS Variable Question - not in the PUMF

Question

A student is doing his undergrad thesis on the topic of "difference in return to education between immigrants and native born Canadians" and was searching for the data on SLID and LFS for the wages of the immigrants with respect to levels of education from their native countries…

“I have contacted Statistics Canada and they told me to contact you for the access to the data.”

According to the LFS questionnaire, there is the question: “In what country did . . . complete his/her highest degree, certificate or diploma?”. However, this variable does not appear to be available in the PUMF. Am I missing something or is it indeed not available? Although chances are he won’t have the time/money for a custom tab, it would still be good to know if that would be a possible route.

Or if anyone has a suggestion for another data source.

Answer


I have consulted with the subject matter division responsible and they confirmed that the immigrant variables are not available on the PUMF. However, we have 4 tables on CANSIM that the user could use.

Labour force survey estimates (LFS), by immigrant status, age group, Canada, regions, provinces and Montreal, Toronto, Vancouver census metropolitan areas

http://www5.statcan.gc.ca/cansim/a26?lang=eng&retrLang=eng&id=2820102&paSer=&pattern=&stByVal=1&p1=1&p2=50&tabMode=dataTable&csid=

Labour force survey estimates (LFS), by immigrant status, sex and detailed age group, Canada annual

http://www5.statcan.gc.ca/cansim/a26?lang=eng&retrLang=eng&id=2820104&paSer=&pattern=&stByVal=1&p1=1&p2=50&tabMode=dataTable&csid=

Labour force survey estimates (LFS), by immigrant status, educational attainment, sex and age group, Canada

http://www5.statcan.gc.ca/cansim/a26?lang=eng&retrLang=eng&id=2820106&paSer=&pattern=&stByVal=1&p1=1&p2=50&tabMode=dataTable&csid=

Labour force survey estimates (LFS), by immigrant status, country of birth, sex and age group, Canada

http://www5.statcan.gc.ca/cansim/a26?lang=eng&retrLang=eng&id=2820108&paSer=&pattern=&stByVal=1&p1=1&p2=50&tabMode=dataTable&csid=