Thursday, June 29, 2017

NACRS and DAD

Question
A researcher has inquired about the manuals for the National Ambulatory Care Reporting System (NACRS) (https://secure.cihi.ca/estore/productFamily.htm?pf=PFC3081&lang=en&media=0) and the Discharge Abstract Database (DAD) (https://secure.cihi.ca/estore/productSeries.htm?pc=PCC78).

How might he acquire them? Are they part of our DAD licence? They are hoping to acquire them quickly.

Answer
The abstract and manuals for the Discharge Abstract Database (DAD) are located on the EFT, specifically in the CIHI Safe: /MAD_CIHI_ICIS_DAM/Root/discharge-abstract-database-2015-16/Other_documents. 

Unfortunately the DLI does not have manuals for the National Ambulatory Care Reporting System (NACRS). You would have to contact CIHI for this. I’ve listed their information below:  
CIHI’s Central Client Services team provides support to users of CIHI’s website Monday to Friday (except statutory holidays), from 8 a.m. to 4 p.m. ET.

Thursday, June 22, 2017

Detailed Profiles for Historic Censuses

Question
I have a print-out (or at least a copy of a print out)--not an official publication--from the 1961 Census regarding population counts and characteristics of the Yukon including Whitehorse and Yukon districts. It has a cover sheet saying "Descriptions of individual sets of EA computer print-outs 1961" and includes age groups, language, farm residence, mother tongue, birthplace, ethnic groups, religions, household, labour force including occupation divisions by sex.

Is it possible to generate similar profiles of other geographic regions for this census?

Answer
This question concerned detailed profiles for historic Censuses, based on this wonderful custom print-out from 1963 on the Yukon Territory.

For pre-1971 Censuses, StatCan Library archives can provide scans of print publications but these do not have the level of detail that contemporary profiles have. Also, much of this content is already scanned and is available at statisticscanada.

For 1971, 1981, 1986 and 1991, semi-custom tabulations can be produced for a fee.

Tuesday, June 20, 2017

SPSD/Model

Question
A student is wondering whether or not he could use the SPSD/M in a project. His deadline is very near and I am under the impression there is a relatively steep learning curve associated with the product. I have not found any tutorials to give him an idea as to what is involved and how long it might take to learn to use it.

Any comments about your experiences with it will be welcome.

Answer
The DLI hosted a webinar on the SPSD/M in the winter of 2016. The session materials are available on the DLI Training Repository: https://cudo.carleton.ca/dli-training/4078

For more information on the SPSD/M, please see http://www.statcan.gc.ca/eng/microsimulation/spsdm/spsdm

If you have any specific questions on the product, we can contact the SPSD/M team here at Statistics Canada on your behalf.

Firm-level Data

Question
A graduate student here needs firm-level business data at a level of detail not found in CANSIM.  The specifics of this request are:

“I am trying to put together a dataset that contains locations (postal codes would be ideal, but something more aggregated could also work) as well as dates of births and deaths of businesses by NAICS code.  I would need the geographic area to include BC and Alberta at least, and the time span to cover 2005-2016.  Also, if the exact dates of births and deaths are not available, then quarterly counts could also work. (I noticed that there are some similar CANSIM tables (552-0005, 553-0005, and 529-0002), but none of these cover the years that I am looking for.)”

Any advice on how to obtain these data – custom tab? RDC?

Answer
Thank you for your question, we are looking into it with subject matter to confirm what options would be available. Here are some things to consider –

Statistics Canada is prohibited by law from releasing any data that would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization.  Specific locations by postal codes would never be available, custom tab or otherwise.

Many of the CANSIM tables reference Canadian Business Counts (CBC). The CBC product provides counts of active business locations on the basis of several variables, such as geography, business activity and employment size. However, it is not advised to use this product for time-series analysis involving comparisons across reference periods.

With respect to RDC access, this is considered business data and thus would only be accessible through the Canadian Centre for Data Development and Economic Research (CDER). CDER provides researchers with direct access to a wide range of business and economic microdata files for analytical research. The Centre is located at Statistics Canada's head office in Tunney's Pasture in Ottawa, and operates entirely on a cost-recovery basis.

Followup
Here is the response I’ve received from subject matter regarding your request for firm-level data.

“For the firm level data, we are able to construct files that combine postal codes with information on entry date, firm age and exit date.  However, it is not possible for us to send your student a firm level dataset as this is prohibited by law.  There are provisions in place through the Canadian Centre for Data Development and Economic Research for researchers to access the micro files for research purposes.  They require coming to Ottawa to use the data and typically require a cost recovery charge for the construction of the dataset and any vetting required to ensure respondent confidentiality is maintained.  If this is a preferred option, please see the application process guidelines on the link above.

For published data, please ask your student to look at these tables: 527-0007; 527-0008; 527-0009; 527-0010; 527-0011; 527-0012. They contain estimates for counts, entry, exit, employment growth, employment creation and employment destruction within the provinces and territories. The data come from the Longitudinal Employment Analysis Program (LEAP), which is a different source from the tables listed in your question. The LEAP data are annual, and focus on employer enterprises. They cover the period 2001 to 2014, and will be updated for 2015 later this year. If a special tabulation is desired, that can be looked into as well.”

Monday, June 19, 2017

Numbers of radios and televisions owned by Canadians

Question
A researcher here is looking for data on the numbers of radio and television receivers owned in Canada, particularly during the 1950s, 60s, and 70s. I found some figures in a UNESCO document for the 50s and 60s, but these are only national figures; he'd like as much geographic detail as he can get. I know that the census last asked about television ownership in 1971, and I think there must be some figures in earlier census years, but I wondered if anyone knew of a compilation of this kind of information. Thanks much in advance!

Answer
The following set of digitized publications from the Dominion Bureau of Statistics provides monthly sales and production statistics for radios broken down by type of radio and by province (except for the Maritimes/Atlantic Canada, which are clustered). It's not the same as the number of radios owned, but might be helpful.

Here is a link to the series record which runs from 1952-1979: 

Title: Radio and television receiving sets
Publication Type: Series
Language: Bilingual-[English | French]
Continues: Radio receiving sets ; Jan. 1949-v. 6, n. 12 (Dec. 1951).
Format: Electronic
Note: Digitized edition from print [produced by Statistics Canada].
Date: [1952?]-1979.
Chronology: Vol. 7, no. 1 (Jan. 1952)-v. 33, no. 12 (Dec. 1978).

Thursday, June 15, 2017

Release schedule for Aggregate Dissemination Area data

Question
I was trying to work out with some grad students what would be the best level of geography for them to use for the 2016 census. We think that the aggregated dissemination area level will be most useful for what they need. I went to the geography page for the 2016 census, and could download the boundary files for that layer. However, when I went to download age and sex data from http://www12.statcan.gc.ca/census-recensement/2016/dp-pd/dt-td/Lp-eng.cfm?LANG=E&APATH=3&DETAIL=0&DIM=0&FL=A&FREE=0&GC=0&GID=0&GK=0&GRP=1&PID=0&PRID=0&PTYPE=109445&S=0&SHOWALL=0&SUB=0&Temporal=2016&THEME=115&VID=0&VNAMEE=&VNAMEF=, the ADA level was not available for download.

When is it expected that those data will be available, and, more generally, will there be a "normal" lag time on releasing these data (e.g., ADA comes out at the next release)? The students want to work with income data: if the ADA level income file is not released on August 2 with the official release, when should they expect it?

Answer

The ADA geography is available in profile format from the 2016 Census here: http://www12.statcan.gc.ca/census-recensement/2016/dp-pd/prof/details/download-telecharger/comp/page_dl-tc.cfm?Lang=E

As the Census releases occur, the profile will be updated with the additional data and variables. Income data from the short form (100% data) will be available on September 13, 2017.

Wednesday, June 14, 2017

Rural/urban breakdown of household disposable income for NB

Question
Can you tell me if I'm overlooking something obvious? I have a grad student who is looking for a rural/urban breakdown of household disposable income for NB. Furthermore, he wants it for every year between 1991 and 2013. Estimates of disposable income by province for the years in question are available (384-0040 and 384-5000), as are a number of neighbourhood tables based on the T1 family file; however, none of these really works for my researcher.

I'm kind of assuming that *if* such a breakdown were available it would be a custom request. Am I overlooking something or letting my assumptions get in the way of effective searching?

Answer
This would have to be a cost recovery custom table.

Since I don't have household on T1FF, the best that could be done with T1FF is Census family disposable income. Disposable income would have to be defined by the client or we could see if there is a standard definition that has been used by other groups within ISD. This would definitely be a custom request. Also, for the urban/rural split we can probably do something, but it would be a bit of an approximation.


CIS or Census can do this on Household income as a custom and they may already have defined disposable income. The latest data would be a custom from CIS but I am not sure they could do an urban/rural split at the province level. With Census, you should be able to do it (but then you do not have all the years).


National Accounts can provide disposable income only at the provincial level.

Digital Object Identifiers for StatCan data

Background
Back in 2011 and then again in 2015 there were some questions on the list about DOIs and StatCan data. At one point it was asked specifically if STC had considered registering DOIs (it was right after DataCite Canada had launched and was being tested) and later [after the Abacus group (SFU, UBC, UNBC, and UVic) made the transition to Dataverse, it was] asked about assigning DOIs and [a] discussion ensued. Now UNB is also going to use Dataverse to offer up research data and we were thinking of using it for secondary data as well, but we have discovered that we have no choice about DOIs. That is, Dataverse is Harvard's resource and they officially indicate that DOIs are assigned. Period. Since Dataverse assigns DOIs automatically, we want to register them (or what's the point?) and are trying to figure out if there is a way that we might be able to force those DOIs to match STC registered DOIs if there were such a thing.

Question 1
If that's not possible, then, the question becomes: what is StatCan's official line on other institutions effectively assigning DOIs to their (STC) data?

Question 2
So to STC employees I would ask a) is Statistics Canada registering DOIs or do they plan to and b) is Statistics Canada concerned about multiple universities (or other organizations) assigning DOIs?  Perhaps we've missed something in the Dataverse documentation, but it really doesn't seem like we have a choice about the assignment of DOIs.

Question 3
My question to the DLI community is how do you deal with this issue (i.e., that Dataverse doesn't give you the choice of whether it assigns a DOI and it doesn't look like we can suppress the info) at your institution?  

Answer 1
  • At UBC, we mint DOIs only for research data, not licensed datasets.
  • We do not mint DOIs via Abacus Dataverse but via our discovery layer - Open Collections - https://open.library.ubc.ca/, which allows us great flexibility for DOI minting.
  • The newest Dataverse version, 4.6.2, allows minting handles in addition to DOIs, which might solve the UNB problem. We have collaborated with Harvard to offer that...
    • See https://dataverse.org/blog/dataverse-462 (very new, released just last week or so, so good timing). I worked with Harvard on that for more than a year. Developed by DANS (our Dutch colleagues). More on GitHub - https://github.com/IQSS/dataverse/milestone/61?closed=1
  • We had to develop our entire DOI GUI and a pipeline, as DataCite Canada was not flexible enough for us. Here is more information - http://researchdata.library.ubc.ca/plan/get-dois/
  • By now we have minted more than 215,000 DOIs for our digital assets (out of 274K in Canada - https://stats.datacite.org/?fq=allocator_facet%3A%22CISTI+-+National+Research+Council+Canada%22&#tab-datacentres)
  • We have assisted multiple schools in their DOI work, namely uOttawa, BC ELN, Guelph, McMaster, VIU and many more...

Answer 2
We [StatCan] have reached out internally to obtain more information. Statistics Canada is collaborating with NRC’s DataCite to register DOIs for its aggregate data on the website. While this will still take some time, progress is underway. I brought forth the community's concern about registering their own DOIs for StatCan data. StatCan consulted a DataCite representative, who indicated that current best-practice thinking is that multiple DOIs are accepted, as long as they are from different clients. There are good use cases for both registrations, so another DataCite client with a different prefix will be able to assign a DOI to a copy of StatCan content stored in their repository.

Informally, I was informed it would be ideal if repositories linked to the official StatCan DOI once available. As StatCan is the author of the data, this would ensure that users are directed to the current and authoritative source. At some point in the future, they have agreed it would be beneficial to have a conversation with stakeholders of the community. We, the DLI, are continuing to have conversations with internal stakeholders regarding the potential for registration of items, such as PUMFs.
[StatCan] is interested to learn more about the community's perspectives; please share your comments on the list.
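
As a rough illustration of the "different prefix" point above, the sketch below extracts the prefix (the part before the first slash) from two DOIs and checks whether they differ. The DOIs shown are hypothetical placeholders, not real StatCan or Dataverse identifiers, and comparing prefixes is only a simple proxy for "registered by different DataCite clients".

```python
def doi_prefix(doi: str) -> str:
    """Return the prefix of a DOI, i.e. the part before the first '/'."""
    doi = doi.strip()
    # Drop a resolver or "doi:" lead-in if one is present.
    for lead in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(lead):
            doi = doi[len(lead):]
            break
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix


def from_different_prefixes(doi_a: str, doi_b: str) -> bool:
    """True when the two DOIs were minted under different prefixes."""
    return doi_prefix(doi_a) != doi_prefix(doi_b)


# Hypothetical identifiers, for illustration only.
official = "https://doi.org/10.1234/example-table"
local_copy = "doi:10.5678/FK2/ABC123"
print(from_different_prefixes(official, local_copy))
```

A repository could run a check like this before publishing a local copy, and then record the official DOI as a related identifier once one exists.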


Further comments
Hi folks, I’m attending the Dataverse Community Meeting this week. The release discussed below allows support for Handles OR DOIs as the persistent identifier for your Dataverse instance, not both (if I’m clear on this; I haven’t actually tested it yet). In our case, I believe we would want to support the option of selecting either a Handle or a DOI on a dataset-by-dataset basis in the same DV instance – see this use case ticket here: https://github.com/IQSS/dataverse/issues/3623

I’m also interested in coming up with a coordinated solution for registering DOIs for STC data, including aggregate data available to us via the DLI. I’m happy that STC and DLI may be assigning DOIs in the near future, which is a step forward. We’ve discussed this at length within the OCUL community and I hope it can be a topic of discussion at our national training next year in Montreal (I’m assuming this is happening). 

Here are some options we’ve explored/discussed:
  • publishing STC data w/ DOIs (argument that these are our access copies, according to the DataCite BP)
  • publishing STC data w/ DOIs but unregistering these DOIs using the DataCite API
  • publishing STC data w/ Handles (however, not technically possible in Dataverse yet)
  • publishing STC data w/ internal Dataverse identifier (same as above)

I have some questions for other folks in Canada that might be helpful for our conversations…If we were to coordinate loading of STC data w/other Canadian universities 
  • what data are you loading?
  • can we harvest one repository? 
  • can we harvest DLI’s repository? 
For now our PUMFs will remain in our Nesstar repository, but we are in the process of releasing all non-PUMFs in our SP Dataverse. 

Further comments
The issue of duplicate DOIs is becoming a concern not only in our environment but elsewhere as well. At least STC is now considering assigning them to aggregate data – a first step. It was also good to hear what is [happening] out west, as this is an issue we have to consider for ODESI and the Scholars Portal Dataverse. There should be some interesting information coming out of the Dataverse conference – I look forward to hearing about it from anyone else who is attending.

Tuesday, June 6, 2017

Canada Year Book Deposit to publications.gc.ca

Question
Can you advise when the Canada Year Book collection will be migrated to publications.gc.ca (formerly Depository Services Program)?

Answer
A response from subject matter:  “The DSP can’t confirm when they will have them loaded on their site but they will prioritize them. The Library is coordinating the transfer of the files to the DSP and they should have all the files by middle of next week. Then they will have to review them for problems, etc. before they catalogue and post them.”  

Canada Year Book

Question
This is only marginally DLI-related, but I have a researcher who wants access to historic copies of the Canada Year Book. They’re archived at https://www66.statcan.gc.ca/acyb_000-eng.htm but they aren’t downloadable except by single pages. I’ve searched the directory of STC pubs on the U of A mirror, but came up empty. Is there another way to get access to these?

I’ve found some at the Internet Archive, but not far enough back to span the period in which he’s interested (1867-1919).

Answer
Thank you for your patience for this request. The DLI team has uploaded the Canada Year Book collection to the Alberta Mirror site. Files exist in PDF; for those wishing to access the CYB collection on the University of Alberta DLI Mirror Site, point your (secure) ftp program (I use Filezilla) to: sftp://dliftp.library.ualberta.ca; pwd = the same pre-EFT global password as used by DLI contacts for the Mirror Site. The CYB collection is at the top of the DLI Directory.

Followup: June 20/17
Status Update: Digitization of Statistics Canada Historical Collection
The digitization phase of the Statistics Canada Library’s multi-year project to digitize the entire collection of print-only, official Statistics Canada/DBS publications was completed on March 31, 2016. As part of the Open Information Initiative, the Library has shared the digitized collection with the Publishing and Depository Services Directorate (PDS), which is making the information available through the Government of Canada Publications catalogue. To date, approximately 85,070 of the 136,630 files have been catalogued and made available. This initiative will continue over the coming months until all files are catalogued and available.

Recently the DLI made the digitized CYB collection available on the University of Alberta hosted Mirror site. Publications and Depository Services have completed cataloguing and publishing the Canada Year Book collection (1867-1967):

CS11-202E-PDF
1867
http://publications.gc.ca/pub?id=9.838179&sl=0

CS11-202E-PDF
1868-1879
http://publications.gc.ca/pub?id=9.838182&sl=0

CS11-202E-PDF
[1st year of issue] (1886 [i.e. 1885])-4th year of issue (1888)
http://publications.gc.ca/pub?id=9.838184&sl=0

CS11-202E-PDF
5th year of issue (1889)-20th year of issue (1904)
http://publications.gc.ca/pub?id=9.838186&sl=0

CS11-202E-PDF
1906-
http://publications.gc.ca/pub?id=9.838163&sl=0

Monday, June 5, 2017

Definition of "Suburb"

Question
A researcher is asking for the definition of suburb. A few articles, such as http://www.statcan.gc.ca/daily-quotidien/170503/dq170503a-eng.htm mention suburbs, but I am not finding a definition of it in the Census Dictionary or elsewhere and certainly I have never seen it mentioned as a standard geography.

Is there a definition? Is it possibly a “popular” term which might be a MIZ?

Answer
The Geography division has confirmed that there is no “official” definition of suburb. You are correct that this is not a standard geography. In the example provided, it seems they are using CTs to “build” the suburbs.

Additional Resources
This article by Prof. David Gordon at Queen's might provide some insights: http://www.canadianurbanism.ca/wp-content/uploads/2014/07/CanU%20WP1%20Suburban%20Nation%202006-2011%20Text%20and%20Atlas%20comp.pdf

Followup to this question
"We recently had a request come in from a DLI Researcher regarding the exact definition of the term «suburb», which seems to be recurring in many StatCan articles. It seemed to be considered by many as a “popular” term as opposed to being an official level of geography. For those interested, here are some clarifications from Subject Matter which may be useful in the future:

With help from the Geography division, the Demography division has on many occasions distinguished between city centres and suburbs in its analytical documents linked to the census (2006, 2011 and 2016). This classification is not an official standard geography but serves our need to produce meaningful and easy-to-understand analysis for the general public. The city centre was defined as the municipality (CSD) giving its name to the CMA. For some reason, this explanation was not clearly expressed in our latest release. For example, in the Montreal CMA, the city centre would be defined as the Montreal CSD and the suburbs as everything outside this CSD within the Montreal CMA."
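
The rule described above can be sketched in a few lines of Python. This is only an illustration of the informal city centre/suburb split, not an official StatCan method. The function name, the " - " separator used for multi-name CMAs, and the partial CSD list are all assumptions made for the example.

```python
def classify_csds(cma_name, csd_names):
    """Label each CSD in a CMA as 'city centre' or 'suburb'.

    A CSD is treated as a city centre when its name matches one of the
    names that make up the CMA name; everything else is a suburb.
    Splitting multi-name CMAs on " - " is an assumption of this sketch.
    """
    centre_names = {part.strip().casefold() for part in cma_name.split(" - ")}
    return {
        csd: "city centre" if csd.casefold() in centre_names else "suburb"
        for csd in csd_names
    }


# Partial, illustrative list of CSDs in the Montreal CMA.
montreal = classify_csds("Montreal", ["Montreal", "Laval", "Longueuil"])
for csd, label in montreal.items():
    print(f"{csd}: {label}")
```

In practice the match would be done on CSD and CMA codes from a geography file rather than on names, since names can differ in spelling or accents.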

Thursday, June 1, 2017

Age and gender for population centres Toronto and Calgary

Question
I have a researcher who is looking for annual data broken down by single years of age and sex for the population centers of Toronto for the years 2011 to 2015 and Calgary for the years 2008 to 2013.  He’s tried CANSIM and I’ve looked through various other data sources, but couldn’t find the requested data for these population centers. Would he have to request a custom tabulation?

Answer
“Unfortunately, we do not have single years of age by population centers (or urban areas in 2006 and earlier) in any of our standard product lines. Also, we only have data for the 2001, 2006, 2011 and 2016 Censuses. For annual data you would need to contact the Demography division for the population estimates.

Data for 2001, 2006, 2011 and 2016 can be ordered via the nearest regional office as a custom data tabulation.”