Monday, July 31, 2006

Number of first year university enrollments


I have been asked for the number of new 1st year students in Canadian universities by gender. Does anyone know where to get this information?

Cansim provides overall enrollment, but not by year.


I just confirmed with the author division and this information is not available from Statistics Canada. Apparently we once did collect it, but it is no longer being collected.

Updated Products - SPSDM / RTSS

Social Policy Simulation Database and Model - Version 14.1

The Social Policy Simulation Database and Model (SPSD/M), Version 14.1, based on 2002 microdata, is now available. The most recent SPSD/M can be used to study the impacts of changes to federal and provincial tax and benefit programs on families and governments from 1991 through 2010.

The Daily:

FTP: /dli/spsdm/spsdm-v14.1

Residential Telephone Service Survey 2005 - 12

** A processing error assigned several CMAs to the incorrect urban size code for the variable SZCODE.
The error had a significant impact on the value for size code 4; the differences for the other categories are negligible.

The following files have been replaced:

RTSS Dec 2005 Data and Codebook

FTP: /dli/rtss/2005/12

Friday, July 21, 2006

Farmland within Toronto Municipal Boundaries


I have a graduate student looking for 'Total Hectares of farmland within Toronto Municipal Boundaries for 1991, 1996 and 2001'. The Census of Agriculture does not have these figures for the Municipal level. Can anyone suggest where I should look?


The lowest level of dissemination for standard agricultural tables is the CCS (Census Consolidated Subdivision). For the Census Division of Toronto, 2001 has one CCS (Toronto), and for the other years (prior to amalgamation in Toronto) there are more CCSs listed (six, I think).

You can access the most recent information for free on the Census of Agriculture web site, and the previous years are available from the DLI FTP site.

If you need smaller geographies, these will be a custom request from the Division and will result in fees for the work.

Friday, July 14, 2006

Student debt by urban/rural breakdown


The students are hoping to find data on student debt loads of rural and urban post-secondary students. They want the data based on whether students classify themselves to be of rural or urban origin.

Any suggestions? They have looked at CANSIM and beyond.


Unfortunately we do not collect this information by rural and urban origin.

Question about ethnic origins in the census


One of my Faculty Members has a question about people who list multiple ethnic origins in the census. He has specifically asked that I forward his question to those who are knowledgeable about the Census.

His question:
"I am wondering whether the multiple origins option results in duplication, or whether some method is used to prevent this. For example, if an informant reports that her origins are French and Aboriginal, does she appear in the aggregate data twice, once as 'French, multiple origins' AND a second time as 'Aboriginal, multiple origins'? If she would only be aggregated into one of these categories, how is the determination made? I would like to have an answer to this that goes back to the introduction of multiple origins on the census (is that 1981?)."


I wonder if this client is referring to a variable in the PUMF file? If the PUMF contains a transformed ethnic origin variable from a multiple response (a variable for which the answer of a person can match one or more valid response(s)) to a standardized variable (a variable for which the answer of a person can match one and only one valid response), then there would be no issue of double counting because the category of French and Aboriginal origins refers to a single category -- and if the respondent had French and Aboriginal origins, that person would only show up in the data once in this category. For this type of variable, a person is matched to one and only one category of the variable and it is possible to add up the categories to obtain the total population. (Look at the variable "DethNic" in CAPSS or the e-dict).

However, if the client is simply referring to the ethnic origin variable as it is disseminated in standard tables on the Internet site, then there is indeed double, triple, etc. counting occurring. You cannot add up all of the ethnic origins to obtain the total population, because people are included in each and every category that they reported as an ethnic origin; the sum would thus be greater than the total population count. For example, if the person reported French and Aboriginal origins, the person will be included in the total French origins as well as the total Aboriginal origins. That is why we have the Total, Single and Multiple indicators of the ethnic origin variable.

The ethnic origin variable is one of a few "multiple response" variables in the Census. (Other multiple response variables are mother tongue, home language and population group.) A multiple response variable is a variable for which the answer of a person can match one or more valid responses. Hence a person will match more than one category if that person has given more than one response to that question. The sum of all ethnic origin response categories will not add to the total population because respondents were permitted more than one response. We do not aggregate a respondent into one ethnic origin category if the respondent reported multiple origins. We could create single and multiple response categories to transform the ethnic origin variable from a multiple response variable to a standardized variable, but even in this transformation we do not choose one or the other category to create the final categories; rather, we group the types of responses into similar categories. (see the

I suggest that the client consult the 2001 Census ethnic origin user guide for more information on the variable.
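
The difference between the two kinds of counts described above can be illustrated with a small Python sketch. The respondents, the origin labels and the function names are all hypothetical and chosen only for illustration; this is not how Statistics Canada produces its tables, only a way to see why total-response counts exceed the population while a standardized variable does not.

```python
from collections import Counter

# Hypothetical respondents; each may report one or more ethnic origins.
responses = [
    {"French"},
    {"French", "Aboriginal"},
    {"Aboriginal"},
    {"French", "Aboriginal"},
    {"Irish"},
]

def total_response_counts(responses):
    """Count each respondent once in every origin they reported."""
    counts = Counter()
    for origins in responses:
        counts.update(origins)
    return counts

def single_multiple_counts(responses):
    """Split each origin's total into single and multiple responses."""
    single, multiple = Counter(), Counter()
    for origins in responses:
        target = single if len(origins) == 1 else multiple
        target.update(origins)
    return single, multiple

def standardized_counts(responses):
    """Assign each respondent to exactly one category: the combination reported."""
    return Counter(" and ".join(sorted(origins)) for origins in responses)

totals = total_response_counts(responses)
single, multiple = single_multiple_counts(responses)
standardized = standardized_counts(responses)

print("Respondents:", len(responses))
print("Sum of total-response counts:", sum(totals.values()))
print("Sum of standardized counts:", sum(standardized.values()))
```

In this made-up data the total-response counts sum to more than the number of respondents, because the two people reporting both French and Aboriginal origins are counted under each origin. The standardized counts place each person in one combined category and so sum to the number of respondents.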

High school graduates in the 1930s


Would anyone know where to find actual numbers of high school graduates in the 1930s? We have a student looking for this information. He is looking for Canada and by province if possible.


The Education Division's information dates back to the 1970s at the earliest.

Thursday, July 13, 2006

The Inter-Corporate Ownership 2006-Q2 (ICO)

The Inter-Corporate Ownership 2006-Q2

The Inter-corporate ownership directory is the most authoritative and comprehensive source of information available on corporate ownership, a unique directory of "who owns what" in Canada. It tracks the ownership of the largest Canadian corporations and provides up-to-date information reflecting recent corporate takeovers and other substantial changes. Ultimate corporate control is determined through a careful study of holdings by corporations, the effects of options, insider holdings, convertible shares and interlocking directorships. The information that is presented is based on non-confidential returns filed by Canadian corporations under the Corporations Returns Act. The Inter-corporate ownership directory now lists more than 95,000 corporations.

The data are presented in an easy-to-read tiered format, illustrating at a glance the hierarchy of subsidiaries within each corporate structure. The entries for each corporation provide both the country of control and the country of residence. As well, the inclusion of the Standard Industrial Classification code enables study by industrial sector.



Wednesday, July 12, 2006

CCHS 2.2 Synthetic Files


1. When will the CCHS 2.2 synthetic files be available?

2. The userguide (page 20, section 5.7, second paragraph) states:

"The reader should however note that the variable "aboriginal status" has been removed from the Public Use Microdata File for confidentiality purposes. This variable can only be accessed through the use of the master file which resides at Statistics Canada and in the Regional Data Centres or through the use of the share file in the provincial Ministries of Health, at Health Canada and at the Public Health Agency of Canada."

What is the share file? Does the researcher have access through the Saskatchewan Dept. of Health? Please note it is the aboriginal status variable that the researcher requires.

3. In the userguide (page 69, section 12.4, final paragraph) remote access service is described. We are wondering if this exists when there isn't (yet) a synthetic file?

Furthermore, the paragraph refers to the user as the 'purchaser'; are we correct in assuming DLI users are 'purchasers' via the DLI licence? As you know, there are no RDCs in Saskatchewan.


Because of the number of files resulting from the 2.2 survey (7 files), it is not possible, at this point, to create synthetic files as was done for the other surveys. Users would have to do some linkages between the 7 files, and the chance of encountering problems and errors is too high. Methodologists have been asked to look at a possible strategy for this case. This means, for the time being, that it is not possible to access these files. We will update you as soon as we hear from Methodology whether the issue can be resolved.

The share file is the file that we provide to the Provincial Health Departments and Health Canada with the permission of the respondents (records from respondents who refuse to have their data shared with specific partners are taken out of the share file and appear only on the master file). Researchers are not given access to the share file in Health Departments. It is not part of the sharing agreements, and the only way for a researcher to access the master file is through remote access or by presenting a project through the RDCs. Because there is no synthetic file at this point for 2.2, remote access does not exist. We certainly hope methodologists find a solution to that issue, but until then there is no remote access to 2.2.

Yes, through the DLI license, DLI users are purchasers. And yes, the variable "aboriginal status" has been removed for confidentiality purposes, following discussions with the Microdata Release Committee, which ensures that confidentiality is not threatened in any way.

Monday, July 10, 2006

Historical stats on religion and ethnicity


A religion grad student at Concordia is looking for the following: population of Jews and Arabs in Canada as a whole and in the Montreal CMA (or, if not available, for the province of Quebec) for the 1901-2001 period. She would also like to have the number of Muslims for the same geographic areas (Canada and Montreal CMA or province of Quebec) and the same period.

I suppose it would be possible to gather data from the individual censuses, but the student does not have much time to do that and would much prefer ready-made historical statistics. Unfortunately, I suspect those do not exist for the categories mentioned above. The only historical data I have been able to find so far is the Jewish population (CANSIM Table 075-0016).


For some of the years there was no data for certain categories, and there was no data for Montreal. StatsCan does not have any publication covering the years prior to 1971. I will be contacting subject-matter for any additional information, but I know from past experience they also do not have historical data at their fingertips unless they have done a historical study.

Friday, July 7, 2006

Postal Code Conversion Files for all years


Is there a complete set of the PCCFs available on the ftp site or elsewhere? Specifically, a researcher needs the PCCFs from 2000 onward to link to medical data identified by postal code and time.


The file you want is pccf_nov00 on the dli ftp site.


Wednesday, July 5, 2006

Mill/firm level data of softwood lumber


I have a PhD student from UoT looking for mill/firm level data on production costs (regeneration costs, harvesting costs, transportation costs, etc.) of softwood lumber in Canada. Where could I get it? Is there anyone in charge of the StatCan forestry sector?


Please see the following product:

Definition of bootstrapping weighting


Could someone please provide a definition of bootstrap weighting, and also why you would choose to use this type of weighting?


Here is an article on bootstrapping from ICPSR.
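
As a rough illustration of the idea, bootstrap weights are a set of replicate survey weights supplied alongside the main weight. An estimate is recomputed once with each replicate, and the spread of those replicate estimates around the full-sample estimate gives a variance estimate that reflects a complex sample design, which simple formulas would not. The Python sketch below uses made-up values and weights and hypothetical function names. The averaging formula shown is the basic form and is an assumption here; the documentation for a particular survey may specify adjustments.

```python
import math

def weighted_mean(values, weights):
    """Weighted mean of values using one set of survey weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def bootstrap_variance(values, main_weights, replicate_weights):
    """Mean squared deviation of replicate estimates from the full-sample estimate."""
    full = weighted_mean(values, main_weights)
    reps = [weighted_mean(values, w) for w in replicate_weights]
    return sum((r - full) ** 2 for r in reps) / len(reps)

# Made-up data: four respondents, one main weight and four bootstrap
# replicate weight sets. Real surveys typically supply many more replicates.
values = [10, 20, 30, 40]
main_weights = [100, 100, 100, 100]
replicate_weights = [
    [200, 0, 100, 100],
    [0, 200, 100, 100],
    [100, 100, 200, 0],
    [100, 100, 0, 200],
]

estimate = weighted_mean(values, main_weights)
variance = bootstrap_variance(values, main_weights, replicate_weights)
std_error = math.sqrt(variance)

print("Estimate:", estimate)
print("Bootstrap standard error:", std_error)
```

The point estimate always comes from the main weight; the replicate weights are used only to measure its precision.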