Thursday, November 17, 2005

CANSIM (CHASS version from U of T) and the Statistics Canada web interface

Question

When I go to CHASS' CANSIM II homepage and click on:

Browse CANSIM II by subjects ---- Trade ---- Retail trade

I get a listing of 24 tables.

When I browse the same subject list on Statistics Canada's CANSIM webpage (or even on the CANSIM E-Stat page) I get 30 tables.

Why?

Answer

All the tables that you do not see when browsing by subject are there and can be accessed by table number.

This is how it works:

Each Tuesday CHASS receives all new and updated tables and series from Statcan. (Not long ago we ran a full database scan to compare all series and tables at CHASS and at Statcan, to be sure that we do not miss any data.) New tables (with all their series) are loaded, and the table listings are updated accordingly. All new series that belong to existing tables are also loaded. Updated series are handled record by record, with each individual record updated to the latest data.
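The weekly load described above can be sketched roughly as follows. This is only an illustration, not the actual CHASS loader: the function name, the dictionary layout, the table numbers and the series identifiers are all made up for the example.

```python
def apply_weekly_update(database, update):
    """Merge one weekly delivery into the local copy (hypothetical sketch).

    Both arguments map table number -> {series id -> record}.
    Returns the table numbers that were new to the database.
    """
    new_tables = []
    for table_no, series in update.items():
        if table_no not in database:
            # A table we have never seen: create it and remember it,
            # so that the table listings can be updated.
            new_tables.append(table_no)
            database[table_no] = {}
        # New series are added; existing series are overwritten
        # with the latest record.
        database[table_no].update(series)
    return new_tables


# Made-up data: one existing table, one delivery with an updated series,
# a new series in an existing table, and a brand new table.
database = {"080-0001": {"v100": [1.0, 2.0]}}
update = {
    "080-0001": {"v100": [1.0, 2.0, 3.0], "v101": [5.0]},
    "080-0099": {"v900": [7.0]},
}
new_tables = apply_weekly_update(database, update)
```

Note that nothing in this sketch touches a subject listing: the data and the table list stay current, while the mapping from subjects to tables is a separate piece of information.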

However, when tables are added to the database, Statcan does not send us a new themes/subjects matrix, and that is where the discrepancy creeps in. So all the data are there, and keyword searches work, but the themes/subjects listings may start diverging. In some cases that is annoying; in others, browsing by subject was just as annoying even when we had fully compatible themes/subjects listings. I recall running a seminar in New Delhi earlier this year on large database organization for health care indicators. As an example I used CANSIM by subject, extracting various series based on their health-related subjects, and it was a frustrating exercise, with classifications sometimes being totally wacky. Keyword searches were much more reliable and produced many more series related to what I was looking for, even though the same "subjects" formally existed.
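To make the divergence concrete, here is a minimal sketch of how one could list the tables that are loaded but missing from a subject listing. The function name, the subject label and the table numbers are assumptions for illustration only; the real themes/subjects matrix has its own format.

```python
def tables_missing_from_subjects(loaded_tables, subject_index):
    """Return loaded table numbers that no subject listing mentions.

    loaded_tables: iterable of table numbers present in the database.
    subject_index: dict mapping subject name -> list of table numbers.
    """
    listed = set()
    for tables in subject_index.values():
        listed.update(tables)
    return sorted(set(loaded_tables) - listed)


# Made-up example: three tables are loaded, but the subject listing
# was last refreshed before the third one arrived.
loaded_tables = ["080-0001", "080-0002", "080-0099"]
subject_index = {"Trade / Retail trade": ["080-0001", "080-0002"]}
missing = tables_missing_from_subjects(loaded_tables, subject_index)
```

In this example the third table can still be retrieved by its number or through a keyword search; it is simply absent from the browse-by-subject page.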

Having said that, there is no excuse for the discrepancy, and we must work with Statistics Canada to resolve the issue. Ideally, we should be getting updated concordance tables with every new table addition/deletion or change in subject classification. Realistically, such an update once a month would probably suffice.
