Wednesday, September 28, 2016

Revised and/or enhanced Discharge Abstract Database files for 2014-2015

With the kind cooperation of the DLI group, I have uploaded a zip file containing revised geographic and clinical files for the 2014-2015 CIHI DAD onto the DLI EFT site, into the /MAD_CIHI_ICIS_DAM/Root/discharge-abstract-database-2014-15/data/Revised folder. I hope that the community finds them useful.

About the revisions and/or enhancements to the 2014-2015 DAD files:

In the revised data files, string variables have been transformed (where appropriate) into coded variables. For example, age, which as received may be recorded as the string "Under 1 year", is restored to an ordinal variable. The revised files also restore the proper CCI encoding by reinserting the punctuation that is omitted from the file received from CIHI, making it possible for the user to match the data stored in the file to the documentation.
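The string-to-ordinal idea can be sketched in a few lines of Python. This is only an illustration and not the SPSS program included in the zip file: apart from "Under 1 year", the age labels below are hypothetical placeholders, and the function name recode_age is invented for the example.

```python
# Ordered list of age labels. Only "Under 1 year" comes from the text above;
# the remaining labels are hypothetical placeholders for illustration.
AGE_LABELS = ["Under 1 year", "1 to 4 years", "5 to 9 years"]

# Map each label to its position in the list, so the codes preserve the order.
AGE_CODES = {label: code for code, label in enumerate(AGE_LABELS)}


def recode_age(label):
    """Return the ordinal code for an age label, or None if it is unknown."""
    return AGE_CODES.get(label.strip())


print(recode_age("Under 1 year"))
```

Because the codes follow the order of the list, comparisons such as "younger than" work on the coded variable, which they do not on the original strings.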

As distributed by CIHI, there are 25 ICD10 variables for diagnoses, and 20 CCI variables for type of intervention. In addition to those variables, the revised data file also contains 245 ICD10 and 190 CCI flag variables. Each flag identifies the records that contain particular codes in any of the multiple ICD10 or CCI variables.

For example, consider the constructed variable

CCIF040 "Diagnostic Interventions on the Nervous System (2AA - 2BX)"

If any of the 20 CCI variables (I_CCI_1 through I_CCI_20) in a record contains a code that begins with a string in the range 2AA through 2BX, that record will be flagged in variable CCIF040 as true: a "Diagnostic Intervention on the Nervous System" was reported in at least one of the 20 CCI variables.
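For readers who want to reproduce this kind of flag outside SPSS, here is a minimal Python sketch of the logic. It is not the program in the zip file. It assumes that a record is a dictionary keyed by variable name, that "2AA through 2BX" can be tested by comparing the first three alphanumeric characters of the code as strings, and that any punctuation should be ignored for the comparison. The function names and the sample codes are made up for illustration.

```python
def cci_prefix(code):
    """Return the first three alphanumeric characters of a code, upper-cased."""
    cleaned = "".join(ch for ch in code if ch.isalnum()).upper()
    return cleaned[:3]


def flag_cci_range(record, low="2AA", high="2BX", n_vars=20):
    """True if any of I_CCI_1..I_CCI_n has a prefix between low and high."""
    for i in range(1, n_vars + 1):
        code = record.get(f"I_CCI_{i}") or ""  # missing or None counts as blank
        prefix = cci_prefix(code)
        # Plain string comparison; an assumption about how the range is meant.
        if len(prefix) == 3 and low <= prefix <= high:
            return True
    return False


# Made-up codes, used only to exercise the function.
sample = {"I_CCI_1": "1VG53", "I_CCI_2": "2.AN.70"}
print(flag_cci_range(sample))
```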

Similarly, consider the constructed variable

ICDF127 "Reported K40-K46: Hernia"

If any of the 25 ICD10 diagnosis variables (D_I10_1 through D_I10_25) in a record contains an ICD10 code that begins with the string K40, K41, K42, K43, K44, K45, or K46, that record will be flagged in variable ICDF127 as true: a Hernia was reported in at least one of the 25 ICD10 variables.
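The same test is easy to express in Python for anyone checking the flag against their own extract. The sketch below is an illustration, not the SPSS code used to build ICDF127. It assumes that a record is a dictionary keyed by variable name and that codes are stored as plain strings; the function name and the sample record are invented for the example.

```python
# K40 through K46, the prefixes listed for ICDF127 in the text.
HERNIA_PREFIXES = tuple(f"K{n}" for n in range(40, 47))


def flag_icd_prefixes(record, prefixes=HERNIA_PREFIXES, n_vars=25):
    """True if any of D_I10_1..D_I10_n starts with one of the prefixes."""
    return any(
        # str.startswith accepts a tuple and matches any of its members.
        (record.get(f"D_I10_{i}") or "").strip().upper().startswith(prefixes)
        for i in range(1, n_vars + 1)
    )


# Illustrative record: the hernia code sits in the third diagnosis variable.
sample = {"D_I10_1": "I10", "D_I10_3": "K409"}
print(flag_icd_prefixes(sample))
```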

Additionally, the revised data files contain counts of the number of diagnoses and interventions for each record on the file. This enables the user, for example, to distinguish "simple" cases (e.g., one diagnosis or intervention) from complex ones (however many the user defines as complex).
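A count of this kind can be sketched as follows. This is a rough Python illustration under the assumption that a diagnosis or intervention counts as reported whenever its variable is non-blank; the SPSS programs in the zip file may define the counts differently. The function name count_reported and the sample record are hypothetical.

```python
def count_reported(record, stem, n_vars):
    """Count the non-blank variables named stem1..stemN in a record."""
    return sum(
        1
        for i in range(1, n_vars + 1)
        if (record.get(f"{stem}{i}") or "").strip()  # blank or None is skipped
    )


# Illustrative record with two diagnoses and one intervention.
sample = {"D_I10_1": "K409", "D_I10_2": "I10", "D_I10_3": " ", "I_CCI_1": "2AN70"}
n_diagnoses = count_reported(sample, "D_I10_", 25)
n_interventions = count_reported(sample, "I_CCI_", 20)
print(n_diagnoses, n_interventions)
```

A user could then select "simple" cases with a condition such as n_diagnoses == 1, or apply whatever threshold they consider complex.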

The zip file contains ASCII (.rev) and SPSS system file (.sav) versions of the revised data files. It also contains the two SPSS programs used to reformat the clinical file, and the SPSS program used to reformat the geographic file. In addition, it contains the log files (in both ASCII text format and SPSS spv format) created from running the SPSS programs, and the frequencies of each variable (including the constructed flag and count variables). The frequencies of the 25 ICD10 diagnosis variables are contained in a single text file; those of the 20 CCI variables are contained in a second text file. The log files contain a write statement, which shows the record layout of the ASCII versions of the revised data files (hence the .rev extension).