Monday, February 9, 2009

Frequencies in the NLSCY

Question

I was just getting the NLSCY files for a graduate student here, and loading them into SPSS. As taught in the training sessions, I run a frequency with each file to see that it loaded correctly. In the course of doing this I encountered a couple of discrepancies.
Cycle 2, 10-13 file: I run a frequency and get a total of 4145 records. This matches the DLI web page, but the codebook indicates that there are 4498 respondents.
Cycle 3, 10-15 file: I run a frequency and get a total of 5539 records. Once again, this matches the DLI web page, but the codebook indicates that there are 6380 respondents.
Given that it is Friday afternoon after a long and busy week, it is quite possible that I am missing something obvious, and if I had to guess, it would be something about the weights. However, for the grad student's peace of mind, I thought that it would be prudent to get the explanation from a more reliable source.

Answer

The author division confirmed that the record counts you are getting are correct. The larger numbers given in the codebooks apply to the master files, not to the PUMF. Here is the verbatim response from the division:

"there is a discrepancy between the number of cases in the codebook and the pumf c2 self-complete file (10 to 13). There are no cycle 2 or cycle 3 pumf codebooks, the only codebooks available are ones based on the master file counts and frequencies. I have 4,145 records on the pumf c2 self-complete versus 4,498 records in the self-complete codebook. The number of cycle 3 records on the self-complete pumf is 5,539."

I hope this is clear and thanks for bringing it to our attention.
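
For anyone repeating this load check, the comparison can be written down as a small script. This is only a sketch in Python (the original check was done with a frequency run in SPSS). The file labels `cycle2_10_13` and `cycle3_10_15` and the function `check_load` are hypothetical names chosen for illustration; the counts themselves are the ones quoted in this post.

```python
# Record counts quoted in this post. The "codebook" figures describe the
# master files; the "pumf" figures are what the PUMF actually contains.
# The dictionary keys are hypothetical labels, not official file names.
COUNTS = {
    "cycle2_10_13": {"pumf": 4145, "codebook": 4498},
    "cycle3_10_15": {"pumf": 5539, "codebook": 6380},
}


def check_load(file_key, loaded_records):
    """Classify a loaded record count against the two published figures."""
    expected = COUNTS[file_key]
    if loaded_records == expected["pumf"]:
        return "ok: matches the PUMF count"
    if loaded_records == expected["codebook"]:
        return "matches the master-file codebook count, not the PUMF"
    return "mismatch: matches neither figure, check the load"


for key, counts in COUNTS.items():
    # How many more records the master-file codebook reports than the PUMF.
    gap = counts["codebook"] - counts["pumf"]
    print(key, "|", check_load(key, counts["pumf"]), "| codebook exceeds PUMF by", gap)
```

A count that matches the PUMF figure means the file loaded correctly, even though it is smaller than the number printed in the codebook.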
