Leiter Reports: A Philosophy Blog

News and views about philosophy, the academic profession, academic freedom, intellectual culture, and other topics. The world’s most popular philosophy blog, since 2003.

Brian Tamanaha’s Straw Men (Part 1): Why we used SIPP data from 1996 to 2011

BT Claim: 
We could have used more historical data without introducing continuity
and other methodological problems

BT quote:  “Although SIPP was redesigned in 1996, there
are surveys for 1993 and 1992, which allow continuity . . .”

Response: 
Using more historical data from SIPP would likely have introduced
continuity and other methodological problems

SIPP does indeed go
back farther than 1996.  We chose that
date because it was the beginning of an updated and revitalized SIPP that
continues to this day.  SIPP was
substantially redesigned in 1996 to increase sample size and improve data
quality.  Combining different versions of
SIPP could have introduced methodological problems.  That doesn't mean one could not do it in the
future, but it might raise as many questions as it would answer.

Had we used earlier
data, it would have been difficult to know to what extent changes in our earnings
premium estimates were caused by changes in the real world, and to what extent
they were artifacts of changes to the SIPP methodology.

Because SIPP has
developed and improved over time, the more recent data is more reliable than
older historical data.  All else being
equal, a larger sample size and more years of data are preferable.  However, data quality issues suggest focusing
on more recent data. 

If older data were
included, it probably would have been appropriate to weight more recent and higher
quality data more heavily than older and lower quality data.  We would likely also have had to make
adjustments for differences that might have been caused by changes in survey
methodology.  Such adjustments would
inevitably have been controversial.
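To make the weighting idea concrete, here is a minimal sketch using inverse-variance weighting, which is one common way to give noisier estimates less influence. It is an assumption chosen for illustration, not a procedure we used. The premium estimates and standard errors are invented and do not come from SIPP.

```python
def weighted_mean(estimates, weights):
    """Combine estimates, giving each one influence proportional to its weight."""
    total = sum(weights)
    return sum(e * w for e, w in zip(estimates, weights)) / total

# Hypothetical earnings premium estimates (older panel first, newer panel second).
# These numbers are invented for illustration only.
estimates = [50_000, 60_000]
# Hypothetical standard errors: the older, smaller panel is assumed to be noisier.
std_errors = [4_000, 2_000]

# Inverse-variance weights: a noisier estimate receives a smaller weight.
weights = [1 / se**2 for se in std_errors]

pooled = weighted_mean(estimates, weights)
simple_average = sum(estimates) / len(estimates)

print(f"simple average: {simple_average:,.0f}")
print(f"pooled (inverse-variance weighted): {pooled:,.0f}")
```

In this toy case the pooled figure sits much closer to the newer estimate than a simple average would. Any real weighting scheme would also need to account for the methodological changes between panels, which is where the controversy would arise.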

Because the sample
size increased dramatically after 1996, including a few years of pre-1996 data
would not provide as much new data or have the potential to change our
estimates by nearly as much as Professor Tamanaha believes.  There are also gaps in SIPP data from the
1980s because of insufficient funding.

These issues and the
1996 changes are explained at length in the Survey of Income and Program Participation User’s Guide.

Changes to the new
1996 version of SIPP include:

  • Roughly doubling the sample size
    • This improves the precision of estimates and
      shrinks standard errors
  • Lengthening the panels from 3 years to 4 years
    • This reduces the severity of the regression
      to the median problem
  • Introducing computer-assisted interviewing
    to improve data collection and reduce errors or the need to impute for missing
    data
  • Introducing oversampling of low-income neighborhoods
  • Instituting new income topcoding procedures
    with the 1996 Panel
    • This affects both means and various points in the distribution
    • Topcoding is done on a monthly or quarterly
      basis, and can therefore undercount end-of-year bonuses, even for those who
      do not have extremely high incomes year-round
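The link between sample size and standard errors in the first bullet can be sketched in a few lines. This assumes simple random sampling, which SIPP's complex survey design does not strictly follow, so treat it as a rough intuition only. The standard deviation and sample size below are hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean under simple random sampling."""
    return sd / math.sqrt(n)

SD = 30_000      # hypothetical standard deviation of earnings
N_OLD = 20_000   # hypothetical pre-redesign sample size, not the actual SIPP figure

se_old = standard_error(SD, N_OLD)
se_new = standard_error(SD, 2 * N_OLD)  # sample size roughly doubled

# Doubling n shrinks the standard error by a factor of 1/sqrt(2), about 0.71.
ratio = se_new / se_old
print(f"standard error shrinks to {ratio:.2%} of its earlier value")
```

Under these assumptions, doubling the sample cuts standard errors by roughly 29 percent, regardless of the particular values chosen for the standard deviation or the starting sample size.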

Most government
surveys topcode income data—that is, there is a maximum income that they will
report.  This is done to protect the
privacy of high-income individuals who could more easily be identified from
ostensibly confidential survey data if their incomes were revealed.

Because law
graduates tend to have higher incomes than bachelor’s degree holders, topcoding introduces
a downward bias to earnings premium estimates. Midstream changes to topcoding
procedures can change this bias and create problems with respect to consistency
and continuity.
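A toy example may help show why topcoding pushes premium estimates down. The incomes and the cap below are invented for illustration. They are not SIPP values, and the actual SIPP topcoding procedure is more involved than a single annual cap.

```python
def topcode(incomes, cap):
    """Replace every income above the cap with the cap itself."""
    return [min(x, cap) for x in incomes]

def mean(values):
    return sum(values) / len(values)

# Hypothetical annual incomes, invented purely for illustration.
bachelors = [40_000, 55_000, 70_000, 90_000, 120_000]
law_grads = [60_000, 90_000, 130_000, 200_000, 400_000]

CAP = 150_000  # hypothetical topcode, not an actual SIPP threshold

# Premium measured on the true incomes.
true_premium = mean(law_grads) - mean(bachelors)

# Premium measured after both groups are topcoded at the same cap.
topcoded_premium = mean(topcode(law_grads, CAP)) - mean(topcode(bachelors, CAP))

print(f"true premium:     {true_premium:,.0f}")
print(f"topcoded premium: {topcoded_premium:,.0f}")
```

Because only the higher-earning group has incomes above the cap in this example, the cap lowers the law graduates' mean while leaving the bachelor's mean untouched, so the measured premium falls. A different cap would produce a different amount of bias, which is why a midstream change in procedure complicates comparisons over time.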

Without going into
more detail, the topcoding procedure that began in 1996 appears to be an
improvement over the earlier topcoding procedure.

These are only a
subset of the problems that extending the SIPP data back past 1996 would have introduced.  For us, the costs of backfilling data appear
to outweigh the benefits.  If other
parties wish to pursue that course, we'll be interested in what they find, just
as we hope others were interested in our findings.
