Brian Tamanaha’s Straw Men (Part 1): Why we used SIPP data from 1996 to 2011

7/24/2013

Michael Simkovic

BT Claim:
We could have used more historical data without introducing continuity
and other methodological problems

BT quote: “Although SIPP was redesigned in 1996, there
are surveys for 1993 and 1992, which allow continuity . . .”

Response:
Using more historical data from SIPP would likely have introduced
continuity and other methodological problems

SIPP does indeed go
back farther than 1996. We chose that
date because it was the beginning of an updated and revitalized SIPP that
continues to this day. SIPP was
substantially redesigned in 1996 to increase sample size and improve data
quality. Combining different versions of
SIPP could have introduced methodological problems. That doesn't mean one could not do it in the
future, but it might raise as many questions as it would answer.

Had we used earlier
data, it would have been difficult to know to what extent changes in our earnings
premium estimates were caused by changes in the real world, and to what extent
they were artifacts of changes to the SIPP methodology.

Because SIPP has
developed and improved over time, the more recent data is more reliable than
older historical data. All else being
equal, a larger sample size and more years of data are preferable. However, data quality issues suggest focusing
on more recent data.

If older data were
included, it probably would have been appropriate to weight more recent and higher
quality data more heavily than older and lower quality data. We would likely also have had to make
adjustments for differences that might have been caused by changes in survey
methodology. Such adjustments would
inevitably have been controversial.
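
To make the weighting idea concrete, here is a minimal sketch of one conventional scheme, inverse-variance weighting, in which less precise estimates count for less. This is an illustrative assumption, not a method we used or endorse for SIPP; the estimates and standard errors below are hypothetical numbers, not SIPP results.

```python
def inverse_variance_weighted_mean(estimates, standard_errors):
    """Pool estimates, weighting each by 1 / (standard error squared)."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    weighted_sum = sum(w * e for w, e in zip(weights, estimates))
    return weighted_sum / sum(weights)

# Hypothetical earnings premium estimates (illustration only).
older_estimate, older_se = 50_000.0, 10_000.0  # noisier, older data
newer_estimate, newer_se = 60_000.0, 5_000.0   # more precise, newer data

pooled = inverse_variance_weighted_mean(
    [older_estimate, newer_estimate],
    [older_se, newer_se],
)
# The pooled value sits much closer to the newer, more precise estimate.
```

Even in this simple form, the choice of weights drives the pooled answer, which is one reason such adjustments would be open to dispute.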

Because the sample
size increased dramatically after 1996, including a few years of pre-1996 data
would not provide as much new data, or have the potential to change our
estimates by nearly as much, as Professor Tamanaha believes. There are also gaps in SIPP data from the
1980s because of insufficient funding.

These issues and the
1996 changes are explained at length in the Survey of Income and Program Participation User’s Guide.

Changes to the new
1996 version of SIPP include:

Roughly doubling the sample size
- This improves the precision of estimates and
  shrinks standard errors
Lengthening the panels from 3 years to 4 years
- This reduces the severity of the regression
  to the median problem
Introducing computer-assisted interviewing
- This improves data collection and reduces errors and the need to impute
  missing data
Introducing oversampling of low-income neighborhoods
- This mitigates response bias issues we previously discussed, which are most likely to affect the bottom
  of the distribution
Instituting new income topcoding procedures with the 1996 Panel
- This will affect both means and various points in the distribution
- Topcoding is done on a monthly or quarterly
  basis, and can therefore undercount end of year bonuses, even for those who are
  not extremely high income year-round
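
The link between sample size and standard errors in the first item above can be sketched with the textbook formula for the standard error of a sample mean, sd / sqrt(n). The numbers below are hypothetical and chosen only for illustration; they are not SIPP figures, and SIPP's complex survey design means real standard errors are not computed this simply.

```python
import math

def standard_error(sample_sd: float, n: int) -> float:
    """Standard error of a sample mean under simple random sampling."""
    return sample_sd / math.sqrt(n)

# Hypothetical values (illustration only).
sd = 40_000.0           # assumed standard deviation of earnings
n_before = 1_000        # assumed sample size before the redesign
n_after = 2 * n_before  # "roughly doubling" the sample

se_before = standard_error(sd, n_before)
se_after = standard_error(sd, n_after)

# Doubling n multiplies the standard error by 1 / sqrt(2), about 0.71.
shrink_ratio = se_after / se_before
```

Under these assumptions, doubling the sample cuts standard errors by roughly 29 percent rather than by half.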

Most government
surveys topcode income data—that is, there is a maximum income that they will
report. This is done to protect the
privacy of high-income individuals who could more easily be identified from
ostensibly confidential survey data if their incomes were revealed.

Because law
graduates tend to have higher incomes than bachelor's degree holders, topcoding introduces
downward bias to earnings premium estimates. Midstream changes to topcoding
procedures can change this bias and create problems with respect to consistency
and continuity.
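
A toy example may help show why topcoding biases the premium downward. The sketch below assumes the simplest possible form of topcoding, replacing every value above a cap with the cap itself; SIPP's actual procedures differ in detail. All incomes and the cap are hypothetical.

```python
from statistics import mean

def topcode(incomes, cap):
    """Replace every income above the cap with the cap itself."""
    return [min(x, cap) for x in incomes]

# Hypothetical annual earnings (illustration only, not SIPP data).
bachelors = [30_000, 45_000, 60_000, 80_000, 110_000]
law_grads = [50_000, 80_000, 120_000, 200_000, 400_000]
cap = 150_000  # hypothetical topcode threshold

true_premium = mean(law_grads) - mean(bachelors)
topcoded_premium = mean(topcode(law_grads, cap)) - mean(topcode(bachelors, cap))

# The cap trims only the higher-earning group here, so the measured
# premium falls short of the true one.
downward_bias = true_premium - topcoded_premium
```

In this toy data the true premium is 105,000 but the topcoded premium is 45,000. A different cap would produce a different shortfall, which is why a midstream change in topcoding rules can look like a change in the premium itself.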

Without going into
more detail, the topcoding procedure that began in 1996 appears to be an
improvement over the earlier topcoding procedure.

These are only a
subset of the problems that extending the SIPP data back before 1996 would have introduced. For us, the costs of backfilling data appear
to outweigh the benefits. If other
parties wish to pursue that course, we'll be interested in what they find, just
as we hope others were interested in our findings.

Guest Blogger: Michael Simkovic, Legal Profession, Of Academic Interest, Science, Weblogs

Leiter Reports: A Philosophy Blog
