When big data are bad data
As archaeologists turn increasingly to the analysis of large, systematic databases, we need to confront an epistemological problem: How do we identify bad data, and what can we do about it? Economic historians and others are becoming consumers of archaeological data, and they are quick to jump on new databases. They seldom ask about the quality of the data, and this can result in sophisticated analyses of bad data. But, as we all know, “Garbage in, garbage out.” I blogged about this a couple of years ago in reference to Tertius Chandler’s list of city sizes through history, from both archaeological and historical sources (link is here). Those data (Chandler 1987) are considered shockingly bad and worthless by most historical demographers and historians. In technical terms, they may be “bullshit” (see my post on bullshit). Yet some urban scholars merrily use the data for studies today. I consider this a real problem, and said so in my review of a manuscript for a journal ...