Working with biased data to understand the past | Archaeology

The ancient Greek philosopher Heraclitus, one of the great-great-grandfathers of modern science, wrote that “nature loves to hide.” In other words, understanding the workings of the natural world is hard.

As scientists, we ask questions about the world, collect data that we think are relevant to answering those questions, analyze the data, and propose tentative answers.

It sounds simple but it’s not, partly because nature loves to hide.

Matthew Purtill, an archaeologist with the Department of Geology and Environmental Science at the State University of New York at Fredonia, considers how sampling bias can hide data that are crucial for answering all sorts of questions. His paper was published online in September in the Journal of Archaeological Method and Theory.

Brad Lepper, the senior archaeologist for the Ohio History Connection's World Heritage Program.

Sampling bias refers to the difference between the total number of things you’re looking for (such as flint spear points from a particular time period) and the number that you’ve actually found. Obviously, you can never find every flint spear point in Ohio, but if your sample is broadly representative of all the places where nature has hidden such things, then you can make reliable generalizations about where the makers of the spear points chose to live and hunt.

But here’s where it gets hard. How can you know if your sample is even remotely representative?

Purtill mentions several forms of bias that can make it hard to get a representative sample. For example, you are more likely to find spear points in plowed fields because the plow can drag buried artifacts to the surface, making them easier to find.

Some parts of Ohio have more plowed fields than others. So if regions with intensive plowing have more spear points of a particular type than other regions, does that mean the ancient Indigenous culture that made them preferred those regions over others? Or could it be that those kinds of points are in the other regions too but simply haven’t been found yet?

We have lots of data that can reveal who lived where in ancient Ohio. The Ohio Archaeological Inventory is a record of all the documented archaeological sites in the state. If we want to know where a particular ancient Indigenous culture preferred to live, we could just plot all the documented sites where their spear points have been found and have our answer. Except that we know the sample of documented sites is biased by plowing and many other factors.

Purtill proposes a couple of more or less simple ways to deal with this problem. One approach involves using the absence of data from one time period “to evaluate if the absence of data from a second period reflects cultural decision making or modern bias.”

He demonstrates how this works by comparing the distribution of sites from three time periods: Paleoindian (11,000 to 8,000 BC), Early Archaic (8,000 to 6,000 BC), and Late Archaic (3,000 to 1,000 BC). Using various statistical analyses, he found that Paleoindian sites were significantly over-represented relative to Early Archaic sites in the southern Ohio Bluegrass region. He interpreted this as evidence for cultural decision-making that resulted in “a shift away from established upper Ohio River-to-Midsouth cultural ties” during the Paleoindian period “to increased regionalization along the Ohio River during the Archaic.”

Nature may love to hide, but science provides us with the tools that can reveal her hidden secrets.

Brad Lepper is the senior archaeologist for the Ohio History Connection’s World Heritage Program.

blepper@ohiohistory.org

This article originally appeared on The Columbus Dispatch: Column: Science provides tools to reveal nature's hidden secrets