Tuesday, November 18, 2008

Something About a Small Sample Size

So I haven't posted much of late because grad school keeps whooping my behind, and, well, medieval history and baseball don't seem to have a whole lot in common.

However, the other day, my professor did bring up a very interesting point:

In medieval history, statistics are often based on a small sample size (or what we baseball peoples would consider a small sample size), simply because it is the only sample available.

For example, for the Black Death of 1348-1350, mortality rates are estimated from monastic records, because they're the only such records that we can trust. The problem, obviously, is that monasteries are not representative of the population as a whole, especially given their living conditions and the close proximity their residents would likely have had to plague victims.

It would be similar to judging A-Rod by his stats against pitchers like Santana and Sabathia, to the exclusion of all other pitchers.

Small sample size is the bane of stats geeks everywhere. How do you define a small sample size? When do you cross the line from "small sample size" to "this is what a player really is"? As with anything to do with statistics, sample sizes can be molded to fit whatever purpose a writer, blogger or analyst has in mind.
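
For the stats geeks, here's a rough back-of-the-envelope sketch in Python of how much the uncertainty shrinks as the sample grows. It assumes every at-bat is an independent trial with the same hit probability (a big simplification) and uses the textbook normal approximation, which is itself shaky for very small samples. The function name and the hit totals for a .300 hitter are made up for illustration.

```python
import math

def batting_average_interval(hits, at_bats, z=1.96):
    """Rough 95% interval for a hitter's underlying hit rate.

    Assumes independent at-bats with a constant hit probability and
    uses the normal approximation to the binomial (hypothetical helper).
    """
    p = hits / at_bats
    margin = z * math.sqrt(p * (1 - p) / at_bats)
    return p - margin, p + margin

# The same .300 average observed over three made-up sample sizes.
for hits, at_bats in [(6, 20), (30, 100), (180, 600)]:
    low, high = batting_average_interval(hits, at_bats)
    print(f"{at_bats:>4} AB: {low:.3f} to {high:.3f}")
```

Under those assumptions, a .300 average over 20 at-bats is consistent with anything from about .100 to .500, while over 600 at-bats the range tightens to roughly .263 to .337. There is still no magic line where "small" stops, just a range that keeps narrowing.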

Yet in history--medieval history, especially--there is no choice but to use the small sample.

It will drive you crazy if you think about it too much (I know from experience), but it's still an interesting juxtaposition. In baseball, half the time we take only the stats we need to make our point...and in medieval history, we only wish there were more available.
