How to Make Truly Terrible Tables: A Tutorial by David Streiner

Part I: Be Too Accurate

 

No, your eyes are not deceiving you; the title of the blog has changed slightly, from “How to Make Truly Terrible Graphs” to “How to Make Truly Terrible Tables.” This reflects the fact that it is possible to screw up (am I allowed to say that? Let’s make it “have things go amiss”) in areas other than graph-making. So, in the next few blogs, we’ll turn our attention to making tables for papers and presentations. (As a woodworker, I’ve screwed up making other types of tables, but that discussion will have to wait for a different forum.) The second part of the title may also raise some eyebrows; how can you be too accurate? After all, the need for accuracy has been drummed into our heads since we were scientists-in-training, learning the rules of the game at our supervisor’s knee. Whether we’re using an extremely expensive piece of lab equipment or designing a new paper-and-pencil scale, the mantra is the same: reduce the error in order to improve the reliability of our measurements and increase the accuracy. So in presenting our results in a table, how can we be “too accurate?”

As a matter of fact, it’s quite easy; all we have to do is ignore the imprecision inherent in any measurement and just keep printing out all of those numbers to the right of the decimal point. For starters, let’s take a look at Table 1, presenting some basic demographic information for two groups in a study.

 

Table 1

Demographic Information

| Variable                | Group 1       | Group 2      |
|-------------------------|---------------|--------------|
| Number of males/females | 6/4           | 5/5          |
| Age in Years (SD)       | 38.25 (10.05) | 37.60 (9.90) |
| Education in Years (SD) | 13.45 (4.20)  | 12.90 (4.15) |

Starting off with Age, we report that it’s 38.25 years for the 10 people in Group 1. If we determined age by asking the people how old they were at their last birthday, then on average, there’ll be an error of about 180 days. For example, at my last birthday, I was 73 years old, but I’m actually 73 years, 8 months, and 12 days old on the day that I’m writing this. (For those who want to send cards or presents, my actual birth date is 12 November; my mailing address is available on request.) We can improve the accuracy by asking people how old they are as of their nearest birthday, but that decreases the error to “only” 90 days, on average. Now, just what does that ‘5’ in the second decimal place represent? Each unit in that place is 1/100th of a year, or 3.65 days. Given the degree of inaccuracy in how we measured age to begin with, can we really justify this degree of precision in reporting the results, especially given that there are only 10 people in the group? If just one person in the group were replaced with another who is one year older, the mean would rise by 0.1 year (slightly more than one month), changing the first decimal place from 2 to 3. To claim that we know the average participant’s age to within four days does violence to the data.
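For readers who like to check the arithmetic, here is a minimal Python sketch of the two calculations in the paragraph above. The function names are made up for illustration, and the 365-day year is an assumption implied by the 3.65-day figure.

```python
DAYS_PER_YEAR = 365  # assumption: the 3.65-day figure implies a 365-day year


def days_per_unit(decimals: int, days_per_year: float = DAYS_PER_YEAR) -> float:
    """Days represented by one unit in the given decimal place of a value in years."""
    return days_per_year / 10 ** decimals


def mean_shift(n: int, change: float) -> float:
    """How far the mean of n values moves when a single value changes by `change`."""
    return change / n


second_place_days = days_per_unit(2)  # one unit in the hundredths place
first_place_days = days_per_unit(1)   # one unit in the tenths place
shift_years = mean_shift(10, 1.0)     # swap in one person who is a year older

print(f"Hundredths place: {second_place_days:.2f} days per unit")
print(f"Tenths place:     {first_place_days:.1f} days per unit")
print(f"Mean shift:       {shift_years:.1f} year")
```

With 10 people in the group, a one-year change in a single participant moves the mean by a full unit in the first decimal place, which is why the second decimal place carries no usable information.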

In fact, that overestimation of the precision of the data pales in comparison to our estimate of the participants’ education. Because the school year is about 200 days long (and they often seemed like very long days), the last decimal place represents two days in class. Do you really think the data can support this degree of accuracy? I thought not.
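One rough way to put a number on what the data can support is the standard error of the mean, SD divided by the square root of n. The post itself does not use this calculation, so treat the sketch below as an illustration only. It takes the SDs from Table 1 and assumes n = 10, the size stated for Group 1; the function name is hypothetical.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)


GROUP_SIZE = 10  # stated for Group 1 in the post (6 males + 4 females)

se_age = standard_error(10.05, GROUP_SIZE)       # SD of age, Group 1
se_education = standard_error(4.20, GROUP_SIZE)  # SD of education, Group 1

print(f"Standard error of mean age:       {se_age:.1f} years")
print(f"Standard error of mean education: {se_education:.1f} years")
```

Both standard errors are larger than one year, so the uncertainty in each mean is hundreds of times bigger than the hundredths of a year printed in the table.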

If you think that these examples are fairly extreme, then (in the words of TV pitchmen), “But wait – there’s more!” I just checked a Web site for the population of Brazil, and the number it reported was 206,769,143. Seriously? Even if that’s based on some equation taking into account the estimated birth and death rates, let’s examine where the numbers came from. There first had to be a census to establish the baseline, and that data-gathering was likely spread out over many weeks or months, covering not only major cities but also remote villages buried deep in the Amazonian forest. During that time, some people were dying and others were being born.

And let’s not forget the words of Sir Josiah Stamp (1880-1941), a statistician and former Director of the Bank of England: “The government are very keen on amassing statistics. They collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of these figures comes in the first instance from the chowty dar [village watchman in India], who just puts down what he damn pleases.”

Then there is the birth rate, which is 14.46/1,000 population per year; for a population of this size, that works out to slightly over 340 new souls per hour! (The figure for deaths is about 154/hour.) So, that final “143” in the population estimate is wrong well within an hour of being written down. It would be far more “accurate” to say that the population is 206.8 million and leave it at that, indicating that the estimate is really just that – an estimate.
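The births-per-hour figure can be reproduced with a few lines of Python. This is a sketch under two assumptions: the birth rate is an annual rate, and a year has 365 days. The function and variable names are invented for the example.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day year


def events_per_hour(population: int, annual_rate_per_1000: float) -> float:
    """Convert an annual rate per 1,000 population into events per hour."""
    events_per_year = population * annual_rate_per_1000 / 1000
    return events_per_year / HOURS_PER_YEAR


population = 206_769_143  # the figure reported by the Web site
births_per_hour = events_per_hour(population, 14.46)

# Report the population in millions, to one decimal place.
population_millions = round(population / 1_000_000, 1)

print(f"Births per hour: about {births_per_hour:.0f}")
print(f"Population: {population_millions} million")
```

At several births every minute, the last three digits of the reported population go out of date almost as soon as they are published, while the rounded figure of 206.8 million stays correct for far longer.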

So remember, too much accuracy in a table is inaccurate.

 
