In 1977, the Eames Office produced a film for IBM called *Powers of Ten* (info at IMDB; available on YouTube) that clearly showed just how big big things are and just how small small things are. “Starting from a view of the entire known universe, the camera gradually zooms in [increasing the magnification by 10 between each image] until we are viewing the subatomic particles on a man’s hand.” (IMDB description, and a fine description it is.) It is still a great way to get a feel for the scale of things. Jambor’s essay has now alerted me to an interactive site that lets the user zoom in and out, with much greater resolution, exploring the different scales of the variety of microscopic things we think about so much these days. It is here: http://learn.genetics.utah.edu/content/cells/scale/. One excellent example provided is the size of 12 pt Times regular type (this post is written in 12 pt Times, but remember, you may not be viewing it at life size…). Have fun sliding down, and up, the scales.

Note: Some more information about how to make scale bars can be found in chapter 7 of Lab Math, and there is a discussion of scale bars on ResearchGate. Nowadays, the software used to take pictures includes an option to add a scale bar. Importantly, though, you must calibrate the software so that it has the correct information for your microscope, and you may need to input the magnification manually, usually via a drop-down menu. To confirm that you’ve got it right, take a picture of a ruler and put a scale bar on it; if the bar matches the ruler’s markings, you’ve got it right.

*Thanks again to Dr. Jambor for contacting me.

Check out this definition of standard (from http://www.oxforddictionaries.com/us/definition/american_english/standard): “An idea or thing used as a measure, norm, or model in comparative evaluations.” ‘Comparative evaluations’ is what I want to emphasize here – when you draw bars indicating the uncertainty in the data you collected, those bars should be comparable to everyone else’s bars. Standard error bars are not comparable, and they force your audience to do extra work to figure out what you found; how annoying! In contrast, standard deviations always mean exactly the same thing. How nice for your audience!

The first step of reporting any data set (collection of measurements) is to describe the distribution of your data. To do that, you first make a frequency plot – the x-axis shows the values of your measurements, the y-axis shows the number of times you got each of those values, like in figure 1. Then, you summarize the distribution by saying where the center is and how the measurements are spread out around that center point.
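A frequency plot is just a tally of how often each value occurs, which you can sketch without any graphing software. Here is a minimal example in Python; the measurements are made up for illustration:

```python
from collections import Counter

# Hypothetical measurements; substitute your own data
measurements = [3, 4, 4, 5, 5, 5, 6, 6, 7]

# x-axis: the values you measured; y-axis: how many times each occurred
frequencies = Counter(measurements)
for value in sorted(frequencies):
    print(value, "#" * frequencies[value])
```

Each row of # marks is one bar of the frequency plot, turned on its side.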

Important aside: Thinking in terms of distributions will help with doing statistical analysis, too. Unless you are using non-parametric statistics, the statistics you will use tell you about distributions, not absolute numbers. As smart as they are, even statisticians cannot predict your data. So in many ways, I advise thinking about the distribution of your data as soon as you possibly can.

Figure 1 shows identical frequency plots. Note, though, the scales of the y-axes have been changed to indicate different sample sizes; nevertheless, the distributions of the data points are exactly the same. If the distributions are identical, it follows that the description of the distributions should be identical. And the standard deviations are, indeed, identical: 1.6 and 1.6.

But look what happens to the standard error because of the difference in sample size: 0.3 vs 0.03 is a difference of an order of magnitude, even though the distributions are, you may remember, identical. Standard error is not a comparable evaluation. QED.
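The arithmetic behind that order-of-magnitude difference is easy to check. Here is a minimal sketch in Python; the SD of 1.6 comes from figure 1, while the sample sizes of 30 and 3,000 are assumptions chosen to reproduce the 0.3 vs 0.03 contrast:

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: the SD divided by the square root of n."""
    return sd / math.sqrt(n)

sd = 1.6  # identical spread for both distributions in figure 1

se_small = standard_error(sd, 30)    # smaller sample
se_large = standard_error(sd, 3000)  # 100x larger sample, same distribution

# Same SD, but the "error bars" differ by a factor of 10:
print(round(se_small, 2), round(se_large, 3))  # 0.29 0.029
```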

Here is a visual that shows what happens when the frequency distribution data are presented in summary form, the kind of figure you are more likely to see in a paper:

The data on the right may “look” better, but that kind of spin is frowned upon in science, since your audience will assume it describes the spread of your data, but it does not. The standard error is not standard.

I hope it is pretty clear at this point that the standard error *cannot* be a “standard” way to describe the distribution of your data. Did someone tell you it was OK or traditional to use standard error as long as you say what your sample size was? True, to a point, but is it OK to divide your uncertainty by 10 as long as you say you did it? I recommend going to that person and saying “I’m confused. You said to use the standard error, but this easy-to-understand article by the well-respected biostatistician David Streiner (Maintaining standards: differences between the standard deviation and standard error, and when to use each. Can J Psychiatry. 1996 Oct;41(8):498-502; https://www.ncbi.nlm.nih.gov/pubmed/8899234) says that is wrong.” It’s a teachable moment; question authority.

The distribution of your data was what it was – don’t make it look like you are trying to hide something: share it proudly and accurately using the agreed-upon standard. You surely worked hard enough to collect it. Also, to repeat myself, the international community of scientists has declared that the standard deviation is the correct way to report uncertainty; reporting standard error is like reporting length in cubits instead of meters, and that is just being ornery for no good reason.

So, where does that leave standard error?

There are two kinds of statistics: descriptive and inferential. Above, I’ve been pontificating about descriptive statistics – numbers that describe the distribution of the measures you actually made on your sample. *IF* your data are normally distributed, the mean and standard deviation are useful summaries of what you found; because they are standard, just two numbers give your audience an interpretable summary of your data.

Inferential statistics let you make inferences about the population from which the sample came. I think it is fairly intuitive that if you measure many more individuals (that is, your sample size is bigger), your estimate of the distribution of the entire population will get better and better. One way to think about this is to look at the extremes: if your sample size is 0, you will make an absolutely terrible estimate of the mean of the population. If your sample size equals the size of the population, your estimate will be perfect. In between, the bigger your sample size, the closer to perfection you get with your estimate of the whole population. Thus, it is when you are calculating inferential statistics that you should take into account the sample size.

One useful statistic to report when discussing your inferences about the population is the confidence interval. It tells your audience the range within which you believe the mean of the population would be found. As always with statistics (“statistics means never having to say you are sure”), you also tell your audience the degree of confidence you have in those intervals. To calculate confidence intervals, divide your standard deviation by the square root of the sample size then multiply that quotient by 1.96, if you want to indicate that you are 95% confident, or 2.58 if you are 99% confident. That quotient, for some reason, got a name: the standard error. In other words, standard error is just a rest stop on the road trip towards confidence intervals: you might be tempted to stop in for little chocolate donuts and coffee, but you really don’t want to linger there or brag about having been there. Just keep moving towards your goal.
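The recipe in that paragraph can be sketched in a few lines of Python; the mean, SD, and sample size below are hypothetical numbers chosen purely for illustration:

```python
import math

def confidence_interval(mean, sd, n, z=1.96):
    """Range likely to contain the population mean.
    z = 1.96 for 95% confidence, 2.58 for 99%."""
    standard_error = sd / math.sqrt(n)  # the "rest stop" on the road trip
    margin = z * standard_error
    return (mean - margin, mean + margin)

# Hypothetical sample: mean 38.0, SD 1.6, n = 30
low, high = confidence_interval(mean=38.0, sd=1.6, n=30)
print(round(low, 2), round(high, 2))  # 37.43 38.57
```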

I will end with a rule of thumb for interpreting graphs that (annoyingly) show standard error instead of standard deviation: in your head, double them, and that will give you a reasonable estimate of the 95% confidence intervals, although it will still leave you unclear about the data the authors collected. YOU will never make your readers do that, right?


It is called the Wason* 2-4-6 Task (I’ve seen it referred to as the 2-4-8 test). It is the best exercise I’ve ever seen for demonstrating the perils of confirmation bias. It also stimulates great conversations about the importance of controls, the careful examination of assumptions, the importance of negative results, and, the biggie, how critical it is to attempt to DISPROVE your hypotheses, not prove them. When I’ve done it with colleagues as well as students, it has also stimulated discussions about experimental design, different kinds of creativity, and how having multiple hypotheses can help prevent falling dangerously in love with one.

There are many versions on the web; I like this site:

https://explorable.com/confirmation-bias

It has a very nice explanation and a charming video. If you can, stop it before he gives the answer (at 2’55″) – see if you can guess the rule.

I cannot recommend this exercise highly enough. I do it with every new student who crosses my path, as well as friends and family (I am such a nerd). Everyone, without exception, finds it a fun and intriguing experience. And forevermore, you can help students realize when they are thinking in a biased way just by saying “2-4-6,” so it also provides a handy tool for reinforcing the ideas.
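If you want to run the task yourself, say in a classroom, here is a minimal sketch (spoiler alert: in the classic version, the hidden rule is simply “any increasing sequence of numbers”):

```python
def fits_rule(triple):
    """The classic Wason hidden rule: strictly increasing numbers."""
    a, b, c = triple
    return a < b < c

# The seed triple 2-4-6 fits, which tempts hypotheses like "add 2 each time".
# Only triples designed to DISprove that hypothesis expose the real rule:
print(fits_rule((2, 4, 6)))  # True:  consistent with "add 2"
print(fits_rule((1, 2, 3)))  # True:  so the rule is not "add 2"
print(fits_rule((6, 4, 2)))  # False: order is what matters
```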

Go forth and joyously spread the news of the Wason 2-4-6 Task!

*Peter Cathcart Wason, 1923-2003. Among many achievements, he coined the term “confirmation bias”.

I think there might be an error in the equation for converting RCF to rpm on page 140 of the second edition, hardcover.

Should the equation be:

rpm = (RCF / (r x 1.118 x 10^-6))^(1/2)

instead of 10^-5?

because the radius is measured in mm?

…

E. D.

Dear E. D.

Thank you for pointing out the issue. You are correct. The difference has to do with the units of radius.

If you look around, you will find that there is no convention for whether to report the radius of the rotor in mm or cm. Unfortunately, I didn’t make it clear that there are two versions in common use, and that they are both in the book specifically to show that. On page 139, the equation is written out correctly for mm, and it states explicitly that I mean the radius in mm. On page 140, I switched to cm, with only a parenthetical comment that I had done so. I really should make that more obvious. When using cm, the exponent is -5; when using mm, the exponent is -6.

One way to think about it is to imagine measuring the rotor in mm, then imagine measuring the exact same rotor in cm. The second measurement is going to be the first measurement divided by 10. But the RCF hasn’t changed. To take that “divide by 10” into account, therefore, you need to multiply by 10 somewhere, or you won’t get the same RCF. That “multiply by 10” gets folded into the constant, so the exponent becomes 10^-5.
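That “fold the 10 into the constant” argument is easy to verify in code. A minimal sketch in Python, using a hypothetical rotor radius of 85 mm (8.5 cm):

```python
def rcf_from_mm(rpm, radius_mm):
    """RCF using the millimetre form of the constant (exponent -6)."""
    return 1.118e-6 * radius_mm * rpm ** 2

def rcf_from_cm(rpm, radius_cm):
    """RCF using the centimetre form of the constant (exponent -5)."""
    return 1.118e-5 * radius_cm * rpm ** 2

# The same rotor, measured two ways, gives the same RCF:
print(round(rcf_from_mm(3000, 85)))   # 855
print(round(rcf_from_cm(3000, 8.5)))  # 855
```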


No, your eyes are not deceiving you; the title of the blog has changed slightly, from “How to Make Truly Terrible Graphs” to “How to Make Truly Terrible Tables.” This reflects the fact that it is possible to screw up (am I allowed to say that? Let’s make it “have things go amiss”) in areas other than graph-making. So, in the next few blogs, we’ll turn our attention to making tables for papers and presentations. (As a woodworker, I’ve screwed up making other types of tables, but that discussion will have to wait for a different forum.) The second part of the title may also raise some eyebrows; how can you be too accurate? After all, the need for accuracy has been drummed into our heads since we were scientists-in-training, learning the rules of the game at our supervisor’s knee. Whether we’re using an extremely expensive piece of lab equipment or designing a new paper-and-pencil scale, the mantra is the same: reduce the error in order to improve the reliability of our measurements and increase the accuracy. So in presenting our results in a table, how can we be “too accurate?”

As a matter of fact, it’s actually quite easy; all we have to do is ignore the imprecision inherent in any measurement and just keep printing out all of those numbers to the right of the decimal point. For starters, let’s take a look at Table 1, presenting some basic demographic information for a group in a study.

Table 1

Demographic Information

| Variable | Group 1 | Group 2 |
| --- | --- | --- |
| Number of males/females | 6/4 | 5/5 |
| Age in Years (SD) | 38.25 (10.05) | 37.60 (9.90) |
| Education in Years (SD) | 13.45 (4.20) | 12.90 (4.15) |

Starting off with Age, we report that it’s 38.25 years for the 10 people in Group 1. If we determined age by asking the people how old they were at their last birthday, then on average there’ll be an error of about 180 days. For example, at my last birthday I was 73 years old, but I’m actually 73 years, 8 months, and 12 days old on the day that I’m writing this. (For those who want to send cards or presents, my actual birth date is 12 November; my mailing address is available on request.) We can improve the accuracy by asking people how old they are as of their nearest birthday, which decreases the error to “only” 90 days, on average. Now, just what does that ‘5’ in the second decimal place represent? It’s 1/100^{th} of a year, or 3.65 days. Given the degree of inaccuracy in how we measured age to begin with, can we really justify this degree of accuracy in reporting the results, especially given that there are only 10 people in the group? If just one person in the group were replaced with another who is one year older, that would change the *first* decimal place from 2 to 3, a shift of slightly more than one month. To claim that we know an average participant’s age to four days’ accuracy does violence to the data.

In fact, that overestimation of the precision of the data pales in comparison to our estimate of the participants’ education. Because the school year is about 200 days long (and they often seemed like very long days), the last decimal place represents two days in class. Do you really think the data can support this degree of accuracy? I thought not.

If you think that these examples are fairly extreme, then (in the words of TV pitch men), “But wait – there’s more!” I just checked a Web site for the population of Brazil, and the number it reported was 206,769,143. Seriously? Even if that’s based on some equation taking into account the estimated birth and death rates, let’s examine where the numbers came from. There first had to be a census to establish the baseline, and that data-gathering was likely spread out over many weeks or months, covering not only major cities but also remote villages buried deep in the Amazonian forest. During that time, some people were dying and others being born. But let’s not forget the words of Sir Josiah Stamp (1880-1941), a statistician and former Director of the Bank of England: “The government are very keen on amassing statistics. They collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of these figures comes in the first instance from the *chowy dar* [village watchman in India], who just puts down what he damn pleases.” Then, the birth rate is 14.46/1,000 population, or slightly over 340 new souls *per hour*! (The figure for deaths is about 154/hour.) So, that final “143” in the population estimate is wrong within an hour of being written down. It would be far more “accurate” to say that the population is 206.8 million and leave it at that, indicating that the estimate is really just that – an estimate.
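Rounding to a defensible number of significant figures is mechanical enough to automate. A minimal sketch in Python; the choice of four significant figures is mine, matching the “206.8 million” suggestion:

```python
from math import floor, log10

def round_sig(x, sig):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0
    # How many decimal places (possibly negative) keep `sig` figures
    decimals = sig - int(floor(log10(abs(x)))) - 1
    return round(x, decimals)

population = 206_769_143         # spuriously precise census figure
print(round_sig(population, 4))  # 206800000, i.e. "206.8 million"
```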

So remember, too much accuracy in a table is inaccurate.


Statistics Commentary Series: Commentary #9—Sample Size Made Easy (Power a Bit Less So) JOURNAL OF CLINICAL PSYCHOPHARMACOLOGY · MARCH 2015 · DOI: 10.1097/JCP.0000000000000297 · http://www.researchgate.net/publication/273463222

The reason I like this analogy so much is that magnification is intuitively clear to just about anyone who has ever stood far away from something, then moved in for a closer look. We know perfectly well that what we are looking at isn’t changing, but by changing position so that we can gather more information, we become more sure of what we are seeing. By gathering more data (having a larger sample size), we can be more sure* (have better statistical significance) of what we are seeing.

*Remember, p-values, which are what most people mean when they refer to statistical significance, *only* tell you how likely a difference as big as the one you saw would be if the two treatments were actually the same (that is, the risk of a false positive), so the words “can be more sure” are on purpose *not* “can know.”


Analysis of Patriots’ pressures incorrect

(Thanks to Marc Abrahams for bringing this to my attention)


My lab has a new centrifuge that I recently needed to use for the first time. Like most centrifuges, you can set the rotations per minute (RPM) and the number of minutes; my protocol said ‘spin at 125 g for 6 minutes.’ Having used many centrifuges, and having written (in Lab Math) about the indefensible* conversion from RPM to g, I expected this. However, when I went to look at the conversion chart that I expected to find taped to the lid of the centrifuge, it wasn’t there. No one had made a spreadsheet to calculate the conversion from g’s to RPMs for frequently used values. No one had measured or looked up the radius of the rotor, or indicated whether it represented the distance to the middle or the tip of the holders. At least, no one had thoughtfully posted it in an obvious place for those who were to follow. So I went in search of someone who I knew had used this centrifuge to find out if this information was kept somewhere that I didn’t know about. The person I found to ask was trained in a lab at Yale. He was told, during his training, that 1.0 RPM (the numbers have been changed to protect the innocent) would give him the correct RCF, and since our centrifuge is about the same size as the one in the Yale lab, he just uses 1.0 RPM. Always.

SERIOUSLY? The equation is: RCF = RPM^{2} [min^{-2}] x Radius [mm] x 1.118 x 10^{-6}

It is multiplication! Granted, getting out a ruler and measuring from the center of the rotor to the tip of the holder can be physically exhausting and is best left to the young athletic types in your lab. No argument there. But *risk your experiments rather than do multiplication?* And he learned this at Yale? This is a very smart person, a very good scientist, yet the thought of doing multiplication is so distasteful, that he relies on a number he once heard from someone he considered trustworthy.
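For the record, the multiplication he was avoiding, solved for RPM since that is what the dial wants, looks like this in Python; the 85 mm rotor radius is hypothetical:

```python
import math

def rpm_for_rcf(rcf, radius_mm):
    """Invert RCF = RPM^2 x radius_mm x 1.118e-6 to get the dial setting."""
    return math.sqrt(rcf / (radius_mm * 1.118e-6))

# The protocol's "spin at 125 g" on a rotor with an 85 mm radius:
print(round(rpm_for_rcf(125, 85)))  # 1147
```

Taping a short table of these values to the centrifuge lid for your lab’s frequently used RCFs takes about two minutes.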

What is this ‘culture of equation avoidance’ doing to our scientists? He may someday find himself with a new centrifuge, of a different size, and his 1.0 RPM could lead him to bad data. Troubleshooting will be close to impossible, and he will abandon his beautiful experiment and not get a grant.

Please help him, and scientists like him. Change the culture. Use equations until it hurts.

* I object to the use of “g” as a unit for this purpose, although I appreciate that thinking in terms of our constant companion gravity is a comfort. The correct name for the parameter in question is Relative Centrifugal Field or RCF. Without regard for reality, however, RCF is traditionally reported in units of g. RCF has dimensions of length over time squared (L T^{-2}), which is mm/minutes squared in the above equation (rotation is dimensionless). RCF is determined entirely by the rotations per minute and the radius of the rotor. On the other hand, gravitational force has units of, surprise, force, i.e. Newtons, meaning its dimensions are mass x length / time squared (M L T^{-2}). What happens to the mass when you convert to RCF? Traditionally, they’re not telling. So, writing “an RCF of 125 g” is an abomination. However, I have gotten over this and moved on. Really.

In previous blogs, I described how to make terrible graphs using some of the features of leading graphing packages, such as pie charts and 3-D graphs. But this is unfair to users of other programs that do not offer these “enhancements” (yes, such programs do exist; in fact, I use them exclusively, except when I’m preparing talks for hospital and university administrators). “How,” I hear them cry, “can we too make truly terrible graphs?” Well, do not despair; help is at hand. In this blog, I will discuss a very easy way to turn a straightforward graph into a disaster.

The vast majority of graphs have two axes – the X-axis (abscissa) along the bottom and the Y-axis (ordinate) running along the left side. There can be variants of this, such as having a secondary Y-axis on the right, or having the Y-axis cross the X-axis in the middle, but these won’t change the basic message. Also in most cases, the Y-axis starts at zero and runs up (or down, in some cases) to the maximum. Simple as this seems, it leaves a lot of room for mischief. The best way to thoroughly distort what the data show is to have a “floating Y” – starting the axis at some point other than the natural base, which in most cases is zero. For example, let’s assume that the university’s president is trying to justify his request for an (obscenely high) increase to his (already obscenely high) salary because his workload has gotten so much heavier over the past few years. To bolster his case, he presents the following graph to the board of governors:

Wow! Look at the increase. Of course we have to reward him (although we could ask why he’s still working a shorter week than mere mortals). But wait a second – the Y-axis doesn’t start at zero; it’s floating up there with a minimum of 30. What would the graph look like if it did start at zero?

That’s more like it and just as we suspected; that “increase” is barely perceptible without a microscope. Shrinking the range of the Y-axis magnifies small differences.

You may object to this graph on esthetic grounds: most of it – the area below 30 – is blank, so why waste space showing nothing? That’s a valid point. There are times when it doesn’t make sense to start at zero. In these instances, the honest thing to do is at least alert the reader to that fact by making a break on the axis, like this:

Note that we’ve made a bit of a compromise; there’s less empty real estate, but the increase appears a bit more extreme than it actually is. We’ll discuss in a bit how to determine if there’s too much of a distortion.

Lest you think that exaggerating differences by having a floating Y-axis is restricted to unscrupulous administrators (if that isn’t a redundancy), here’s a graph taken from an article purportedly showing that the risk of suicide is reduced by attending religious services (Kleiman & Liu, 2014).

For those of you who are unfamiliar with survival analysis, the left axis, “Survival function,” shows the probability of being alive after a given time for each of the two groups.

Again, the first reaction is Wow! Maybe we should all think of attending services a couple of times a week, if not every day, and that’ll really reduce our risk. But let’s take a closer look at the Y-axis. The bottom is not at zero, but at 0.9990. In other words, the entire range is 0.001 rather than 1.0. That “difference” between the groups is actually 0.9998 versus 0.9992 over an 18-year span. I tried plotting it with a true zero, and the lines were perfectly flat and superimposed on one another, as was the case when starting the axis at 0.80 and at 0.90. In fact, I couldn’t see any light between them until I did:

Even here note that the axis extends only from 0.98 to 1.00. Kinda sorta makes you want to reconsider how you spend your weekends, at least insofar as preventing suicides is concerned.

So, how can you tell if a graph is misrepresenting what’s really going on? You can use the Graph Distortion Index (GDI) proposed by Beattie and Jones (1992). It’s defined as the percentage change depicted in the graph, divided by the actual percentage change in the data, minus 1.

In the first graph, the president’s change in time looks like a 350% increase (from 20% up the Y-axis to 90% up the axis), whereas the actual increase is 15.6%. So, plugging those numbers into the equation we get (350/15.6) – 1 = 21.4, which is more than a bit higher than the recommended maximum of 0.05.
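The index itself is one line of arithmetic; here is a minimal sketch using the numbers from the president’s graph:

```python
def graph_distortion_index(depicted_pct_change, actual_pct_change):
    """Beattie & Jones (1992): how exaggerated the plotted change is.
    0 means no distortion; the recommended maximum is 0.05."""
    return depicted_pct_change / actual_pct_change - 1

# A 350% apparent increase standing in for a 15.6% actual increase:
print(round(graph_distortion_index(350, 15.6), 1))  # 21.4
```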

Remember what we said in an earlier blog: the main purpose of a graph is not to present numbers, but to allow the viewer to get an immediate visual impression of what’s going on. So don’t despair; even if you can’t make pie charts or 3-D graphs, you can still really distort the data by using a floating Y-axis.

References

Beattie, V., & Jones, M. (1992). The use and abuse of graphs in annual reports: Theoretical framework and empirical study. *Accounting and Business Research, 22*, 291–303.

Kleiman, E. M., & Liu, R. T. (2014). Prospective prediction of suicide in a nationally representative sample: Religious service attendance as a protective factor. *British Journal of Psychiatry, 204*, 262-266.
