David Streiner, my favorite statistician (if that’s not obvious from other posts here), has been writing short pieces (that I wish I could have written) about important but often misunderstood or forgotten statistical topics. I just read one, about sample size and power, that makes these two uncomfortable topics very clear. In particular, he uses a fabulous analogy to illustrate why sample size affects statistical significance, and I quote:

### “In general terms, the sample size of a study is analogous to the magnification of a microscope. The smaller the object you are studying, the greater the magnification you need. Analogously, the smaller the effect you wish to detect, the larger the sample size you will need.”

Statistics Commentary Series: Commentary #9—Sample Size Made Easy (Power a Bit Less So) JOURNAL OF CLINICAL PSYCHOPHARMACOLOGY · MARCH 2015 · DOI: 10.1097/JCP.0000000000000297 · http://www.researchgate.net/publication/273463222

The reason I like this analogy so much is that magnification is intuitively clear to just about anyone who has ever stood far away from something, then moved in for a closer look. We know perfectly well that what we are looking at isn’t changing, but by changing position so that we can gather more information, we become more sure of what we are seeing. By gathering more data (having a larger sample size), we can be more sure* (have better statistical significance) of what we are seeing.

*Remember, p-values, which are what most people mean when they refer to statistical significance, *only* tell you the probability that you have incorrectly found a difference between two treatments (a false positive), so the words “can be more sure” are on purpose *not* “can know.”
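To make the microscope analogy concrete, here is a minimal sketch of how the p-value for the *same* observed difference shrinks as the sample grows. The setup is an assumption made for illustration, not something taken from Streiner's commentary: two equal-sized groups, a known standard deviation of 1, and a simple two-sided z-test. The function name `two_sided_p` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(d, n_per_group):
    """Two-sided p-value for an observed standardized difference d
    between two equal-sized groups (z-test, standard deviation assumed to be 1)."""
    # Standard error of a difference in means is sqrt(2 / n), so z = d * sqrt(n / 2).
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same observed difference (d = 0.2), viewed at three "magnifications".
for n in (50, 200, 800):
    print(n, round(two_sided_p(0.2, n), 4))
```

Under these assumptions, a difference of 0.2 standard deviations gives p of roughly 0.32 with 50 per group and roughly 0.046 with 200 per group. The thing being looked at has not changed; only the amount of information about it has.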

P-values do not “tell you the probability that you have incorrectly found a difference between two treatments”. The P-value is a conditional probability, and you have failed to specify the condition. If you would like to know the probability that the difference you have found is “correct”, you need to evaluate the P-value in the context of the prior and the power. To use your magnification analogy: if you are looking for something small and you think you can see it with a microscope, but that something is not actually there, it does not matter how much magnification you use. If you think you see what you are looking for, you are wrong.
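One way to read "in the context of the prior and the power" is through Bayes' rule applied to a significant result. The sketch below is an illustration under stated assumptions, not the commenter's own calculation: `prior` is the assumed probability that a real effect exists before the study, `power` is the chance of a significant result when it does, and `alpha` is the chance of a significant result when it does not. The function name is hypothetical.

```python
def prob_finding_is_real(prior, power, alpha=0.05):
    """Probability that a statistically significant result reflects a real effect.

    prior: assumed probability, before the study, that the effect exists
    power: probability of a significant result when the effect exists
    alpha: probability of a significant result when it does not
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


for prior in (0.5, 0.1, 0.0):
    print(prior, round(prob_finding_is_real(prior, power=0.8), 3))
```

With 80% power and alpha of .05, a significant result is real about 94% of the time when the prior is 0.5, but only 64% of the time when the prior is 0.1. With a prior of 0 the answer is 0 regardless of power, which is the point about the microscope: if the thing is not there, anything you "see" is a false positive.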

Paul Pharoah raises two important points. First, it is well known that null hypothesis significance testing (NHST) doesn’t really tell us what we want to know. As Jacob Cohen put it in his delightfully titled article, “The earth is round (p < .05)” (Am Psychol, 1994, 49(12), 997-1003), what we want to know is the probability that the null hypothesis is true, given the data. What we actually get, though, is the probability of the data (or data more extreme), given that the null is true. Problems with NHST have been highlighted since 1938 (Berkson J. Some difficulties of interpretation encountered in the application of the chi-square test. J Am Stat Assoc. 1938;33:526-542), but the reality is that those of us who are not Bayesians have been using it for going on a century, and it has served our purposes.

His other point is that a microscope doesn't help if the phenomenon isn't there. There's no way to argue with this. What this means is that we have to determine the magnification (or sample size) a priori, based on our best guess of the phenomenon's size. Then, if we don't see it, it likely ain't there.
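As a rough illustration of choosing the "magnification" in advance, the sketch below uses the common normal-approximation formula for comparing two group means, n per group = 2(z₁₋α/₂ + z_power)² / d², where d is the guessed effect size in standard deviation units. This is a textbook approximation chosen for the example, not necessarily the method used in the commentary, and an exact t-test calculation gives slightly larger numbers. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a standardized
    difference d with a two-sided, two-sample z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Smaller guessed effects call for more "magnification".
for d in (0.8, 0.5, 0.2):
    print(d, n_per_group(d))
```

Under this approximation, an effect of 0.8 standard deviations needs about 25 people per group, 0.5 needs about 63, and 0.2 needs about 393. If a study of that size finds nothing, the chance of having missed an effect as large as the one guessed is 1 minus the power, here 20%.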