Lab Math » Numerical Op-Ed

A Mathematical Model Makes Predictions. That’s it.

admin_tr — Wed, 04 Apr 2018 17:29:06 +0000

It bugs me no end when I hear “the model shows…” In science, evidence is the only thing that “shows” anything. The only thing worse is “the model proves…” That’s like asking for directions, then turning to your passenger and saying “that proves it: the only way to get from here to there is that way.”

I first heard of Solomon Wolf Golomb (1932-2016) a few years ago. He was a mathematician, engineer, and professor of electrical engineering at the University of Southern California. He introduced polyominoes, the shapes that inspired Tetris, among other things. Here are four quotes from Dr. Golomb that I think are very useful when thinking about models, mathematical and otherwise.

Don’t apply a model until you understand the simplifying assumptions on which it is based and can test their applicability.

The purpose of notation and terminology should be to enhance insight and facilitate computation – not to impress or confuse the uninitiated.

Don’t expect that by having named a demon you have destroyed him.

Distinguish at all times between the model and the real world. You will never strike oil by drilling through the map!

Image from: https://mathmunch.org/2016/05/05/solomon-golomb-rulers-and-52-master-pieces/

Check out the man himself at:

https://youtu.be/DZ24iQ26mis

Teach about confirmation bias. Please.

Dany Adams — Mon, 12 Sep 2016 19:42:22 +0000

I am on a mission to spread the news of this transformational (yes!) exercise for all teachers of all subjects, but especially scientists.

It is called the Wason* 2-4-6 Task (I’ve seen it referred to as the 2-4-8 test). It is the best exercise I’ve ever seen for demonstrating the perils of confirmation bias. It also stimulates great conversations about the importance of controls, the careful examination of assumptions, the importance of negative results, and, the biggie, how critical it is to attempt to DISPROVE your hypotheses, not prove them. When I’ve done it with colleagues as well as students, it has also stimulated discussions about experimental design, different kinds of creativity, and how having multiple hypotheses can help prevent falling dangerously in love with one.

There are many versions on the web; I like this site:

https://explorable.com/confirmation-bias

It has a very nice explanation and a charming video. If you can, stop it before he gives the answer (at 2’55″) – see if you can guess the rule.

I cannot recommend this exercise more highly. I do it with every new student that crosses my path, as well as friends and family (I am such a nerd). Everyone, without exception, thinks it is a fun and intriguing experience. And forevermore, you can help students realize when they are thinking in a biased way just by saying “2-4-6” so it also provides a handy tool for reinforcing the ideas.

Go forth and joyously spread the news of the Wason 2-4-6 Task!

*Peter Cathcart Wason, 1924-2003. Among many achievements, he coined the term “confirmation bias”.

The Patriots, the Nobel Laureate, and the power of uncertainty

Dany Adams — Thu, 14 May 2015 20:07:16 +0000

Chemistry Nobel Laureate Roderick MacKinnon has done a wonderful (by which I mean numerically sound) analysis of the analysis of the Patriots’ footballs. This is yet another example of the cost of not understanding uncertainty: was it $2 million? If Brady would like me to teach him, I’ll take a mere half of that.

Analysis of Patriot’s pressures incorrect

(Thanks to Marc Abrahams for bringing this to my attention)

Effect and substance, not p

Dany Adams — Fri, 05 Dec 2014 21:58:55 +0000

There is an excellent resource by Paul Ellis at:

http://effectsizefaq.com

where I just clicked on

http://effectsizefaq.com/2010/05/30/how-do-researchers-confuse-statistical-with-substantive-significance/

This page talks about the problem with p-level being the be-all and end-all of way too many scientific studies. You might be aware that there is discussion about this in the scientific literature. (I think statisticians deserve prizes for still trying to get the rest of us to pay attention.) The problem is that the ONLY thing the p-level tells you is how probable a result at least as extreme as yours would be if the null hypothesis were true. That’s why small is good: if you reject the null only when p falls below a threshold you chose in advance, you keep the probability of a false positive as small as you want it to be. And that is all it does. Nothing else. Take a moment, if you will, to consider all the other ways you could be wrong: false negative, wrong question, wrong control, etc. Most importantly, it does not tell you if your result is significant in any scientifically meaningful way.

This is what Paul Ellis is talking about when he uses the vocabulary “substantive significance” and the link above goes right to the heart of the matter: researchers are confusing statistical significance with substantive significance, and journals are letting them get away with it. In my ideal world every result comes with, at least, the standard deviation, and the four things noted below: alpha, beta, sample size, and effect size, that last being accompanied by some sentences describing why that effect size was chosen.

You may be lucky enough to have as big a sample size as you want, but you still must use your brain to decide what matters, to design an experiment that actually answers your question, and to do appropriate controls so that you can make interesting comparisons. A large sample size may allow you to find very small differences, but if the differences are that small, do they matter? They very well might, but you must think that through; no statistic can do that for you.
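To make the sample-size point concrete, here is a minimal sketch, assuming a two-sample z-test with equal group sizes and known variance. The function name and the 0.02 effect are hypothetical choices for illustration, not anything from Paul Ellis's site. It holds a tiny standardized difference fixed and shows how the p-level changes as the sample grows.

```python
from math import erfc, sqrt

def two_sided_p(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test when the observed
    standardized difference between the group means equals effect_size."""
    z = effect_size * sqrt(n_per_group / 2)
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return erfc(abs(z) / sqrt(2))

# Hypothetical effect: the means differ by 2% of one standard deviation.
tiny_effect = 0.02
for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n:>9,}   p = {two_sided_p(tiny_effect, n):.3g}")
```

The difference between the groups never changes in this sketch; only the sample size does. Whether a difference that small matters is still a question for your brain.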

There is, however, a numerical representation of that “minimum important difference” called the effect size. The best way to plan an experiment is to decide in advance on the effect size that you, with your brain, think is important enough to be worth detecting, and decide in advance how low you want the probability of a false positive to be (alpha, the threshold your p-level has to beat), and decide how low you want the probability of a false negative to be (beta), and decide on the most information-packed way to measure, which includes deciding on the appropriate statistical test to use, THEN calculate the sample size and stick to it.
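As a rough sketch of that last step, the usual normal-approximation formula for comparing two group means can be written in a few lines. The assumptions here are mine, not the post's: the effect size is a standardized difference (Cohen's d), the test is two-sided, the groups are the same size, and the function name is made up. An exact calculation based on the t distribution will ask for slightly more.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size: float, alpha: float = 0.05, beta: float = 0.20) -> int:
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means, using the normal approximation."""
    z = NormalDist().inv_cdf          # quantile function of the standard normal
    z_alpha = z(1 - alpha / 2)        # guards against false positives
    z_beta = z(1 - beta)              # guards against false negatives
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Medium effect (d = 0.5), alpha = 0.05, beta = 0.20 (power = 0.80)
print(n_per_group(0.5))  # → 63
```

Halving the effect size you care about roughly quadruples the sample you need, which is one more reason to pick that effect size thoughtfully before you start.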

To learn more about effect size, go to effectsizefaq.com and read all about it.

Graphing advice

Dany Adams — Fri, 26 Sep 2014 14:01:48 +0000

How to Make Truly Terrible Graphs: A Tutorial

David L. Streiner, special guest contributor and co-author of excellent statistics texts

Part 1 – Introduction

In 1968, when I was writing up my doctoral thesis, I needed to make some graphs showing how the different groups changed over time under various conditions. There were no computer programs to draw graphs (indeed, there were no such things as desk-top computers back then), so I had to draw the lines by hand, using special pens and ink, and the symbols and letters were added by rubbing them off special sheets of transfer paper. It took an entire day or more to make a single graph, and few people had the ability to do them (I had the advantage of training in engineering, and having spent five summers working as a draftsman). Consequently, there were relatively few graphs in journals, and those which did appear were simple black and white line charts or bar graphs. Researchers had very little ability to screw things up.

Nowadays, every computer comes equipped with at least one, and often two, graphing packages, and they allow the user to add a host of special effects – being able to make the graphs look three-dimensional, to have pie charts with segments highlighted by separating them from the rest of the pie, or to use bars of different shapes and colors. Even more options are available if the graphs will be used during a live presentation: you can use many different fonts and colors; text can fly in and out from any direction; and you can add logos from your university, your research unit, and the funding agency at the bottom of every slide. This is in addition to pictures of leaves or keys or some other totally irrelevant (but cutesy) graphic running down the left side. In other words, users are able to screw up graphs in ways that were previously unimaginable.

Unfortunately, few people know how to take full advantage of these features in order to draw truly terrible graphs. Over the next few months, I hope to remedy this parlous situation in this blog and teach you how you, too, can make graphs as bad as those that grace the pages of many daily newspapers and popular magazines. As an added bonus, I will also show you how to make tables that are unnecessarily dense, obscure, and confusing.

The first lesson in bad graphing, and the focus of this blog, is to fail to differentiate between the purpose of a graph and that of a table. Take a look at the graph below. It shows the expenditure per acute hospital bed in seven regions of a province in Canada. Now imagine you’re sitting in a darkened auditorium and this is on the screen for 30 seconds. So look at it for a while and then close your eyes. Now tell me: What was the average for the entire province? What was the expenditure in region C? Which region had the highest expenditure? (Actually, if you can read these questions, you’re cheating, because your eyes must have been open to do so.)

I’m willing to bet that if you didn’t peek, you’d have trouble answering the first two questions, but may be able to answer the third. It’s simply impossible to remember all those numbers, except perhaps that they’re somewhere in the range of $60,000 to $80,000, and a lot easier to pick up the fact that Region B is the highest. This illustrates the major difference between a table and a graph – the former is better for presenting numbers and the latter for showing relationships. You may be able to get away with a graph such as this one on the printed page, where the reader has the luxury of staring at it as long as he or she wants, but it would be a disaster if it were shown during a talk. There’s just too much information for the viewer to absorb in just 30 seconds or so. If you want the audience to come away with the message that there are large differences among the regions (and that’s all they will remember one hour later), then kill the numbers.

In fact (and jumping ahead a bit), we can make the message even stronger. Because Region is a nominal variable, the order doesn’t matter, so let’s make the audience’s task easier by rank ordering the regions, and we get a graph like this one:

Now the message comes through loud and clear – there are large differences among the regions, where B is the clear winner and E gets shafted. So the take-home messages are: (1) be clear what you want to communicate, (2) use tables to show numbers, (3) graphs should be used to show relationships, and (4) do everything possible to make it easy for the audience.
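For readers who build their charts in code, the rank-ordering step might be sketched as follows. The dollar figures are invented for illustration; only the facts that B is highest, E is lowest, and everything sits between roughly $60,000 and $80,000 come from the text above.

```python
# Hypothetical expenditure per acute hospital bed, in dollars (invented values).
expenditure = {
    "A": 68_000, "B": 80_000, "C": 71_000, "D": 64_000,
    "E": 60_000, "F": 75_000, "G": 66_000,
}

# Region is nominal, so we are free to order the bars by value, not by label.
ranked = sorted(expenditure.items(), key=lambda item: item[1], reverse=True)

# Crude text bar chart: one '#' per $2,000, and no numbers on the bars.
for region, dollars in ranked:
    print(f"Region {region}  " + "#" * (dollars // 2_000))
```

The same sorted list can be handed to whatever graphing package you use; the point is only that the sort happens before the plot.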

Kill My Book

Dany Adams — Mon, 25 Aug 2014 20:49:44 +0000

Flashy new techniques get a lot of press, sometimes deservedly so: technical breakthroughs often lead to breakthroughs in understanding as well. But in the struggle for game-changing insights and the fame (funding) they bring, the tried, and more importantly true, gets lost in the shuffle. The person who has to teach the intro class is pitied, and it is considered mind numbing or remedial to cover the fundamentals. It is an unchallenged truth that science writing will be terrible. In the more expensive schools undergraduates are using the PCR machine but they don’t know how to calibrate the pH meter. Graduate students are learning to program mathematical modeling software but need an online program to convert units. The fact that Lab Math fills a genuine need is great for me personally, but a sad comment on something. High school science education? Parents confusing “best” with “newest”? Software that claims to perform critical analysis?

I happen to think that advertising is the root of all evil[1]. The word “smart” now applies to phones; need I say more? To sell a product requires convincing a buyer that this product offers something that product does not and you need that thneed[2]. Whether that something is useful or good is rarely discussed and certainly not by the salespeople. Plus we all like shiny new things. New math anyone?

Most of the fundamentals, like multiplying fractions, using a pipet, reading a graduated cylinder, and matching your predicate to your subject, are forgotten in our excitement over ANOVAs and digital qPCR machines and telling everyone what we did. Perhaps it should be reasonable to assume that students got those fundamentals in high school, grade school, or in utero. Unfortunately we make that assumption at the peril of our experiments. The wrong pH can really mess you up and it will be almost impossible to discover what went wrong or, worse, that something did go wrong. Even if your students took the classes and aced the tests, it is likely that the skills were forgotten, or deemed useless, before your student realized that Science was the best career on the planet. Plus the students, especially the A students, either don’t know they don’t know[3], or won’t admit they don’t know. Students are often embarrassed to ask how to use the tools, or they assume they don’t need to ask. So everyone thinks they know how to pipet and how to write, and here we are, publishing p-levels while leaving out the sample size, the effect size, and the power, and reviewing manuscripts that take forever to read because the writer’s meaning is so well hidden. The public doesn’t have a chance, and science writing is now a specialty that, while it is well written, often garbles some of the facts, or misses the important ones. And I know good scientists who don’t know that an outlier is not just something that looks different; there is an actual calculation involved[4].
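One common version of that calculation, and the one behind the whiskers on most box plots, is Tukey's rule: anything more than 1.5 interquartile ranges beyond the quartiles gets flagged. The sketch below assumes that rule (other definitions exist), uses a made-up function name, and runs on hypothetical pH readings.

```python
from statistics import quantiles

def tukey_outliers(values, k=1.5):
    """Return the values lying more than k interquartile ranges
    below the first quartile or above the third quartile."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical pH readings of the same buffer
ph_readings = [7.1, 7.2, 7.0, 7.3, 7.2, 7.1, 8.9, 7.2]
print(tukey_outliers(ph_readings))  # → [8.9]
```

Note that quartile conventions differ between software packages, so the fences can shift slightly from one tool to another.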

My suggestion is that everyone, regardless of whether they believe it to be true, announce: “I do not write, or measure, or calculate as well as I could.” Spend time with your students actually reading the manuals of your tools before using them – even pipettes have directions. Have a journal club in which you read The Science of Scientific Writing by Gopen & Swan[5] and Strong Inference by John Platt[6]. Work through Biostatistics: The Bare Essentials or PDQ Statistics by Norman and Streiner[7]. That knowledge is the foundation you must have, and maintain, so that you can build a new paradigm with your creativity and your novel insights. And the students will really appreciate being taught stuff without having to ask. Do this for the whole field: work on your writing and your arithmetic skills. Help put Lab Math on the remaindered list.

[1] The irony of that sentence appearing on a blog that is, at least in part, advertising for my book, has not escaped my notice.

[2] See: Geisel, Theodor Seuss (1971) The Lorax. New York. Random House

[3] See: Kruger J. and Dunning D. (1999) Unskilled and unaware of it: how difficulties in recognizing one’s own incompetence lead to inflated self-assessments. J. Pers. Soc. Psychol. 77(6):1121-34.

[4] I will take this opportunity to say how delighted I am to see box plots again!

[5] Gopen G. and Swan J. (1990) The Science of Scientific Writing. American Scientist. https://www.americanscientist.org/issues/pub/the-science-of-scientific-writing

[6] Platt, J.R. (1964) Strong Inference. Science. 146(3642):347-353. http://pages.cs.wisc.edu/~markhill/science64_strong_inference.pdf

[7] Norman, G.R. & Streiner, D.L. (2014) Biostatistics: The Bare Essentials 4th ed. Hamilton, Ontario, Canada. B.C. Decker Inc. –OR– Norman, G.R. & Streiner, D.L. (2003) PDQ Statistics, 3rd ed. Hamilton, Ontario, Canada. B.C. Decker Inc.

Bayes explained very nicely

Dany Adams — Tue, 21 Jan 2014 19:18:11 +0000

http://meandering-through-mathematics.blogspot.com/2011/05/bayesian-theory.html

I found this link to be a very helpful description of Bayes’ theorem.

Great Stats Blog Site

Dany Adams — Thu, 16 Jan 2014 18:57:25 +0000

A recent post on the Simply Statistics blog takes on a sort-of-hot topic in statistics: what errors actually matter, and how are they best quantified and reported when you are using statistics to infer something about a population. Best, in this case, means best at making accurate predictions. The two camps are the Frequentists and the Bayesians. (I gather from reading a bit that the debate had actually settled down until Nate Silver brought it up in his book The Signal and the Noise.) Note – the disagreement is not about descriptive statistics, it is about inferential statistics, so don’t worry if you are committed to box plots, frequency distributions, and/or mean and standard deviation; they are very good for describing data.

The two approaches differ in what you are comparing your results to. All interpretations are comparisons, implicit or explicit, so what you compare your results to matters. In one camp, you have people comparing their measurements to the null hypothesis, which is that variation among your measurements arose due to random, natural variation in the measurand (the thing that is measured). The other camp includes in its comparisons previous measurements of the measurand. It does this by including consideration of what they call “priors.”

My “I’m-not-a-statistician-but-I-know-what-I-like” point of view on this is that each is good for particular things, which is why both approaches continue to be used. For example, if you cannot find a relevant prior, you can’t take it into account, so you have to compare your result to the null hypothesis. As with many things, for example choosing effect size, your educated judgment has to inform the design of your analysis and thus the design of your experiments.
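Here is a toy sketch of the two kinds of comparison, not taken from the Simply Statistics post. It assumes a yes/no measurement, an exact binomial test for the first camp, and a Beta prior built from earlier counts for the second. The data, the prior counts, and the function names are all hypothetical.

```python
from math import comb

def binomial_two_sided_p(successes: int, trials: int, null_rate: float = 0.5) -> float:
    """Exact two-sided p-value: the total probability, under the null,
    of every outcome no more likely than the one observed."""
    pmf = [comb(trials, k) * null_rate**k * (1 - null_rate) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    return sum(p for p in pmf if p <= observed * (1 + 1e-9))

def posterior_mean(successes: int, trials: int,
                   prior_successes: float, prior_failures: float) -> float:
    """Mean of the Beta posterior after updating a
    Beta(prior_successes, prior_failures) prior with the new counts."""
    return (prior_successes + successes) / (prior_successes + prior_failures + trials)

# Hypothetical new data: 14 "yes" results out of 20 trials.
successes, trials = 14, 20

# Camp one: compare the data with a null hypothesis of a 50% rate.
print(f"p against the null: {binomial_two_sided_p(successes, trials):.4f}")

# Camp two: fold in previous measurements (hypothetically 50 yes, 50 no).
print(f"posterior mean rate: {posterior_mean(successes, trials, 50, 50):.4f}")
```

The first number says how surprising the new data would be if the null were true; the second is an updated estimate of the rate that leans on what was measured before. Change the prior counts and the second answer moves, which is exactly where the educated judgment comes in.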