Response to ‘When biostatistics is a neo‐inductionist barrier to science’

We can understand some of the sentiments expressed by Dr Ashton and agree that the classical statistical methods considered in the early perspectives in the series are indeed inappropriate for some laboratory experiments. Our defence is that we addressed first the topics we believed were familiar to our readers: methods that are still widely used, and misused, by scientists. Later perspectives introduce more relevant material. We do not carry a particular torch for Fisher and his post-mortem, but we quoted him to make a specific point: one can usually tell, in retrospect, why the desired result cannot be ‘obtained’ from the data, and this is often because the experiment has been badly designed. With good (or appropriate) design, as with beauty, appreciation is in the eye of the beholder. In statistics, as in art and science, opinions differ, often widely (Fisher certainly disagreed with many other statisticians). Most statisticians can recount horror stories of being presented with data from experiments that could have been resuscitated if therapy had been administered sooner. Far from being purveyors of confidence born of hindsight, practising statisticians have experience of past errors and triumphs and can see problems looming before they happen, just as good scientists or doctors can, most of the time. They can also see if a proposed study is futile, particularly if they are also versed in the science.

However, these perspectives (Drummond and Vowler, 2011a,b, 2012; Drummond and Tom, 2011a,b, 2012) have not been written by statisticians to attack scientists. One of the authors is an amateur physiologist and an even more amateur statistician, but has picked up his pen because, even to him, it is evident that many scientists misuse statistics. We argue, in a later perspective, that ‘the vast majority of lab studies are not based on random samples from populations’ (Drummond and Vowler, 2012), and we introduce more recent and appropriate means of analysis for such studies. Old methods and habits die hard: the t-test may not be the best choice. The same author, an anaesthetist, found a physiology group using gallamine as a muscle relaxant, despite its evident disadvantages for the study in progress, ‘because that’s what we always use’. He did not ask about the statistics but suspected that he would have got the same answer. Seeking expert advice does no harm. Indeed, an expert collaboration between a statistician and a scientist is a ‘dream team’ if it is based on trust and respect (Berry, 2012).

We doubt that analogy is the best form of argument, whether it is the failed experiment or the engineering analogy. Engineers are schooled in the principles involved in putting together a structure. The principle ‘simplify and add lightness’, attributed to Colin Chapman, the famous designer of Lotus sports cars, indicated his strong grasp of theory. Both a ruined project and a crashed car suggest poorly applied principles, or perhaps pushing the limits where the principles no longer hold. Some said that Lotus cars should be able to stick to the ceiling if they got up to speed! The engineer and the scientist are trained to think in terms of the relevant principles and the associated physical laws. Not many engineers, as far as we understand, build aircraft that do not stay in the sky. If they did, we suspect that they would not ask a statistician why.
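To make the point about alternatives to the t-test concrete, the sketch below shows a two-sample permutation (re-randomization) test, an analysis that does not rely on the assumption of random sampling from a population. It is purely illustrative: the data are invented and the example is ours, not drawn from the perspectives themselves.

```python
# Illustrative sketch only: a two-sample permutation test as an
# alternative to the default t-test when the groups are not random
# samples from a population. The measurements below are invented.
import numpy as np

rng = np.random.default_rng(1)

control = np.array([4.1, 5.0, 3.8, 4.6, 5.2])  # hypothetical control values
treated = np.array([5.4, 6.1, 4.9, 5.8, 6.3])  # hypothetical treated values

observed = treated.mean() - control.mean()
pooled = np.concatenate([control, treated])

n_perm = 10_000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)                         # re-assign group labels at random
    diff = pooled[len(control):].mean() - pooled[:len(control)].mean()
    if abs(diff) >= abs(observed):
        count += 1

p_value = (count + 1) / (n_perm + 1)            # add-one correction avoids p = 0
print(f"observed difference = {observed:.2f}, permutation p = {p_value:.4f}")
```

The p value here is simply the proportion of random label re-assignments that produce a difference at least as large as the one observed, so the inference refers to the randomization actually performed rather than to an imagined parent population.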
Both theory and experiment are needed: the almost entirely theoretical airfoil developed from British mathematical theory proved substantially inferior to the practically tested airfoils used in German aircraft during the First World War (Anderson, 1997). British pilots had to be substantially more skilful to stay in the air. Contrary to Dr Ashton’s suggestions, most statisticians are firmly rooted in the real world and know that data are often dirty, messy and unreliable. Noise, iteration, error and unpredictability are the statistician’s stock in trade, far removed from the ‘inflexible inductionist schema’ they are accused of adopting. How else would modern statisticians be using expressions such as Monte Carlo, bootstrap, the coupon collector’s test and censoring? Modern statistical theory is more than equal to dealing with modern problems, and the agenda is set by the problems presented.

Medawar objected to the moulding of a paper to the ‘story’ of a standard hypothesis-experiment-conclusion sequence. Few of us would care to plough through an account that says ‘first we had this idea, and that didn’t work, so we tried that, and that sort of worked, but we didn’t take enough samples, so we did a bit more. …’ The account is sanitized to ‘we hypothesized this and did that’, and the conclusion is then often that ‘these results would be highly unlikely if our samples had been taken from the same population’. This is often a gross simplification, not necessarily truthful, and sometimes misleading, but it is a formula that most people have so far agreed to accept, although ‘effect’ is creeping in as a useful concept. We agree that, when bolstered by old-fashioned and perhaps inappropriate statistical analysis, the classic paper structure becomes even more of a travesty. It can be even more galling if these are truly the views of funding agencies. Our experience is otherwise: funding agencies are usually pragmatic and often well informed in their statistical judgement, perhaps more so than most journals.

To quote a very recent comment by a statistician: ‘A reliance on basic biostatistical methods by many scientific researchers translates into substandard approaches to data analysis still appearing in many published studies. A tendency for researchers to then reapply these methods to their own data without first checking their validity helps perpetuate the errors’ (Woodman, 2012). In summary, statisticians are on our side, or, better still, statisticians and scientists are on the same side; we could all do with improving, and collaboration will help.