I Think You'll Find It's a Bit More Complicated Than That

The Strange Case of the Magnetic Wine

Guardian, 4 December 2003

What is it about magnets that amazes the pseudoscientists so much? The good magnetic energy of my Magneto-Tex blanket will cure my back pain; but I need a Q-Link pendant to protect me from the bad magnetism created by household devices. Reader Bill Bingham (oddly enough, the guy who used to read the Shipping Forecast) sends in news of the exciting new Wine Magnet: ‘Let your wine “age” several years in only 45 minutes! Place the bottle in the Wine Magnet! The Wine Magnet then creates a strong magnetic field that goes to the heart of your wine and naturally softens the bitter taste of tannins in “young” wines.’

I was previously unaware of the magnetic properties of wine, but this explains why I tend to become aligned with the earth’s magnetic field after drinking more than two bottles. The general theory on wine maturation – and it warms the cockles of my heart to know there are people out there studying this – is that it’s all about the polymerisation of tannins, which could conceivably be accelerated if they were all concentrated in local pockets: although surely not in forty-five minutes.

But this exciting new technology seems to be so potent – or perhaps unpatentable – that it is being flogged by at least half a dozen different companies. Cellarnot, marketing the almost identical ‘Perfect Sommelier’, even has personal testimonies from ‘Susan’ who works for the Pentagon, ‘Maggie, Editor, Vogue’, and a science professor, who did not want to be named but who, after giving a few glasses to some friends, exclaimed, ‘The experiment definitely showed that the TPS is everything that it claims to be.’ He’s no philosopher of science. But perhaps all of these magnetic products will turn out to be interchangeable. Maybe I can even save myself some cash, and wear my MagneForce magnetic insoles (‘increases circulation; reduces foot, leg and back fatigue’) to improve the wine after I’ve drunk it.

And most strangely of all, none of these companies seems to be boasting about having done the one simple study necessary to test their wine magnets. As always, if any of them want advice on how to do the stats on a simple double-blind randomised trial (which could, after all, be done pretty robustly in one evening with fifty people) – and if they can’t find a seventeen-year-old science student to hold their hand – I am at their disposal.

What Is Science? First, Magnetise Your Wine …

Guardian, 3 December 2005

People often ask me (pulls pensively on pipe), ‘What is science?’ And I reply thusly: Science is exactly what we do in this column. We take a claim, and we pull it apart to extract a clear scientific hypothesis, like ‘Homeopathy makes people better faster than placebo,’ or ‘The Chemsol lab correctly identifies MRSA’; then we examine the experimental evidence for that hypothesis; and lastly, if there is no evidence, we devise new experiments. Science.

Back in December 2003, as part of our Bad Science Christmas Gift series, we discovered The Perfect Sommelier, an expensive wine-conditioning device available in all good department stores. In fact there are lots of devices like this for sale, including the ubiquitous Wine Magnet: ‘Let your wine “age” several years in only 45 minutes! Place the bottle in the Wine Magnet! The Wine Magnet then creates a strong magnetic field that goes to the heart of your wine and naturally softens the bitter taste of tannins in “young” wines.’

At the time, I mentioned how easy it would be to devise an experiment to test whether people could tell the difference between magnetised and untreated wine. I also noted how strange it was that none of these devices’ manufacturers seemed to have bothered, since it could be done in an evening with fifty people.

Now Dr James Rubin et al. of the Mobile Phones Research Unit at King’s College London have published that very study, in the esteemed Journal of Wine Research. They note the dearth of experimental research (quoting, chuffingly, the Bad Science column), and go on: ‘One retailer states, “We challenge you to try it yourself – you won’t believe the difference it can make.”’

Unwise words.

‘A review of Medline, PsychInfo, Cinahl, Embase, Amed and the Web of Science using the search term “wine and magnet” suggested that, as yet, no scientists have taken up this challenge.’

Now, this study was an extremely professional operation. Before starting, they did a power calculation: this is to decide how big your sample size needs to be, to be reasonably sure you don’t miss a true positive finding by not having enough subjects to detect a small difference. Since the manufacturers’ claims are dramatic, this came out at only fifty subjects.
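For the statistically curious, here is a rough sketch in Python of the kind of power calculation described above, for a simple taste-preference trial. The assumed ‘dramatic’ effect – 75 per cent of tasters picking out the treated wine – is an illustrative guess of my own, not a figure taken from the published study.

```python
# A minimal sketch of a power calculation for a preference trial: how many
# tasters do we need to reliably detect a preference rate p1 against the
# 50/50 split expected by chance? The target rate of 0.75 is a hypothetical
# stand-in for a "dramatic" effect, not a number from the real study.
from scipy.stats import norm

def sample_size_for_proportion(p1, p0=0.5, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-sided test of a
    preference rate p1 against the chance rate p0."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    numerator = z_alpha * (p0 * (1 - p0)) ** 0.5 + z_beta * (p1 * (1 - p1)) ** 0.5
    return (numerator / (p1 - p0)) ** 2

# If 75% of tasters really could spot the magnetised wine, roughly 29
# subjects would be enough; a sample of fifty gives comfortable headroom.
print(round(sample_size_for_proportion(0.75)))
```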

Then they recruited their subjects, using wine. This wine had been magnetised, or not, by a third party, and the experimenters were blind to which wine was which. The subjects were also unaware of whether the wine they were tasting, which cost £2.99 a bottle, was magnetised or not. They received wine A or wine B, and it was a ‘crossover design’ – some people got wine A first, and some people got wine B first, in case the order you got them in affected your palate and your preferences.

There was no statistically significant difference in whether people expressed a preference for the magnetised wine or the non-magnetised wine. To translate back to the language of commercial claims: people couldn’t tell the difference between magnetised and non-magnetised wine. I realise that might not come as a huge surprise to you. But the real action is in the conclusions: ‘Practitioners of unconventional interventions often cite cost as a reason for not carrying out rigorous assessments of the effectiveness of their products. This double-blind randomised cross-over trial cost under £70 to conduct and took one week to design, run and analyse. Its simplicity is shown by the fact that it was run by two sixteen-year-old work experience students (EA and RI).’
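The analysis itself is no more demanding than the design. A minimal sketch, using invented preference counts purely for illustration (these are not the trial’s actual data):

```python
# A two-sided binomial test of how many tasters preferred the "treated"
# wine, against the 50/50 split expected by chance. The counts below are
# made up to illustrate the method; they are not the published figures.
from scipy.stats import binomtest

preferred_magnetised = 26   # hypothetical count of tasters preferring the magnetised wine
n_tasters = 50

result = binomtest(preferred_magnetised, n_tasters, p=0.5, alternative="two-sided")
print(f"p-value = {result.pvalue:.2f}")  # a split this close to 25/25 is nowhere near significance
```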

‘Unfortunately,’ they continue, ‘our research leaves us no nearer to an understanding of how to improve the quality of cheap wine and more research into this area is now called for as a matter of urgency.’

BAD ACADEMIA

What If Academics Were as Dumb as Quacks with Statistics?

Guardian, 10 September 2011

We all like to laugh at quacks when they misuse basic statistics. But what if academics, en masse, make mistakes that are equally foolish?

This week Sander Nieuwenhuis and colleagues publish a mighty torpedo in the journal Nature Neuroscience. They’ve identified one direct, stark statistical error that is so widespread it appears in about half of all the published papers surveyed from the academic neuroscience research literature.

To understand the scale of this problem, first we have to understand the statistical error they’ve identified. This will take four hundred words. At the end, you will understand an important aspect of statistics better than half the professional university academics currently publishing in the field of neuroscience.

Let’s say you’re working on some nerve cells, measuring the frequency with which they fire. When you drop a particular chemical on them, they seem to fire more slowly. You’ve got some normal mice, and some mutant mice. You want to see if their cells are differently affected by the chemical. So you measure the firing rate before and after applying the chemical, both in the mutant mice, and in the normal mice.

When you drop the chemical on the mutant-mice nerve cells, their firing rate drops by, let’s say, 30 per cent. With the number of mice you have (in your imaginary experiment), this difference is statistically significant, which means it is unlikely to be due to chance. That’s a useful finding which you can maybe publish. When you drop the chemical on the normal-mice nerve cells, there is a bit of a drop in firing rate, but not as much – let’s say the drop is 15 per cent – and this smaller drop doesn’t reach statistical significance.

But here is the catch. You can say that there is a statistically significant effect for your chemical reducing the firing rate in the mutant cells. And you can say there is no such statistically significant effect in the normal cells. But you cannot say that mutant cells and normal cells respond to the chemical differently. To say that, you would have to do a third statistical test, specifically comparing the ‘difference in differences’, the difference between the chemical-induced change in firing rate for the normal cells against the chemical-induced change in the mutant cells.

Now, looking at the figures I’ve given you here (entirely made up, for our made-up experiment), it’s very likely that this ‘difference in differences’ would not be statistically significant, because the responses to the chemical only differ from each other by 15 per cent, and we saw earlier that a drop of 15 per cent on its own wasn’t enough to achieve statistical significance.
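To make the distinction concrete, here is a short simulation in Python of the imaginary experiment above. The group sizes and the spread of the measurements are assumptions, chosen only to mirror the made-up figures in the text, not data from any real study.

```python
# Simulated per-cell changes in firing rate after the chemical: roughly a
# 30% drop in mutant cells and a 15% drop in normal cells, as in the
# imaginary experiment. All numbers here are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 12  # cells per group (hypothetical)

# Negative values mean the cells fire more slowly after the chemical.
mutant_change = rng.normal(loc=-0.30, scale=0.25, size=n)
normal_change = rng.normal(loc=-0.15, scale=0.25, size=n)

# The tempting but wrong route: test each group against zero separately...
p_mutant = stats.ttest_1samp(mutant_change, 0).pvalue
p_normal = stats.ttest_1samp(normal_change, 0).pvalue

# ...and the correct route: directly compare the two changes with each other.
p_difference_in_differences = stats.ttest_ind(mutant_change, normal_change).pvalue

print(f"mutant change vs zero:   p = {p_mutant:.3f}")
print(f"normal change vs zero:   p = {p_normal:.3f}")
print(f"mutant vs normal change: p = {p_difference_in_differences:.3f}")
# One group can come out 'significant' and the other not, while the direct
# comparison between the two groups shows no significant difference at all.
```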

But in exactly this situation, academics in neuroscience papers are routinely claiming that they have found a difference in response, in every field imaginable, with all kinds of stimuli and interventions: comparing responses in younger versus older participants; in patients against normal volunteers; in one task against another; between different brain areas; and so on.

How often? Nieuwenhuis and colleagues looked at 513 papers published in five prestigious neuroscience journals over two years. In half the 157 studies where this error could have been made, it was made. They broadened their search to 120 cellular and molecular articles in Nature Neuroscience during 2009 and 2010: they found twenty-five studies committing this statistical fallacy, and not one single paper analysed differences in effect sizes correctly.

These errors are appearing throughout the most prestigious journals in the field of neuroscience. How can we explain that? Analysing data correctly, to identify a ‘difference in differences’, is a little tricksy, so thinking very generously, we might suggest that researchers worry it’s too long-winded for a paper, or too difficult for readers. Alternatively, perhaps less generously, we might decide it’s too tricky for the researchers themselves.

But the darkest thought of all is this: analysing a ‘difference in differences’ properly is much less likely to give you a statistically significant result, and so it’s much less likely to produce the kind of positive finding you need to get your study published, to get a point on your CV, to get claps at conferences, and to get a good feeling in your belly. In all seriousness: I hope this error is only being driven by incompetence.

Brain-Imaging Studies Report More Positive Findings Than Their Numbers Can Support. This Is Fishy

Guardian, 13 August 2011

While the authorities are distracted by mass disorder, we can do some statistics. You’ll have seen plenty of news stories telling you that one part of the brain is bigger, or smaller, in people with a particular mental health problem, or even a specific job. These are generally based on real, published scientific research. But how reliable are the studies?

One way of critiquing a piece of research is to read the academic paper itself, in detail, looking for flaws. But that might not be enough, if some sources of bias might exist outside the paper, in the wider system of science.

By now you’ll be familiar with publication bias: the phenomenon whereby studies with boring negative results are less likely to get written up, and less likely to get published. Normally you can estimate this using a tool such as, say, a funnel plot. The principle behind these is simple: big, expensive landmark studies are harder to brush under the carpet, but small studies can disappear more easily. So essentially you split your studies into ‘big ones’ and ‘small ones’: if the small studies, averaged out together, give a more positive result than the big studies, then maybe some small negative studies have gone missing in action.
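In code, a crude version of that logic looks something like this. The list of studies is invented for illustration, and real funnel-plot asymmetry tests (Egger’s regression, for instance) are rather more formal than this split-and-compare sketch:

```python
# Split hypothetical studies into "big" and "small" by sample size and
# compare their average reported effects. The study list is made up; if the
# small studies look systematically rosier than the big ones, some small
# negative studies may have gone missing in action.
import statistics

# (sample size, reported effect size) for some invented studies
studies = [(30, 0.55), (45, 0.48), (60, 0.40), (150, 0.22), (300, 0.18), (500, 0.20)]

median_n = statistics.median(n for n, _ in studies)
small = [effect for n, effect in studies if n < median_n]
big = [effect for n, effect in studies if n >= median_n]

print(f"mean effect in small studies: {statistics.mean(small):.2f}")
print(f"mean effect in big studies:   {statistics.mean(big):.2f}")
```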

Sadly, this doesn’t work for brain-scan studies, because there’s not enough variation in size. So Professor John Ioannidis, a godlike figure in the field of ‘research about research’, took a different approach. He collected a large representative sample of these anatomical studies, counted up how many positive results they got, and how positive those results were, and then compared this to how many similarly positive results you could plausibly have expected to detect, simply from the sizes of the studies.

This can be derived from something called the ‘power calculation’. Everyone knows that bigger is better when collecting data for a piece of research: the more you have, the greater your ability to detect a modest effect. What people often miss is that the size of the sample needed also changes with the size of the effect you’re trying to detect: detecting a true 0.2 per cent difference in the size of the hippocampus between two groups, say, would need more subjects than a study aiming to detect a huge 25 per cent difference.
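A rough sketch of that trade-off, assuming – purely for illustration – a between-subject spread of about 10 per cent of mean volume, so that the percentage differences above can be turned into standardised effect sizes:

```python
# How required sample size scales with the effect you hope to detect.
# The assumed spread (SD of ~10% of mean volume) is an illustrative
# assumption used to convert percentage differences into Cohen's d.
from statsmodels.stats.power import TTestIndPower

assumed_sd = 10.0  # between-subject SD, as a percentage of mean volume (assumption)
power_calc = TTestIndPower()

for diff_pct in (0.2, 25.0):
    d = diff_pct / assumed_sd  # standardised effect size
    n_per_group = power_calc.solve_power(effect_size=d, alpha=0.05, power=0.8)
    print(f"{diff_pct:>5}% difference -> about {n_per_group:.0f} subjects per group")
# A tiny difference needs tens of thousands of subjects; a huge one needs only a handful.
```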
