I Think You'll Find It's a Bit More Complicated Than That

I’m sorry to see that academics are not blameless in this dreary situation.

Voices of the Ancients

Guardian, 16 January 2010

Every now and then you have to salute a genius. Both the Daily Mail and the Metro report new research analysing the positions of Britain’s ancient sites, and the results are startling: primitive man had his own form of ‘sat nav’. Researcher Tom Brooks analysed 1,500 prehistoric monuments, and found them all to be on a grid of isosceles triangles, each pointing to the next site, allowing our ancestors to travel between settlements with pinpoint accuracy. The papers even carried an example of his map work, which I have reproduced here.

That this pattern could occur simply because one site was on the way to the next was not considered. Mr Brooks has proven, he explains, that there were keen mathematicians here 5,000 years ago, millennia before the Greeks invented geometry: ‘Such is the mathematical precision, it is inconceivable that this work could have been carried out by the primitive indigenous culture we have always associated with such structures … all this suggests a culture existing in these islands in the past quite outside our expectation and experience today.’ He does not rule out extraterrestrial help.

In the Metro Tom Brooks is a researcher. To the Daily Mail he is a researcher, a historian and a writer. I hope it’s not rude or unfair for me to add ‘retired marketing executive of Honiton, Devon’.

Matt Parker, his nemesis, is based in the School of Mathematical Sciences at Queen Mary, University of London. He has applied the same techniques used by Mr Brooks to another mysterious and lost civilisation.

‘We know so little about the ancient Woolworths stores,’ he explains, ‘but we do still know their locations. I thought that if we analysed the sites we could learn more about what life was like in 2008 and how these people went about buying cheap kitchen accessories and discount CDs.’

The results revealed an exact and precise geometric placement of the Woolworths locations. ‘Three stores around Birmingham formed an exact equilateral triangle (Wolverhampton, Lichfield and Birmingham stores), and if the base of the triangle is extended, it forms a 173.8-mile line linking the Conwy and Luton stores. Despite the 173.8-mile distance involved, the Conwy Woolworths store is only forty feet off the exact line and the Luton site is within thirty feet. All four stores align with an accuracy of 0.05 per cent.’

Matt Parker used an ancient technique: he found his patterns in eight hundred ex-Woolworths locations by ‘skipping over the vast majority, and only choosing the few that happen to line up’.

With 1,500 locations, Mr Brooks had almost twice as much data to work with, and on this issue Parker is clear: ‘It is extremely important to look at how much data people are using to support an argument. For example, the case for global warming covers vast amounts of comprehensive evidence, but it is still possible for people to search through the data and find a few isolated examples that appear to show otherwise.’
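Parker's trick is easy to try for yourself. The sketch below is my own illustration, not his method or his data: it scatters 100 purely random points over a square (the seed, the point count and the function name `near_collinear_triples` are all assumptions made for the example), then counts how many sets of three happen to line up to within 0.05 per cent. Nobody placed these points on purpose, yet alignments appear simply because there are so many triples to choose from.

```python
import itertools
import random


def near_collinear_triples(points, tolerance):
    """Return triples whose three points lie almost on one straight line.

    A triple counts as a hit when the height of the triangle it forms,
    measured from its longest side, is less than `tolerance` times the
    length of that longest side.
    """
    hits = []
    for a, b, c in itertools.combinations(points, 3):
        # Twice the triangle's area, from the 2D cross product.
        twice_area = abs((b[0] - a[0]) * (c[1] - a[1])
                         - (b[1] - a[1]) * (c[0] - a[0]))
        longest_sq = max(
            (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2,
            (c[0] - a[0]) ** 2 + (c[1] - a[1]) ** 2,
            (c[0] - b[0]) ** 2 + (c[1] - b[1]) ** 2,
        )
        # height / longest side == twice_area / longest side squared
        if longest_sq > 0 and twice_area / longest_sq < tolerance:
            hits.append((a, b, c))
    return hits


# Hypothetical 'sites': random points with no design behind them at all.
rng = random.Random(2008)
sites = [(rng.random(), rng.random()) for _ in range(100)]

hits = near_collinear_triples(sites, tolerance=0.0005)
total = 100 * 99 * 98 // 6  # number of ways to pick 3 sites from 100
print(f"{len(hits)} of {total} triples line up to within 0.05 per cent")
```

The number of triples grows roughly with the cube of the number of sites, so 800 shops or 1,500 monuments offer vastly more chances for a coincidence than the 100 points used here.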

BIG DATA

There’s Something Magical About Watching Patterns Emerge from Data

Guardian, 11 June 2011

We all know that one atom of experience isn’t enough to spot a pattern: but when you put lots of experiences together and process that data, you get new knowledge. This might sound obvious, but following it through – watching patterns emerge from the noise – still gives me a sense of beauty and awe.

A paper in the British Medical Journal this week is a perfect example. Medicine is an imperfect art, so it’s inevitable that healthcare workers will make some suboptimal decisions: not so much the dramatic stuff – injecting people with the wrong drug – but more the marginal decisions, at the edges of the tweaks in a patient’s journey, affecting outcomes in ways that are harder to predict.

These kinds of complex decisions will inevitably be affected by context, and one example of that context is the franticness of A&E. Waiting times are a problem in a lot of countries. In the UK we introduced a four-hour ceiling as our target, and most hospitals met it. Abolishing that four-hour target was one of the coalition government’s first NHS reforms. But do waiting times matter?

Some researchers in Canada decided to find out. They gathered data from all the people who visited any A&E department in Ontario over a five-year period: this gave them data on a dizzying 22 million visits. Of these, 14 million resulted in the patient being seen and then sent home. Then they followed these patients up to see what happened, and specifically, to see if they died.

They also had another piece of information: for each patient they knew, from internal hospital data, what the average waiting time in A&E was at the time they arrived. This means that they were able to compare the odds of death for patients discharged when the average wait in A&E was less than four hours (or more) against the odds of death for patients discharged when the wait was less than one hour. Remember, this isn’t the time that individual patient waited, it’s the average wait in the department, as a proxy for how frantic things were.

The results were as you might fear. For patients sent home who attended an A&E department when the average wait there was more than six hours, the odds of death were almost twice those of patients sent home when the wait was less than one hour. This odds ratio was similar for patients measured as high or low urgency at triage, so it’s true for patients with both serious and less serious presentations.
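For anyone unsure what an odds ratio is, here is a minimal worked sketch. The counts are invented purely for illustration and are not taken from the Ontario study; the function name `odds_ratio` is likewise my own. The odds of an event are the number of people it happened to divided by the number it did not happen to, and the odds ratio compares those odds between two groups.

```python
def odds_ratio(events_a, non_events_a, events_b, non_events_b):
    """Odds of the event in group A divided by the odds in group B."""
    odds_a = events_a / non_events_a
    odds_b = events_b / non_events_b
    return odds_a / odds_b


# Hypothetical figures, NOT from the study: deaths among 10,000 patients
# sent home during long average waits, versus 10,000 sent home during
# short average waits.
long_wait_deaths, long_wait_survivors = 18, 9_982
short_wait_deaths, short_wait_survivors = 10, 9_990

ratio = odds_ratio(long_wait_deaths, long_wait_survivors,
                   short_wait_deaths, short_wait_survivors)
print(f"Odds ratio: {ratio:.2f}")
```

When the outcome is as rare as it is here, the odds ratio is very close to the plain ratio of the two death rates, which is why 'almost twice the odds' can be read as 'almost twice the risk'.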

Even more starkly, there’s a very clear trend in the data, where each step up in waiting time results in a higher risk of death. This becomes statistically significant when average waits reach just three hours. For those who care about saving money, the odds of being admitted – and so taking up an expensive hospital bed – also rose dramatically as average wait time increased.

However important you might find those specific results, the methodological issues are much more interesting, and they all arise because of the big numbers involved. We would never have discovered any of this without huge numbers of patients’ records, because the outcomes involved are rare: you only see a handful of deaths out of every 10,000 people sent home from A&E.

What’s more, because they had so many patients’ data, the researchers were able to see an effect even within hospitals, over time: so it wasn’t just that crap hospitals overall had longer waits, and higher death rates. And, amazingly, they didn’t lose a single patient during follow-up: the death – or otherwise – of every single patient who was sent home from A&E could be tracked through their notes.

No individual patient or doctor could possibly have shown with any certainty, from their own personal experience of any one adverse outcome, that long waiting times in A&E are dangerous. This study is a remarkable testament to the power of good-quality computerised health records, and the kinds of new knowledge you can generate from interrogating them. It’s also, I’ll agree, a pretty frightening result.

Give Us the Data

Guardian, 7 October 2011

Bad things happen when problems are protected by a force field of tediousness. Here is an example. Data is the fabric of the modern world: just as we walk down pavements, so we trace routes through data, and build knowledge and products out of it. The government has lots of data that has already been collected, because it was needed to run the country properly: simple stuff like maps, postcode areas, land ownership, procurement data, endless weather readings, and so on.

Right now a fight is happening in Whitehall between two factions in government: one group thinks we should give this data away for free, as a matter of principle, because it will make good things happen; the other thinks we should restrict access, and sell it. A consultation is under way. Despite a positive ministerial introduction, each of the three options it gives for releasing data is foolishly restrictive. Here’s why that’s a problem.

As things stand, much everyday government data is locked down so hard that nerds are forbidden to repurpose it. You could have a map of who owns what in your town, on your screen, at a click. You could find out what company boards someone sits on, and map their relationships and overlaps with all the other directors in the country. You could download transcripts of court proceedings that affect you. All this is blocked by the government’s restrictive data policies.

There are areas where access has been won by the shame of a simple moral argument. Hansard is a record of everything that happens in Parliament. TheyWorkForYou.com is a repurposing of that data which adds huge value, not just by being more usable than Hansard, but by identifying patterns in MPs’ voting behaviour. When it first came out, Hansard argued – embarrassingly – that this was an illegal breach of copyright.

But there are also straight commercial applications. If you’re making services or things that you sell to government, then seeing what they use and need helps you sell them stuff. That data is even internally useful: if you can see what everyone else is paying for toilet paper, you might get a better deal for your own department.

All this data has to be created, regardless of whether or not it gets sold, simply in order to run the country. You could ‘sweat the asset’, and charge money for access; but if you release it for free, at barely any cost to yourself, without fiddliness, in its raw form, the benefits are potentially huge.

This becomes especially clear when you notice how the restrictions extend beyond specific realms of data, and into the kind of core structural information that is needed as a civic skeleton for simple, everyday activity. The Royal Mail still owns all our postcode information, and you can’t get the house-number boundaries of each specific postcode without paying. All the most interesting data projects involve linking one dataset with another, and for addresses, that often means using postcodes, as a commonly used structural spine (I’m willing to bet that you don’t know your house’s latitude and longitude). This kind of framework data is the pavement of data space, and if you’re not allowed to use it, projects go unmade.
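To make 'linking one dataset with another' concrete, here is a small sketch. Everything in it is hypothetical: the records, the made-up postcodes, the coordinates and the function names `normalise` and `link_by_postcode` are invented for the example. It shows the postcode doing its job as a shared key, so that a list of places which only knows addresses can borrow map coordinates from a separate lookup table.

```python
# Hypothetical dataset 1: places that record a postcode but no coordinates.
surgeries = [
    {"name": "Example Surgery", "postcode": "zz1 1aa"},
    {"name": "Sample Clinic", "postcode": "ZZ2 2BB"},
]

# Hypothetical dataset 2: a postcode-to-(latitude, longitude) lookup.
# This is the kind of framework data the article says is locked away.
postcode_locations = {
    "ZZ1 1AA": (50.80, -3.19),
}


def normalise(postcode):
    """Strip spaces and upper-case, so 'zz1 1aa' matches 'ZZ1 1AA'."""
    return "".join(postcode.split()).upper()


def link_by_postcode(records, lookup):
    """Attach a location to each record, or None where the lookup has a gap."""
    index = {normalise(code): place for code, place in lookup.items()}
    linked = []
    for record in records:
        location = index.get(normalise(record["postcode"]))
        linked.append({**record, "location": location})
    return linked


linked = link_by_postcode(surgeries, postcode_locations)
for row in linked:
    print(row["name"], row["location"])
```

The second record comes back with no location, which is the practical meaning of a project going unmade: without the lookup table, the join has nothing to join to.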

The economic loss is almost impossible to measure: if any of the projects I’ve already described sound trivial to you, remember that this is a crippled field, where innovators have barely had a chance to get their eyes in. Amazing things happen when you pull individual pieces of information together into larger linked datasets: meaning emerges, as you produce facts from figures. If you’ve ever wished you were born in the nineteenth century, when there were so many obvious inventions and ideas to hook for yourself, then I seriously recommend you become a coder, because future nerds will look back on this time with the exact same envy. But that leap forward will be tediously retarded if we don’t make the government allow us to use the pavements.

Care.data Can Save Lives: But Not If We Bungle It

Guardian, 21 February 2014

Everything would be much simpler if science really was ‘just another kind of religion’. But medical knowledge doesn’t appear out of nowhere, and there is no ancient text to guide us. Instead, we learn how to save lives by studying huge datasets on the medical histories of millions of people. This information helps us identify the causes of cancer and heart disease; it helps us spot side effects from beneficial treatments, and switch patients to the safest drugs; it helps us spot failing hospitals, or rubbish surgeons; and it helps us spot the areas of greatest need in the NHS. Numbers in medicine are not an abstract academic game: they are made of flesh and blood, and they show us how to prevent unnecessary pain, suffering and death.
