Friday, 2 November 2012

Nate Silver and the war against statistical truth


Amongst all the noise generated by the US Presidential election (and we still don't know who's going to win) is a very, very worrying drumbeat that's pumping away just under the surface.

It's a war, waged by ultra-conservative commentators. And it's a fight against numbers - even (if you'll forgive the rather embarrassing show of emotion) against the truth.

Their nastiest bile - and some of it has been really nasty and disgusting stuff - is aimed at Nate Silver, hero of baseball and poker predictors everywhere, and the author of the New York Times' 538 blog. He's a man who's made his name applying statistical models of the data to predict outcomes - results that he's usually got right. Now he's being accused of arrogantly 'assuming' what will happen in the end, being privy to secret data from the Obama camp, weighting pollsters' results how he likes, and much other abusive and slanderous rubbish besides. It's all part of a wider picture, of course - that elements of conservative America are spinning off into their own 'post-truth' politics, in which they can question anything they like (so long as it's not their own preconceptions). A world in which they try to undermine voting machines' reputations before ballots can even be cast - though of course Democrats have long had their own conspiracy theories about the voting machines that were used in the 2004 elections.

This stuff matters. You have to understand numbers and concepts of scale to understand the world around you. Of course numbers are constructed, shifting, uncertain - as this column has argued and accepted on many occasions. But that doesn't mean that they're all as good as one another. Still less does it mean that you don't have to be able to put them in context, mobilising actual knowledge (the outbreak of italics is a metaphorical jab in the chest with a pointed finger, I know - but hey, I'm angry).

So, for instance... Republicans have been arguing that the fact that they're ahead in the polls with 'independent' voters, who aren't registered as Democrats or Republicans, means that they're going to do well next week. Which they might. But it doesn't necessarily mean this - because many conservatively-minded voters may now be listing themselves as 'Independents' as 'their' party zig-zags ever more crazily to the far right. Take another example. Republicans have been arguing that they are doing well in Ohio, based on the fact that early turnout in their counties has been very good. But get right down to the ward level, and things look rather different, with Democrats turning out well. It may well be that right-wing journalists have not dug deeply enough into the grainy mess of the raw numbers here.

And so on. And on.

There's one very telling set of details buried in all this - and it's that Silver uses the methods of social science, not the impressions of pundits or commentators. He is clear that he is not predicting that Obama will actually win: he's just saying that it's pretty likely, on the basis of the numbers that we now have, and compared to the alternative. And he's not saying that this won't be close. He's simply saying that, while close, the balance of evidence for predicting victory is (delicately but definitely) on one side and not the other. More moderate conservative voices accept all this, of course, while pointing out that their real problem is with the 'utopian' idea that models can predict the unfolding of the future.

That cod-philosophy to one side, it's important to note that Silver publishes his methods - as do other reputable number-crunchers trying to turn leads or deficits into predictions. He draws on accepted mathematical proofs. He's not alone, with other prediction sites and (something I put a lot of weight on) gambling markets also giving President Obama somewhere between a two-thirds and a 96 per cent chance of victory. He accepts that the eventual result on Tuesday night might be different from what his numbers suggest. They're suggesting a likelihood, and a model giving Obama an 80 per cent chance of winning doesn't mean he will. 20 per cent chances come off all the time (well, they come off a fifth of the time). He doesn't change his views (or his model) in mid-stream. He uses data - not the gut feelings that pull us all over the place all the time.
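
For anyone who wants to see that one-in-five point in action, here's a minimal sketch in Python (my choice of language - it has nothing to do with Silver's actual model, which is far more involved). It assumes a made-up favourite with an 80 per cent chance of winning and simply replays the 'election' many times. The function name `upset_rate`, the seed and the number of trials are all invented for illustration.

```python
import random


def upset_rate(p_favourite=0.8, trials=100_000, seed=538):
    """Return the fraction of simulated contests the underdog wins.

    p_favourite is an assumed win probability for the favourite,
    not a figure taken from any real forecasting model.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # The underdog wins whenever the random draw lands outside the
    # favourite's share of the [0, 1) interval.
    upsets = sum(rng.random() >= p_favourite for _ in range(trials))
    return upsets / trials


if __name__ == "__main__":
    rate = upset_rate()
    print(f"The underdog won {rate:.1%} of the simulated elections")
```

Run it and the underdog's share should settle close to a fifth: an upset is not a failure of the model, just the less likely of two outcomes it allowed for.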

You know what? I prefer his approach to that of the people chucking abuse at him. Call me old-fashioned.