Cross-posted at mathbabe.org.
Nate Silver’s high-profile success in predicting the 2012 election has triggered a wave of articles on the victory of data analysts over pundits. Cathy has already taken on the troubling aspects of Silver’s celebrity, so I’d like to focus instead on the larger movement toward big data as a replacement for traditional punditry. It’s an intriguing idea, especially given the sad state of political punditry. But rather than making things better, it’s entirely possible that the methods these articles propose could make things even worse.
There’s no question that we need better media, especially when it comes to politics. If we take the media’s role to be making sure that voters are informed, then they’re clearly doing a poor job of it. And one of the biggest problems is that political coverage has largely abandoned any pretense of getting to the truth in favor of “he said/she said” and endless discussion of the horse race, with the pundits being the worst offenders. Instead of “Will this be good for citizens?” we get “Will this be good for the Democrats/Republicans in the next poll?”
This is where the big data proposals enter the picture, and where I think they go wrong. Rather than addressing the accuracy or usefulness of the information being provided to us as voters, or working to shift the dialogue away from projections of how a given policy will play in Iowa, the proposals for big data revolve around replacing pundits’ subjective claims about shifting perceptions with more objective analysis of shifting perceptions.
For example, this piece from the Awl convincingly describes the potential for the rapid analysis of thousands or even millions of articles as a basis for more effective media criticism, and as a replacement for punditry by “anecdata.” A more recent post from the Nieman Journalism Lab at least acknowledges some methodological weaknesses even as it makes a very strong case for large-scale sentiment analysis as a way of “getting beyond pundits claiming to speak for others.” By aggregating and analyzing the flow of opinion across social media, the piece argues, journalism can deliver a more finely tuned representation of public opinion.
It’s true that perceptions in a democracy matter a lot. But it’s also true that getting a more accurate read on perceptions is not going to move us toward more informative coverage, let alone toward better politics. Worse still, these proposals ignore the fact that public perception is heavily affected by media coverage, which implies that pulling public perception more explicitly into the coverage itself will just introduce reflexivity rather than clarification.
In other words, we could end up with a conversation about the conversation about the conversation about politics. Is that really what we need?
As I see it, there are two precedents here, neither of which is encouraging. The first is financial markets, which have been treated as a source of perfect information for a very long time. The most famous justification for this was Hayek’s claim that the price system inherent in markets acts as “a system of telecommunications” that condenses the most relevant information from millions of agents into a single indicator. Even if we accept that this was true when Hayek wrote his essay in 1945 (which we shouldn’t), it’s certainly not true now. That’s in part because financial markets have attracted more and more speculators who base their decisions on what they expect others to do rather than introducing new information. So rather than informational efficiency, we get informational cascades, herding and periodic crashes.
The other example is consumer markets, which have the most experience with sentiment analysis for obvious reasons. In fact, this analysis is only the latest service offered by an enormous industry of advertising, PR and the like that exists solely to engineer and harness these waves of sentiment and perception. That industry’s success shows that perception doesn’t exist in some objective void, but is closely shaped by the process of thinking about and consuming the very products it’s attached to. Or to be wonky about it, preferences can be more endogenous than exogenous in a consumer society.
Which is ultimately my point. If we want to treat the information provided by the media – the primary source of information for our democracy – as a more and more finely tuned consumer good whose value is determined by how popular it is, then this sort of analysis is emphatically the way to go. But we should not be surprised by the consequences if we do.