Big Data and Development: Upsides, downsides and a lot of questions

One of the more scary but enjoyable things I do is being interviewed on stuff I know absolutely nothing about (yeah, yeah, I know – no change there then). You get to grasshopper around multiple issues and disciplines, cobbling together ideas and arguments from scattered fragments, making connections and learning new stuff. Great fun. This week, I’ll blog about a couple of these BS (blue sky, of course) sessions to give you a flavour.

First up, half an hour discussing ‘big data’ with a friend/researcher who shall be nameless (don’t want to destroy their reputation). Here are some of the points that arose:

First, massive confusion on definition: depending on who you talk to, ‘Big Data’ means scraping massive amounts of existing data from sources like Facebook and Twitter (the UN’s Global Pulse has done great work on this); generating large volumes of new data; computer modelling of the existing data; or using it for particular purposes like transparency and accountability, or targeting humanitarian relief.

Big Data is great when it throws up new questions and correlations, and stimulates thinking and discussion (The Economist had a fascinating piece this week on how Big Data is a natural partner of iterative, experimental approaches to change). But I’m dubious about it providing a short cut to political change, empowerment etc. There are lots of inspiring examples of using data to promote social change, but plenty of caveats and warnings against magic bulletism too, as the recent contributions to this blog from 3 transparency and accountability gurus showed.

How will Big Data evolve? It may follow the path of governance work – starting off with lots of supply (people building crowdsource websites that no-one uses), then, when that doesn’t work, moving on to demand (citizens’ movements demanding data from baffled/incompetent/hostile governments), and ending up looking for combos of the two: hybrid institutions for data that combine old and new systems in new, context-specific ways, getting lots of unusual suspects in a room to find tailored solutions, including data-based ones, to agreed problems.

And what about the downsides? What are the risks of Big Data?

Data tribalism: mass media gives way to tribal media, as everyone splits off into their own online echo chamber, and increasingly has no idea what the rest of the world is thinking.

Big Brother Data: around the world, governments are trying to close down space for civil society. CSOs routinely use a lot of IT, which provides a perfect channel for snooping and repression.

Don’t assume libertarianism will persist. OK, the internet still reflects its libertarian origins, but what if it is taken over by bad guys, whether governments or corporations?

What’s the link to inequality? Does the top 1% of the digitally connected have access to x times more data than the poorest 10% and is that digital divide growing or shrinking, between and within countries? What are the knock-on effects in terms of power and wealth?

Which all leads to a broader question. Is there something inherently individualist about the acquisition and use of data, as currently conceived? There are signs that it undermines collectivism, for example by allowing what were once pooled risks (e.g. the National Health Service) to become customised, and eventually fragmented (her risk is bigger than mine, so why should I cross-subsidise her with my taxes?). If so, is a collectivist alternative approach to data, i.e. collective acquisition and access, even conceivable?

Possible implications for today’s developing countries:

Big Data could of course allow them to leapfrog the painfully slow business of building solid national statistical capacity. A bit like mobiles v landlines.

Does building their data capacity in a world ruled by outside multinationals require a data equivalent of industrial policy? Perhaps countries should protect and nurture their infant data-related industries, only opening data ‘borders’ when national capacity and competitiveness have been created: a data equivalent of the East Asian tigers. But that would seem to go against any push for data comparability.

It feels like the governance of data is going to become ever-more important as a global issue. Who owns it? When can it be bought and sold? Do we need a UN Convention on Access and Use of Information to try and lock in some positive norms around its usage?

I think I can guarantee that most, if not all, of this is complete nonsense, but I’d be interested to hear if anything resonates with data people.

Next up: how could political institutions emerge that govern for future generations?

Update: Alan Hudson recommends ‘The rise of data and the death of politics’, an excellent example of big data as dystopia, by Evgeny Morozov.
