effectiveness reviews’), case studies, and the accumulated wisdom of our big cheeses. But the tacit hierarchy of these different kinds of knowledge worried me – anything with a number attached had a privileged position, however partial the number or questionable the process for arriving at it. In contrast, decades of experience were not even credited as ‘evidence’, but often written off as ‘opinion’. It felt like we were in danger of discounting our richest source of insight – gut feeling.

In this state of discomfort, I went off for lunch with Lant Pritchett (he seems to have forgiven me for my screw-up of a couple of years ago). He’s a brilliant and original thinker and speaker on any number of development issues, but I was most struck by the vehemence of his critique of the RCT randomistas and the quest for experimental certainty. Don’t get me (or him) wrong: he thinks the results agenda is crucial in ‘moving from an input orientation to a performance orientation’ and set out his views as long ago as 2002 in a paper called ‘It pays to be ignorant’, but he sees the current emphasis on RCTs as an example of the failings of ‘thin accountability’ compared to the thick version.

In a forthcoming paper (which I will definitely link to when it’s published), Lant defines thick accountability as ‘an “account” in the sense of a justificatory narrative of my actions, the story of my actions I tell to those whose opinion of me is important (including myself, but including family and kinsmen, friends, co-workers, co-religionists, people I respect and desire admiration from) that explains why my actions are in accord with, and deserving of, a positive view of myself. In contrast, thin accountability is “accounting”, which is that small part of the account about which objective facts can be established.’ He sketched out the inevitable 2×2 matrix for me:
| | Low performance | High performance |
| --- | --- | --- |
| Thin accountability | e.g. fragile states | e.g. post office and road-building |
| Thick accountability | e.g. families and other non-performance oriented institutions | e.g. just about any complex institutional ecosystem |
- The politics of RCTs: ‘RCTs are a tool to cut funding, not to increase learning.’ ‘Randomization is a weapon of the weak’ – a sign of how politically vulnerable the argument for aid has become since the end of the Cold War. ‘Henry Kissinger wouldn’t have demanded an RCT before approving aid to some country.’ And I can’t see the military running RCTs to assess the value for money of new weaponry before asking for more cash (mind you, if they did, that might at least save some money on Trident…).
- The lack of interest in theory: ‘the randomistas are going back to alchemy – atheoretic experimentation’.
- RCTs test at most a few project variants using ‘project vs non-project’, whereas interventions are typically multiple, overlapping and synergistic (i.e. the whole cannot be reduced to a sum of parts).
- No-one evaluates the evaluators. At the very least, given how much RCTs cost, you need to know that the findings are useful elsewhere (so-called ‘external validity’). But once you have multiple RCTs on the same issue (and their spread is starting to produce such comparable studies), you find very little external validity – the results of an RCT in one country and time are not replicated elsewhere (with the possible exception of deworming in schools, but even that iconic RCT story is contested). This is the big contrast with real science, where replicability is a key condition of validity.