I'm not generally a big fan of measurement fetishism (too crude, too blind to complexity and systems thinking). When I used to (mis)manage the Oxfam research team and wanted a few thousand quid for some research grant, I had to list numbers of beneficiaries (men and women). As research is a global public good, I always put 3.5bn of each. No-one ever queried it.
But things have improved a bit since then (not least for the research team) and I'm starting to be won over by some very interesting work going on in the bowels of Oxfam House, albeit skillfully camouflaged under a layer of development speak.
For the last 3 years, Oxfam has been running a "Global Performance Framework" (GPF) to try and sharpen up how it measures the impact of its work. This summer, the organisation engaged an external consultant to review the GPF. Our chief measurement guru Claire Hutchings discusses the review and next steps in "Balancing Accountability and Learning: A review of Oxfam GB's Global Performance Framework", out now in the Journal of Development Effectiveness (shamefully, it's gated, but you can download it on Oxfam's Policy and Practice website). Despite the less-than-gripping title, it's worth a wade (skimming is not really an option), but here's a sneak preview of some of its findings.
The GPF addressed two challenges facing the organisation: "how to access credible, reliable feedback on whether interventions are making a meaningful difference… and how to 'sum' this information up at an organisational level."
So how do you capture and communicate the effectiveness of 1200+ projects on everything from life’s basics – food, water, health and education – to complex questions around aid, climate change and human rights?
To their credit, Oxfam staff recognized that "requiring all programmes to collect data on pre-set global outcome indicators wasn't the answer, that it had the potential to distort programme design, and would be at odds with the value Oxfam GB places on developing programmes 'bottom-up', based on robust analyses of how change happens in the contexts in which it is working."
And even if you did try and collect such data, at the end of the day, they would only tell you that change was happening in the contexts in which Oxfam was working. Attributing any given improvement in people's lives to a particular intervention by Oxfam is incredibly difficult, especially in areas such as empowerment and rights. A principle that informed the design of the GPF was that evaluations needed to understand Oxfam's contribution (or not) to change.
So the GPF takes a two-pronged approach: measuring and summing up outputs (the stuff we did) to understand the diversity and scale of the portfolio of work we're delivering, but also undertaking evaluations of a random sample of closing or mature projects to understand the outcomes (the changes in the lives of poor people) of these efforts.
This second string in the GPF bow was designed to add some scary rigour, in the shape of "effectiveness reviews", which use a range of "proportionate" methodologies to measure impact. For community-level development programmes, that means quasi-experimental designs that consider the counterfactual, using statistical methods such as propensity score matching and multivariable regression to control for measured differences between intervention and comparison populations. Where there are too few units to permit tests of statistical significance between treatment and comparison groups (so-called "small n" problems), the reviews instead use a qualitative causal inference method known as process tracing. Phew (wipes brow, wonders if anyone's still reading).
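For the statistically curious, here is a minimal sketch of what propensity score matching can look like in practice. To be clear, this is not Oxfam's actual analysis code: the data frame, the column names ("treated", "outcome") and the covariates are all hypothetical, and real effectiveness reviews involve far more careful balance checks and diagnostics.

```python
# Toy sketch of 1:1 nearest-neighbour propensity score matching.
# Assumes a pandas DataFrame `df` with a binary "treated" column, an
# "outcome" column, and a list of covariate columns (hypothetical names).
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors


def psm_effect(df: pd.DataFrame, covariates: list[str]) -> float:
    # 1. Estimate each household's propensity to be in the intervention group
    #    from observed covariates (e.g. household size, education, assets).
    model = LogisticRegression(max_iter=1000).fit(df[covariates], df["treated"])
    df = df.assign(pscore=model.predict_proba(df[covariates])[:, 1])

    treated = df[df["treated"] == 1]
    control = df[df["treated"] == 0]

    # 2. Match each intervention household to the comparison household with
    #    the nearest propensity score (matching with replacement).
    nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
    _, idx = nn.kneighbors(treated[["pscore"]])
    matched_outcomes = control.iloc[idx.ravel()]["outcome"].to_numpy()

    # 3. The estimated effect is the mean outcome gap between matched pairs.
    return float((treated["outcome"].to_numpy() - matched_outcomes).mean())
```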
The GPF also considers the quality of some of our interventions, examining the performance of selected humanitarian responses against 13 quality benchmarks, and assessing the degree to which randomly selected projects meet Oxfam's accountability standards to partners and communities.
So far, 74 Effectiveness Reviews have been completed, with a commitment to publishing the results, warts and all. They cost between £15,000 and £40,000 each, depending on the methodology (and including staff time). That includes the latest batch, the first of which are published today (hence this post). Three years in, what have we learned?
Working out how to measure "Hard to Measure Benefits" or HTMB (a new addition to the great tradition of development acronyms/jargon) is, well, hard. Kudos to the organisation for not balking at the measurement challenge, or focusing on the easy-to-measure, but we've spent a good part of the first three years just working out how to measure the outcomes we're interested in evaluating (e.g. women's empowerment, resilience), building on the work of others in the sector and learning by doing.
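To give a flavour of what "measuring the hard to measure" can involve, here is a toy sketch of one common approach: combining several survey indicators into a composite score. The indicator names, cut-off and column names are invented for illustration; this is not Oxfam's actual women's empowerment index.

```python
# Toy composite-index sketch: score respondents against several binary
# indicators and flag those who meet at least a given share of them.
# All indicator names below are hypothetical.
import pandas as pd

INDICATORS = [
    "involved_in_household_decisions",
    "controls_own_income",
    "member_of_community_group",
    "comfortable_speaking_in_public",
]


def empowerment_score(survey: pd.DataFrame, cutoff: float = 0.5) -> pd.DataFrame:
    # Average the 0/1 indicators into a 0-1 score, then flag respondents
    # who meet at least `cutoff` of the dimensions.
    survey = survey.copy()
    survey["score"] = survey[INDICATORS].mean(axis=1)
    survey["empowered"] = survey["score"] >= cutoff
    return survey
```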
By generating sharper methodologies, Effectiveness Reviews have great potential for improving the rest of our evaluation work (the evaluations that individual projects and programmes undertake anyway, often to meet donor requirements), but progress has been slow.
Which brings us to the wider point. Riding two horses is difficult, and often painful: there are tensions between (upwards) accountability and learning, with the former crowding out the latter (to some extent). We get donor brownie points for having both global numbers and rigorous project evaluations, but we don't make the most of the consequent learning.
We're doing reasonably well at project level, because staff are involved both in the reviews and in responding to their findings, and there is evidence that they're making changes to project design and delivery as a result. But at the broader organisational level, with the focus on the measurement challenge and upward accountability, we have not yet digested what this body of evaluations is telling us about Oxfam's portfolio, or systematically spread the learning across the whole of Oxfam's work, beyond some limited osmosis via global advisors on particular issues (which, by the way, is pretty much the same story as a recent review found at DFID). This will become easier as the number of completed effectiveness reviews grows, allowing more cross-comparisons between projects in similar fields, but there is clearly still lots to do.
This was a complex challenge. We needed to start somewhere, and have learned a lot by getting stuck in, adapting the process along the way to better serve a learning agenda. The challenge for the next phase of the GPF is to give more attention to the virtuous links between results and organisational learning: not only to deliver credible results, but to use them to inform our work. In the meantime, the latest effectiveness reviews are published today, so why not unleash your inner wonk and download a couple?