I’m not generally a big fan of measurement fetishism (too crude, too blind to complexity and systems thinking). When I used to (mis)manage the Oxfam research team and wanted a few thousand quid for some research grant, I had to list numbers of beneficiaries (men and women). As research is a global public good, I always put 3.5bn of each. No-one ever queried it.
But things have improved a bit since then (not least for the research team) and I’m starting to be won over by some very interesting work going on in the bowels of Oxfam House, albeit skillfully camouflaged under a layer of development speak.
For the last 3 years, Oxfam has been running a ‘Global Performance Framework’ (GPF) to try and sharpen up how it measures the impact of its work. This summer, the organisation engaged an external consultant to review the GPF. Our chief measurement guru Claire Hutchings discusses the review and next steps in ‘Balancing Accountability and Learning: A review of Oxfam GB’s Global Performance Framework’ out now in the Journal of Development Effectiveness (shamefully, it’s gated, but you can download it on Oxfam’s Policy and Practice website). Despite the less-than-gripping title, it’s worth a wade (skimming is not really an option), but here’s a sneak preview of some of its findings.
The GPF addressed two challenges facing the organisation: ‘how to access credible, reliable feedback on whether interventions are making a meaningful difference … and how to ‘sum’ this information up at an organisational level.’
So how do you capture and communicate the effectiveness of 1200+ projects on everything from life’s basics – food, water, health and education – to complex questions around aid, climate change and human rights?
To their credit, Oxfam staff recognized that ‘requiring all programmes to collect data on pre-set global outcome indicators wasn’t the answer, that it had the potential to distort programme design, and would be at odds with the value Oxfam GB places on developing programmes “bottom-up”, based on robust analyses of how change happens in the contexts in which it is working.’
And even if you did try and collect such data, at the end of the day, they would only tell you that change was happening in the contexts in which Oxfam was working. Attributing any given improvement in people’s lives to a particular intervention by Oxfam is incredibly difficult, especially in areas such as empowerment and rights. A principle that informed the design of the GPF was that evaluations needed to understand Oxfam’s contribution (or not) to change.
So the GPF takes a two-pronged approach: measuring and summing up outputs (the stuff we did) to understand the diversity and scale of the portfolio of work we’re delivering, but also undertaking evaluations of a random sample of closing or mature projects to understand the outcomes (the changes in lives of poor people) of these efforts.
This second string in the GPF bow was designed to add some scary rigour, in the shape of ‘effectiveness reviews’, which use a range of ‘proportionate’ methodologies to measure impact, including quasi-experimental designs for community level development programmes which consider the counter-factual using advanced statistical methods such as propensity score matching and multivariable regression to control for any measured differences between intervention and comparison populations; and a qualitative causal inference method known as process tracing, where there are too few units to permit tests of statistical significance between treatment and a comparison group (so-called ‘small n’ problems). Phew (wipes brow, wonders if anyone’s still reading).
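For anyone who is still reading and wants to see what ‘propensity score matching’ looks like in practice, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the data are invented, the single covariate (a baseline ‘wealth index’), the true effect of 1.0 and the function names are all hypothetical, and it is not the code or the specification used in Oxfam’s effectiveness reviews. It covers only the matching idea, not multivariable regression or process tracing.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the invented data is reproducible

# Invented data: x is a baseline "wealth index" in [0, 1]. Better-off
# households are more likely to join the project AND have better outcomes,
# so a naive comparison overstates impact. The true effect is set to 1.0.
TRUE_EFFECT = 1.0
xs = [rng.random() for _ in range(400)]
treated = [1 if rng.random() < 0.2 + 0.6 * x else 0 for x in xs]
outcome = [2.0 * x + TRUE_EFFECT * t for x, t in zip(xs, treated)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(xs, treated, steps=500, lr=0.5):
    """Fit P(treated | x) with a one-covariate logistic regression,
    using plain gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, t in zip(xs, treated):
            err = t - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def mean(values):
    return sum(values) / len(values)


# Step 1: estimate each household's propensity to be in the project.
b0, b1 = fit_propensity(xs, treated)
scores = [sigmoid(b0 + b1 * x) for x in xs]

participants = [i for i, t in enumerate(treated) if t == 1]
controls = [i for i, t in enumerate(treated) if t == 0]

# Naive estimate: simple difference in average outcomes between groups.
naive = mean([outcome[i] for i in participants]) - mean(
    [outcome[i] for i in controls]
)

# Step 2: pair each participant with the comparison household whose
# propensity score is closest (nearest neighbour, with replacement).
diffs = []
for i in participants:
    j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
    diffs.append(outcome[i] - outcome[j])

# Step 3: average the matched differences (effect on the treated).
att = mean(diffs)

print(f"naive difference: {naive:.2f}")
print(f"matched estimate: {att:.2f} (true effect set to {TRUE_EFFECT})")
```

In this made-up example the naive comparison is inflated because participants started out better off, while the matched estimate should land much closer to the effect built into the data. Note the catch already flagged above: matching can only control for measured differences, so anything unobserved that drives both participation and outcomes will still bias the answer.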
The GPF also considers the quality of some of our interventions, examining the performance of selected humanitarian responses against 13 quality benchmarks, and assessing the degree to which randomly selected projects meet Oxfam’s accountability standards to partners and communities.
So far, 74 Effectiveness Reviews have been completed, including the latest batch, the first of which are published today (hence this post), with a commitment to publishing the results, warts and all. They cost between £15,000 and £40,000 each, depending on the methodology (and including staff time). 3 years in, what have we learned?
Working out how to measure ‘Hard to Measure Benefits’ or HTMB (a new addition to the great tradition of development acronyms/jargon) is, well, hard. Kudos to the organisation for not balking at the measurement challenge, or focusing on the easy-to-measure, but we’ve spent a good part of the first three years just working out how to measure the outcomes we’re interested in evaluating (eg women’s empowerment, resilience), building on the work of others in the sector and learning by doing.
By generating sharper methodologies, Effectiveness Reviews have great potential for improving the rest of our evaluation work – the ones that individual projects/programmes undertake anyway, often to meet donor requirements – but progress has been slow.
Which brings us to the wider point. Riding two horses is difficult, and often painful: there are tensions between (upwards) accountability and learning, with the former crowding out the latter (to some extent). We get donor brownie points for having both global numbers and rigorous project evaluations, but we don’t make the most of the consequent learning.
We’re doing reasonably well at project level, because staff are involved both in the reviews, and in responding to their findings, and there is evidence that they’re making changes to project design and delivery as a result. But at the broader organisational level, with the focus on the measurement challenge and upward accountability, we have not yet digested what this body of evaluations is telling us about Oxfam’s portfolio, or systematically spread the learning across the whole of Oxfam’s work, beyond some limited osmosis via global advisors on particular issues (which by the way is pretty much the same story as a recent review found at DFID). This will become easier as the number of completed effectiveness reviews grows, allowing more cross comparisons between projects in similar fields, but there is clearly still lots to do.
This was a complex challenge. We needed to start somewhere, and have learned a lot by getting stuck in, adapting the process to better serve a learning agenda along the way. The challenge for the next phase of the GPF is to give more attention to the virtuous links between results and organisational learning – to not only deliver credible results, but to use them to inform our work. In the meantime, the more recent effectiveness reviews are published today, so why not unleash your inner wonk and download a couple?