Scott Guggenheim, one of the better-known names in Community Driven Development (CDD), comes out with a take-no-prisoners critique of the critique of CDD by 3ie (apologies for acronym overload), featured in my recent post. It’s long, but I just couldn’t find places to cut it.
Duncan obviously thrives on controversy, so he’s asked me to adapt my comment on 3ie’s recent evaluation of CDD Projects into a full-fledged blog post.
As someone who is part of a community that is nurturing CDD through some growing pains to see what works and what doesn’t, I like hearing critiques of the approach from which we can learn. Useful evaluations, even the most critical ones, have been those that couple careful interrogation of primary data with a commitment to advancing a broader objective of making public policy more aligned with the everyday problems of poor people in the developing world.
But as someone who either led or was closely involved with several of the projects that 3ie was reviewing, I was pretty dismayed by the selective use of evidence and some of the odder interpretations in this report. Here’s why.
The 3ie paper suffers from three deep flaws. First, it is working from a very small sample of studies, even after inflating the number by combining social funds with CDD and calling them all CDD. This matters because their sample universe combines wildly disparate programs, from huge, multi-year national programs funded through the government’s budget, to relatively small, one-off projects that have little to do with public policies or investment. Their sample’s community grant-making frequency, level and interval times – how often communities actually got funds, how much they received, and how long afterwards results were measured in the field – also vary by very large amounts, a problem that matters a lot when trying to draw conclusions about cumulative impacts and outcomes. Second, for a systematic, mixed-methods review, there’s a lot of unsystematic, highly selective use of the available evidence. The review has to do this because as soon as we’re past the drama of its headlines, we suddenly find lots of qualifiers attached to each of its generalizations. These include their findings about health and education; distributional and welfare impacts; and the participation of women. The third flaw is the most important from any sort of policy or decision-making perspective: as Brian Levy commented in a very insightful post that everyone interested in this debate should read, the 3ie review is measuring CDD against “perfection” rather than asking “what are the realistic alternatives available?”
How comparable are the designs? The problem of constructing a dataset that does not compare apples to oranges is always difficult, but if the 3ie review is going to be making claims about public policy and CDD then it should at least confine its selection to projects that are CDD. Social funds have their pros and cons, but all social funds require that communities send proposals to some central review and approval agency. CDD programs do not. The 3ie sample also includes projects that are funded by donors and executed by NGOs and that operate entirely outside of government. Such projects are undoubtedly fine and important, but it seems naïve (and distorting) to then use them as examples of CDD not having spillovers into other arenas of public action.
Smushing together such disparate designs has a second problem. Defining the divisions between what the government does and what communities and NGOs do is the key design question for a CDD program. In her 2016 paper comparing Kenyan and Indonesian CDD projects, Jean Ensminger pointed out that CDD programs themselves vary a tremendous amount in ways that surely affect outcomes. This variance is particularly pronounced in almost every one of the areas that 3ie wants to measure: who handles procurement (government or communities); whether audits take place; whether there is a working complaints-handling mechanism; and whether facilitators are qualified and trained or are just civil servants tasked with another job. And so on. These distinctions do not appear to be relevant to 3ie. But they make or break a project.
Finally, any evaluation of this sort needs to be very careful with the quality and representativeness of its underlying studies. Andrew Beath’s otherwise excellent study of NSP, for example, which forms the basis for most of 3ie’s conclusions, covered 10 districts in just 6 of Afghanistan’s 34 provinces. Gosztonyi et al. studied NSP in 25 districts of Northeast Afghanistan, using a sample of 5,000 households, and reached conclusions diametrically opposed to Beath’s. It’s not that one is right and the other wrong – what is wrong is for 3ie to generalize Beath’s findings to all of NSP. In general, this CDD evaluation literature suffers badly from weak underlying data – short intervals between baseline and endline; one-time measurements rather than long-term, repeated monitoring; “unfortunate events” such as droughts or armed conflict clouding results on irrigation or dispute resolution – in short, all of the problems that explain why good science (and policy) insists on multiple measurements and cross-checks.
Qualifying the headlines. The review has an annoying habit of making big announcements of dramatic findings that it then almost immediately qualifies. For example, “…programs have a weak effect on health outcomes and mostly insignificant effects on education and other welfare outcomes.” But the vast majority of CDD funds go into transport, clean water, and irrigation, not health or education, particularly once social funds are withdrawn from the sample, as they should have been. After boldly declaring that CDD has “no [statistically significant] impacts on health,” 3ie later notes that “the impact of the quality of health facilities is being measured for all treatment communities, including those where no investments were made in health.” (emphasis added)
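Why that last caveat matters can be sketched with a little arithmetic. The symbols here are illustrative assumptions, not quantities reported by 3ie: let p be the share of treatment communities that actually invested in health, let τ_H be the average health effect in those communities, and assume (for simplicity) a zero effect in communities that built something else.

```latex
% Illustrative only: p and \tau_H are assumed symbols, not figures from the 3ie review.
% Average effect measured across ALL treatment communities:
\[
  \hat{\tau}_{\text{all}}
    \;=\; p \,\tau_H \;+\; (1 - p)\cdot 0
    \;=\; p \,\tau_H .
\]
% Hypothetical example: if only one community in ten chose a health project,
\[
  p = 0.1 \quad\Longrightarrow\quad \hat{\tau}_{\text{all}} = 0.1\,\tau_H .
\]
```

Under these assumptions, the effect measured across all treatment communities is only a fraction p of the effect in the communities that actually built something for health. With a small p, even a sizeable τ_H can easily come out as statistically insignificant.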
How big is this average treatment effects issue? Let’s take a look. Afghanistan’s NSP CDD program, for example, covered 38,000 communities all across war-torn Afghanistan. NSP3, the program’s third phase, built 12,930 km of rural roads, 39,449 clean water sites, 361,523 hectares of newly irrigated or rehabilitated land – and a grand total of 53 health centers. So what does this report focus on? The health centers. But in any case, why would anyone expect a village infrastructure program to produce changes to health or education, especially over the short term of most evaluation periods? For that to happen, communities need teachers and books; nurses and medicines. What CDD offers is a way to build those health and education buildings at half the price, in places where the villagers actually want them to be located, and with virtually all of the spending on construction going to communities instead of to contractors (and officials).
Still on Afghanistan, 3ie writes that “An impact evaluation of the second phase of NSP in Afghanistan found mixed effects of the program on gender norms. Men’s acceptance of women in leadership in local and national levels had increased, as had women’s participation in local governance (emphasis added).” That’s a pretty big deal, not a “mixed effect”! 3ie then does go on to qualify this by saying they were referring to men’s attitudes towards women, but it’s certainly quite a trick to conclude that men now accepting women in leadership is not a change in men’s attitudes towards women’s social roles.
The review’s claim that “CDD programmes have not had an overall impact on economic outcomes” – they only exempt Philippines Kalahi CIDSS from this claim – is, simply put, false. The independent evaluation of Indonesia’s nationwide PNPM programs found consumption gains of 11 percentage points for the poorest. Sierra Leone’s GoBifo found gains in both household assets and market activity. BRA-KDP (Aceh) found an 11% decline in the share of villagers classified by village heads as poor, a doubling of land use for conflict victims and improvements in welfare (Adrian Morel, “Using CDD for Post-conflict reintegration: Lessons from the impact evaluation of the BRA-KDP program in Aceh,” Presentation to the Development Impact Evaluation Initiative (DIME), Dubai, June 1, 2010).
The Atos cost-benefit study of Afghanistan’s NSP found that beneficiaries of land with improved irrigation reported crop yield increases of over 11 percent per harvest for their main crop, wheat, along with improvements in food availability for home consumption, while those who built roads found reductions in travel costs of 34 percent for goods, with an increase in the volume of goods transported of nine percent — and this in the middle of a rising conflict that was disrupting nearly all other forms of service delivery. It’s hard to see how 3ie can conclude from these independently measured findings that CDD has “an insignificant effect on…. welfare indicators.” In fact, K. Casey’s 2017 review of seven CDD operations that used randomized evaluation designs – including several of the same ones reviewed by 3ie – found improvements to household asset levels, employment, and market activity.
What are the alternative comparators? The third issue is that, to be useful for policy makers, the evaluation should have compared CDD programs against the next best available alternative. But for this kind of real-life assessment, the problem isn’t with what CDD measures, but the fact that such studies do not exist for the alternative options that 3ie is referring to. Saying that 57% of villagers knowing when the project meetings were taking place in the 9 CDD projects for which there is data illustrates “limited community” participation would mean something if we knew what the rates were for standard line projects. In fact, in the pre-KDP ethnographic studies we did in Indonesia, those numbers would have been stupendous.
In any case, if you can reduce unit costs in a country by 44%, as 3ie itself reports for the nationwide Indonesia KDP program, we don’t need another million-dollar, multi-year RCT to conclude that this is already something worth doing. This relative efficiency matters a lot when poor countries are trying to cover large numbers of poor and isolated areas. In several of the areas of their findings, there is available evidence on what the “next best” or “without” alternative looks like. Indonesia’s PNPM road and water projects cost so much less than local government ones because there is less corruption and better oversight. The roads actually got built, which is why, when Indonesia began requiring local government contributions after its 2004 decentralization, 398 of 400 districts paid in a matching grant from their own budgets. Getting women’s participation in public village meetings up to 30% in Afghanistan is surely not good enough, but it’s a lot better than in standard development or traditional meetings, where it is at best less than 5% and in most places zero. And of course, it would be great if the women always spoke up more or had more decisive roles in each and every meeting, but you can be sure that their voices will remain at zero if they aren’t even in the room.
Susan and I tried to make three points in our own paper on CDD. First, CDD gives policy makers an important new way to build large amounts of useful, small-scale productive infrastructure. It also gives poor communities some voice in how development funds are used. Second, CDD programs are not a homogeneous category and for any evaluation, understanding the design differences between them is essential. And third, CDD works best when it is part of a broader strategy that can complement other programs and policy reforms as well as broader social changes taking place in society.
We also point to CDD’s limitations. Communities can build schools and clinics, but they can’t train and hire doctors and nurses. For that we will always need health and education programmes. And I don’t know of any serious CDD program practitioner who ever believed that handing out one or two rounds of grants for infrastructure would translate into the magical creation of missing social capital, ethnic harmony, or a social movement – all of those require a different kind of political and social reform. In yet another study left out of the 3ie review, Woolcock, Barron and Diprose found “Ethnic, religious and class relations in NTT have improved since KDP was introduced, and these changes are greater in treatment than control areas….Further, improvements in the quality of group relations grow larger over time. Villages that have had KDP for four years show, in general, greater improvements than those that have had the program for shorter periods.”
CDD programs are also not by themselves the kind of job-creating, transformative activities – urbanization, industrialization, higher education, better health – that poor people increasingly need and deserve. But they can help poor people gain access to those things, and on a scale that reaches millions.
My main feeling after finishing the 3ie review was one of depression more than anything else. How is it even possible that three sets of reviewers, using more or less the same studies and all publishing within a year of each other, can reach such disparate conclusions? The 3ie report faults CDD for not doing what it never claimed to do, cherry-picks the qualitative literature, and proposes instead to move procurement to local governments rather than communities, despite openly saying they base this entire conclusion on just one 2004 World Bank study of four countries’ social funds. This is the kind of advice we’re giving to governments and activists in developing countries?
What is so brilliant about Brian Levy’s discussion is that he nails a lot of what is really at stake here, which is power, influence, and money. CDD’s founding principle is that for a lot of local-level development work, developing countries don’t need much of that disempowering apparatus of big consultancies, complex procurement rules, and rigidly defined logframes guiding their every move. Nor do they need to replicate donor agencies’ deep and often vicious sectoral divisions. But what lies at the heart of CDD is the importance of accountability for service delivery. CDD began as an anomaly, but by now the idea that working through partnerships and financial transfers that give more voice to local choices can be managed, and managed well, by a broad range of developing country governments is spreading to contexts where more standard approaches have simply given up. Not perfectly, and not without failures, but it is happening and it is improving. So it’s no surprise that the traditionalists have started to push back. Perhaps it is time to step back from a meta-review of developing countries’ projects and instead do some meta-reviews of the reviewers.