The efficiency of modern philanthropy
Summary: The most important inefficiency in philanthropy may be the philanthropist’s desire to make decisions that look good in retrospect.
In financial markets, if you encounter an investment opportunity that looks like it will significantly outperform the market, it is reasonably safe to conclude that either (1) finding the opportunity required some special ability, information, disposition, or connection you have that others lack, or (2) you are mistaken about the quality of the opportunity, or there are associated costs.
We might ask: to what extent is the same thing true in philanthropy? If it looks to me like a cause is obviously important, but others are ignoring it, does that mean that I’m overlooking something they know? Of course, there is a spectrum of possibilities. In general I should ask: how hard should I have to search before I can expect to find something important that others have overlooked?
Holden from GiveWell has recently commented on this question. His conclusion, roughly, is that while there are probably still neglected high-impact causes, you should expect to expend a lot of effort to identify them. In particular, one can’t appeal to simple a priori arguments about what people are likely to miss and expect to thereby find great neglected opportunities.
Without question, Holden has much more experience investigating charitable causes than I do. But I think the case for “efficient philanthropy” is relatively weak. I think the largest gap between philanthropists’ incentives and the public good is the philanthropist’s desire to make decisions that look good in retrospect, and that this gap could well cause many of the most important philanthropic interventions to be left on the table. I think the evidence Holden and others cite doesn’t bear significantly on this hypothesis.
Evidence for efficiency
Everyone agrees that the majority of philanthropic spending is not particularly efficient; our uncertainty is about the existence of a significant contingent of donors who are identifying and supporting the most effective available causes (either deliberately or incidentally). Holden cites some evidence from GiveWell’s experiences regarding this question.
- GiveWell has spent several years searching for good philanthropic opportunities in developing world aid. What they have found is that, despite the existence of many inefficient projects, the majority of high-impact interventions supported by strong evidence are in fact being pursued.
- In other areas, particularly meta-research, GiveWell’s early perception that important causes were being neglected was challenged after a closer investigation revealed that there were in fact funders in those areas.
These cases are particularly notable because they challenge the intuition that donors aren’t responsive to strong quantitative estimates of effectiveness.
Beyond GiveWell’s experiences, Holden provides some casual arguments that philanthropists are being pushed towards efficiency despite a lack of financial incentives. In particular, philanthropists do have a desire to do good, to be respected for doing good, to be satisfied with their own impact on the world, and to be seen as pioneers in neglected areas. Of course these incentives aren’t perfectly aligned with social welfare, but as long as some philanthropists are influenced by these incentives, they may cover the highest value giving opportunities.
What inefficiency should we expect?
Before inferring something about the efficiency of philanthropy from this evidence, we should have an idea of what philanthropic inefficiency would look like. So: in what ways do we expect that philanthropy might be inefficient? The basic problem is that people don’t necessarily have much inclination towards improving aggregate well-being, and instead are concerned with feeling good about their impact on the world, looking noble, resolving feelings of social responsibility, reinforcing a chosen identity, and so on.
One might further hypothesize, as Holden apparently did, that particular types of interventions are less satisfying to support and less reliably or extensively rewarded by social approbation:
- A: Interventions supported by good quantitative evidence, as compared to interventions supported by compelling narratives.
- B: Interventions which rest on unconventional assumptions, or which are merely not traditional targets for philanthropy.
These further hypotheses seem pretty likely for the average donor, but it doesn’t seem likely that they apply uniformly. It is pretty easy to imagine that some donors are disposed to go out of their way to find new causes, that there are large communities with a quantitative bent which respect quantitative arguments, and so on. And looking at the world reveals some donors with those characteristics (e.g. tech entrepreneurs).
The evidence Holden cites suggests that, indeed, there is a big contingent of donors who fund interventions that fall under these two classes. I agree that this is some evidence for the efficiency of philanthropy; even if the number of donors in that contingent is small, it still rules out many a priori possible worlds in which quantitatively-minded altruists could expect to have an outsized impact. But it seems to be relatively weak evidence for the efficiency of philanthropy itself, because the prior probability of particular interventions in classes A or B being completely missed was not very high.
I think that a priori the following class of interventions is significantly more likely to be both less satisfying to support and less reliably rewarded by social approbation, and consequently is much more likely to be badly neglected:
- C: Interventions which won’t reliably look like a good idea in retrospect.
Basically all of the non-altruistic motivations for do-gooding only incentivize actions which look like they are good, either to others or to ourselves. Moreover, there are rapidly decreasing gains to looking good (and significant social and psychological costs from taking actions that might turn out to be worthy of criticism), and so we should expect actions which reliably look good to be significantly preferred by most philanthropists.
Looking good in hindsight
What looks good in hindsight?
Suppose I buy a lottery ticket which clearly has a positive expected value. I will probably lose, but as long as the lottery didn’t turn out to be a hoax my decision still looks good in hindsight. I can easily justify my decision to myself and to others, and if others learn the details of the situation they will rightly conclude that I was wise in purchasing the ticket.
If I decide buying Bitcoins is a good deal in expectation, I will probably lose money. After the fact, my decision just looks dumb to others, and I may have a hard time even justifying it to myself. I can try to convey the arguments that I found convincing at the time, but it’s an extremely uphill battle. I need to fight against (1) hindsight bias, (2) the justified expectation that I am selectively presenting arguments in order to make myself look good, (3) the justified expectation that, given that Bitcoin failed, wiser observers would have been more likely to expect Bitcoin’s failure. Even if Bitcoin was in fact a good investment, this is not a fight I would expect to win, even against myself. The result is unsatisfying for me and has low social payoff.
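The contrast between these two cases is about hindsight, not expectation. As a toy illustration with entirely made-up numbers (nothing here reflects real lottery or Bitcoin odds), two gambles can have identical expected values while one looks fine in retrospect and the other looks dumb:

```python
# Toy expected-value comparison. All probabilities and payoffs are
# invented for illustration; only the structure matters.
def expected_value(outcomes):
    """outcomes: list of (probability, net payoff) pairs."""
    return sum(p * x for p, x in outcomes)

# A positive-EV lottery ticket costing $1: a tiny chance of a big,
# legible payoff. Losing still "looks wise" in hindsight.
lottery = [(0.000001, 2_000_000 - 1), (0.999999, -1)]

# A speculative bet costing $1: a modest chance of a large payoff
# justified only by hard-to-convey arguments. Losing looks dumb.
speculative = [(0.05, 40 - 1), (0.95, -1)]

# Both gambles are worth the same ex ante (about $1 of expected profit),
# even though they fare very differently in the court of hindsight.
print(expected_value(lottery))
print(expected_value(speculative))
```

The point of the sketch is that hindsight judgments drive a wedge between two bets that any expected-value maximizer should treat identically.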
Much ado has been made about the difference between an objective probability estimate, coming from well-understood physics like a coin flip, and a subjective probability estimate, coming from uncertain reasoning. Many people seem quite convinced that these two kinds of probabilities should be treated quite differently in decisions (beyond the fact that you should expect subjective probabilities to change as you get more evidence, which is the most important difference and which everyone accepts). As far as I can tell, the arguments for a relevant difference are fairly weak, although I also have conflicting intuitions and am open to more argument. It seems more likely to me that this is an effort to satisfy the desire to do things that will look good in retrospect without recognizing looking good as an important objective.
In charity, risky interventions are probably already significantly less attractive than risk-free interventions. Donors, even those who are very quantitatively minded and EV-maximization oriented, appear to have a very strong bias towards being sure they have done good. But I think the issue of objective vs. subjective probabilities has a much larger effect. This is certainly borne out by observations of altruists’ extreme squeamishness towards such subjective estimates.
Short time horizons
The most straightforward way an action can fail to look good is if the positive effects are in the future. If I try to help people in the far future, I won’t know whether I succeeded and neither will anyone who might praise me for helping. (Whether there are any good interventions of this form, that don’t similarly help current folk, is still an open question. But I am willing to bet on it, and think such interventions are some of the most likely to be overlooked.)
Often we can justify doing good for the long-term by achieving intermediate results that are widely believed to be good today, for example contemporary prosperity or positive environmental outcomes. If we think that these intermediates are reliably coupled to actual long-term outcomes (and are an exhaustive inventory of relevant intermediates) then this would eliminate the potential bias towards short-term interventions. But I think this is not very likely:
It seems obvious why the people of today think that making the people of today happier is good. Intuitions about the importance of environmental preservation seem like much more reliable signs about the future, but even the most optimistic environmentalist would have to agree that environmentalism is in large part an independent psychological and sociological phenomenon, and that there is no good reason that other similarly important causes would enjoy such a wide base of popular support.
A clear case
Many interventions are supported by narratives in which clear effects can be traced directly to the intervention. I can see the particular children I helped, the particular sustainable enterprises I helped support, and so on. Some interventions are supported by relatively clean RCTs or good natural experiments. Both of these characteristics make it easy for a donor to identify the impact and convincingly justify their claim to have made an impact (which type of evidence is more useful for seeming virtuous depends on your audience). Often the measurable effects are not good in and of themselves, but they are at least widely regarded to be positive. (E.g., if I support sustainable businesses, I have only very messy evidence about my long-term impact, but I do have clear evidence that I supported sustainable businesses, and I probably inhabit a social circle where this is widely recognized as valuable.)
I expect that many of the best interventions will be based on much less clean stories. For example, if I speed up a for-profit research project that would have been done anyway, even if I can make a good argument based on econometrics that my actions helped people a lot in expectation, it is likely to be extremely difficult for me to trace that impact precisely. If I try to claim that I had an impact I am likely to be met with claims of “that was profitable and so would have happened anyway; how do you know you helped at all?” If I do outreach for a cause which is currently unpopular, it is likely to be difficult for me to trace the impact of my outreach according to any metric other than “how many people care about X?” Nevertheless, I think such outreach is sometimes valuable.
Being part of a big reference class
Investments in start-ups will normally not make money. But the social consequences of losing money in a startup are much better than the social consequences of losing money in Bitcoin. It’s pretty well known that good investments (in certain areas) still lose money more often than not. Onlookers will still lose confidence in me when I invest in a failed startup, but their judgment will reliably be much less harsh than if I lost my money in a more exotic way. This is closely related to the above point about objective probabilities, where large reference classes play the role of objective probabilities (and indeed, normally such probabilities would be lumped with “objective probabilities” in debates about the decision-theoretic significance of different kinds of probability estimates).
Similarly, it is recognized that efforts in development have hard-to-trace consequences and that investment in development is an appropriate and noble behavior for a philanthropist. If I were to pursue a more exotic intervention which is similarly complicated and equally-good-in-expectation, I would have a hard time getting credit for that unless I pursued a project with a relatively clear payoff. Of course this leaves the question of why development aid has acquired such a reputation. But given the long history of alms as a charitable activity, completely disconnected from any estimate of the impact of development aid today, it seems like this is an independent sociological puzzle with little bearing on the suitability of aid as an intervention.
Are those really problems?
There are some plausible reasons that this purported “distortion” might actually be a good incentive, which improves the quality of philanthropy rather than degrading it. I believe this is the current view of most of the folk at GiveWell. Responding to all of the arguments for this position would definitely be beyond the scope of this post, though I think it is an important issue (and I mostly disagree).
I think by far the most important point in favor is that any project without feedback from the world is automatically doomed. Therefore in order to make real progress you must get lots of feedback, and this feedback will necessarily guarantee that the program looks good in retrospect.
This argument does not seem compelling to me. I agree that much of the oomph from any project comes from iterating with feedback. But such feedback definitely does not generally guarantee that the project will look good in retrospect.
In addition to having a component where you get feedback and improve, plans have a component where some observed effect on the world translates into an unobservable improvement in the long-run (I remarked on this here). But this component typically doesn’t involve feedback, and there is no particular reason it would look like a good idea after the fact even if it were. E.g., I could support a for-profit enterprise which eventually becomes self-sufficient and grows to scale. But determining whether that enterprise has a significant positive impact is much more complicated. Just because the impact was positive, I wouldn’t expect it to reliably look positive. As another example, I do computer science research. I get lots of feedback about whether the CS community likes my work, about whether my mathematical intuitions are valid, about whether I can contribute usefully to people working in other fields, and so on. Over time I get much better at doing research. But none of this much helps me determine how positive the long-run impact is. Even if it were a good idea, I wouldn’t necessarily expect it to look like a good idea (and conversely).
One way to help make good plans reliably look good in retrospect (and to overcome some of the problems that plague plans without feedback) is to lay out relevant arguments in detail and have extensive discussions, particularly with those who disagree. This appears to be done very little, but may be quite important both in terms of getting the answers right and in aligning philanthropists’ interests (and constraints) with social good.
So I’m hopeful that either (1) I’m mistaken about the extent to which such discussion is already occurring (in which case social conceptions of value might be more aligned with long-run value than I expect), or (2) such discussion will become more common over the coming years (in which case social conceptions of value might gradually become more aligned with long-run value than they are right now).
Robin Hanson often mentions the tension between having influence and getting credit, which is a similar issue. But note that even the actor who has influence but not credit is much better off (both at cocktail parties and when reflecting on their life) than the actor who had influence in expectation but is not sure that their reasoning was reliable or their expected influence was real.
[The irony of this postscript is not lost on the author.]
I feel like you are being a bit too quick to dismiss the evidence gathered by GiveWell, especially given that both your, my, and GiveWell’s initial intuitions were wrong. Essentially, this would be like someone thinking that strategy X would do really well on the stock market, then doing a bunch of research and finding out that they were wrong, then being similarly confident that strategy Y would do really well. I think GiveWell’s evidence should cause us to update in general towards high-impact interventions being hard to find. This is particularly true given that we were (or at least I was) wrong about this issue twice — first for third-world development, then for meta-research.
That being said, I agree with your basic conclusion that “laying out relevant arguments in detail and having extensive discussions, particularly with those who disagree” is a pretty good strategy. I would expect this to lead to approaches that were relatively well-accepted, at least by people with relevant training (where the training here is in doing high-risk research, e.g. Scott Aaronson is my prototype here for the sort of person you should expect to be able to convince).
On the other hand, “Interventions which won’t reliably look like a good idea in retrospect” seems like a poor description of this sort of approach — they are ideas which don’t currently have a compelling argument for them, but which we expect *might* have a compelling (at least to a few key people) argument given sufficient time investment. If after that time investment there turn out not to be compelling arguments, then we should probably look at different interventions instead.
I think it’s reasonable to place this in the reference class of basic research. What do things look like there? Intelligent people with lots of training develop intuition, and that intuition suggests approaches to solving problems. Often (at least in fields that aren’t as well-run as TCS) the approach you had in mind has actually already been successfully executed, but is not something you knew about until actively searching for it. Other times, it turns out that the problem that your idea solves is not as important as you originally thought (I would lump this in with the “some observed effect on the world translates into an unobservable improvement” and argue that you can often analyze the mechanism, at least imperfectly). In the remaining instances, you do preliminary experiments and actively discuss your ideas with others. The result of this is that sometimes the idea turns out not to work, and in the vast majority of the remaining cases, the original idea does not work but suggests a new idea that does in fact work (maybe sometimes the original idea works untouched, but I’ve never actually seen this happen). I expect basically the same story to play out in philanthropic interventions, and expect progress to occur as a result of people grappling with the problem until they reach an insight that no one else has yet had (or that no one else has yet successfully executed on). I expect such insights to come more readily to people who successfully acquire and mediate between more than one background area (for instance, having a strong background in both econ and sociology, or math and biology).
The story above makes your criterion C seem too universal to be true. In particular, the correct interventions are going to depend in complicated ways on the particular facts surrounding those interventions, and a criterion that doesn’t involve closely examining problem-relevant data is unlikely to fare well. The closest thing to it that does seem true is “Interventions that aren’t obvious”; I think this criterion has both much better a priori and empirical support, so am not sure why we don’t use that one instead.
It seems like we have gradually acquired evidence that philanthropy is not inefficient in a few particular ways it could have been, which is evidence in favor of efficiency. How much evidence we can get depends on how likely those particular inefficiencies were to start with. Are wonkish and wacky causes reliably neglected in 50% of worlds without efficient broad markets? 75%? Going much higher seems unlikely.
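The size of this update can be sketched with Bayes’ rule, using the hypothetical numbers above (50% or 75% chance of neglect in inefficient worlds, and, as a simplifying assumption, that efficient worlds fund such causes with certainty):

```python
# Bayes-rule sketch of the update from observing that the wonkish/wacky
# causes GiveWell examined were in fact funded. All numbers follow the
# illustrative 50%/75% figures in the text; none are empirical.
def posterior_efficient(prior_efficient, p_funded_if_inefficient):
    """P(philanthropy is efficient | the causes turned out to be funded),
    assuming efficient philanthropy funds them with probability 1."""
    p_funded = (prior_efficient * 1.0
                + (1 - prior_efficient) * p_funded_if_inefficient)
    return prior_efficient * 1.0 / p_funded

# If such causes would be neglected in 75% of inefficient worlds,
# a 50% prior on efficiency moves to 80%:
print(posterior_efficient(0.5, 0.25))  # -> 0.8

# If they would be neglected in only 50% of inefficient worlds,
# the same prior moves only to about 67%:
print(posterior_efficient(0.5, 0.50))  # -> about 0.667
```

Under these assumptions the observation shifts a 50/50 prior by at most a few tens of percentage points, which is why it reads as a real but modest update rather than a decisive one.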
After you know the Gates foundation exists, you shouldn’t be too surprised that the low-hanging fruit in development are being funded at all (nor that they are being funded by people other than Gates), though it would be a modest further surprise to learn that they were adequately funded. So this does not seem like a huge update, and if the existence of the Gates foundation didn’t convince you that philanthropy is inefficient, neither should this.
I wouldn’t expect interventions that aren’t obvious to be systematically neglected, except in the same sense that expensive interventions are neglected—they require a special ability to notice, and so if you have that ability you might expect to be able to notice some new ones. In the same way, if you have some money you do expect to find some good things to do with it, since there is not an infinite supply of money. We can talk about the social returns to insight as well as the social returns to money. The question is whether people might not be picking the highest-value opportunities, either with their money or with their insight (or whatever other resources they bring to bear).
Regarding my criterion C, it seems introspectively clear to me that *I* would be quite happy to donate to something with an unusual or wonkish argument in favor, but would feel much more uncomfortable donating to it if there was a 90% chance of looking predictably silly a year from now. You could reject the premise, by saying that anything that has a 90% chance of looking silly a year from now is necessarily bad. But even if that happened to be a law of nature, I wouldn’t expect my intuitions to reflect it. The casual evidence for bias in this case seems quite strong. Moreover, it seems like these arguments are independent of the complicatedness of the world.
> Many people seem quite convinced that these two kinds of probabilities should be treated quite differently in decisions (beyond the fact that you should expect subjective probabilities to change as you get more evidence, which is the most important difference and which everyone accepts). As far as I can tell, the arguments for a relevant difference are fairly weak, although I also have conflicting intuitions and am open to more argument.
What are these arguments for a relevant difference (besides the one you named)?
Mostly that we have arguments that constrain our behavior tightly for the case of objective probabilities, which are intuitively satisfying, but that similar arguments for subjective uncertainty seem to be weaker. They still seem pretty tight to me, so like I said I don’t find the difference compelling.
Holden’s view is that the important difference is that the views of experts should influence your subjective probabilities (perhaps in a complicated way, e.g. the evidence “no one is doing this yet” suggests that some step of your argument is wrong) whereas if you have objective probabilities typically they screen off others’ views. I was counting this as a subset of “your subjective probabilities change when you learn more,” but probably it deserves to be called out separately.