The golden rule
by paulfchristiano
Most of my moral intuitions are well-encapsulated by the maxim: “do unto others as you would have them do unto you.” This is a principle with extremely broad intuitive appeal, so it seems worth exploring how I nevertheless end up with a relatively unusual ethical perspective.
I think there are at least four ways in which my moral views are somewhat unusual:
- I am a committed consequentialist.
- I think that helping twice as many people is roughly twice as good, and I am unusually quantitative and hard-nosed about altruism.
- I think that the welfare of future people matters about as much as the welfare of existing people.
- I think that making happy people is extremely valuable, and would be deeply saddened by a future in which the population was much smaller than it could be.
I think that there are a few modest differences in my understanding of the golden rule that lead to these practical differences:
- I think about helping twice as many people as if I were helping each person with twice the probability. (Note that from behind a veil of ignorance, if twice as many people are helped I do have twice the probability of being helped.)
- I am very happy to exist. I would accept relatively large decreases in welfare, in exchange for a higher probability of existing.
Consequentialism
When people are considering several policies, some of which might help me, I would really prefer they do the one that helps me the most. I don’t care if they are virtuous, or if they violate deontological side constraints, or if they act according to a maxim that could be universalized, or whatever (except insofar as those things bear on how effectively they help me). I just want to get the things that I want. So when I consider helping other people, I think I should just help them as effectively as possible, rather than being virtuous, satisfying side constraints, acting according to universalizable maxims, etc.
(I should flag some further subtleties here concerning meddling preferences, but they don’t change the conclusion and I don’t want to get bogged down.)
Aggregation
As a consequence of [1], when I think about giving X to two people or giving Y to one person, I try to think about whether I would prefer to receive X with probability 2% or Y with probability 1%. I think that this is a relatively uncommon perspective. My best guess is that the difference is mostly a manifestation of my very quantitative attitude, rather than serious philosophical disagreements.
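To make the 2%-versus-1% framing concrete, here is a minimal sketch of the calculation from behind the veil of ignorance; the function name, benefit values, and population size are invented purely for illustration and are not drawn from anything above.

```python
# Toy expected-value comparison from behind a veil of ignorance.
# All numbers are illustrative assumptions.

def veil_of_ignorance_value(benefit, people_helped, population):
    """Expected benefit to a randomly chosen member of the population."""
    return (people_helped / population) * benefit

POPULATION = 100
option_x = veil_of_ignorance_value(benefit=1.0, people_helped=2, population=POPULATION)  # 0.02
option_y = veil_of_ignorance_value(benefit=1.8, people_helped=1, population=POPULATION)  # 0.018

# Helping twice as many people doubles my chance of being the one helped,
# so X beats Y unless Y is more than twice as valuable as X.
print(option_x > option_y)  # True
```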
The future
I consider helping people roughly equally good whether they live now or in the future. I don’t like the version of the golden rule that says “Do unto others [who live nearby] as you would have them do unto you,” and I don’t like the version that says “Do unto others [who are your contemporaries] as you would have them do unto you” either. And I definitely don’t like the version that says “Do unto others [who might do back] as you would have them do unto you.”
Though it may be question-begging, the reason I don’t like these principles is that they violate the golden rule: I want people throughout history to try to help me live a good life. When I think about people helping me I’m not going to say “this food is tasty but I really wish that it had been secured by the good will of my neighbor rather than someone living 50 years ago and 10,000 miles away.” There is a further thing I find valuable in the goodwill of neighbors and the vitality of a community, but that’s just another good to be weighed in the calculus. And given the kinds of tradeoffs that are actually available, I tend to think that goods like health and prosperity are just a lot more important than the advantages of being able to experience gratitude in person rather than owing it to someone who lived long ago or far away; our ability to help those who live far away or in the future appears to be so much greater than our ability to help those around us (at least for readers living in the rich world of the 21st century).
Creating people
As a consequence of [3], when I think about changes that would bring new people into the world with good lives, I tend to think that those changes are valuable. If I imagine someone deciding whether they should take an action that would bring me into existence, I very much want them to take it. In combination with [1], when I think about the comparison between 20 billion people existing under conditions X or 10 billion people existing under conditions Y, I try to think about whether I would prefer to exist under conditions X with probability 2% or under conditions Y with probability 1%. (Carl Shulman has written a post that takes this idea to one logical extreme.)
The kinds of tradeoffs that I find myself considering in my own do-gooding seem to be modest losses in quality of life for existing people in exchange for a small improvement in the probability of many billions of billions of people existing and having rich and valuable lives. Scaled up, I think the tradeoffs look something like making existing people 10% poorer in exchange for increasing by 0.1% the probability of a prosperous and very (very) large future. On the kind of calculus outlined above, I am quite happy with this tradeoff.
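A back-of-the-envelope version of that scaled-up tradeoff, where only the 10% and 0.1% figures come from the paragraph above; the population sizes and welfare units are placeholder assumptions of my own:

```python
# Rough sketch of the tradeoff described above; magnitudes are placeholders.

existing_people = 1e10            # order of magnitude of the current population
welfare_loss_per_person = 0.10    # "10% poorer", treated as 0.1 welfare units each
future_people = 1e18              # stand-in for a "very (very) large" future population
probability_gain = 0.001          # 0.1% higher chance that such a future is realized

cost = existing_people * welfare_loss_per_person      # 1e9 welfare units forgone
expected_benefit = probability_gain * future_people   # 1e15 expected future lives

print(expected_benefit / cost)  # ~1e6: on this accounting the expected benefit dominates
```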
While this view is still related to my very quantitative outlook, I think disagreement about this view is much more likely to be due to philosophical disagreements.
One common objection to this view is “if no one exists, no one will be sad about their non-existence.” I consider this objection very weak. If someone does exist, someone will be happy about their existence. So even if non-existence is not bad, existence is good.
A problem I consider more serious is that this view seems to be very sensitive to the outlook of possible people regarding existence. I would be happy to make my life significantly worse in exchange for a higher probability of existing, but many people feel differently. Moreover, this seems to reflect a difference in outlook rather than a difference in quality of life. This seems particularly problematic given that for nearly any outlook, we can imagine possible people with that outlook. Should we be happy to create any possible people who are enthusiastic about existing, and unhappy to create any possible people who aren’t? What if the actual character of their experiences were almost the same, and only their attitudes about non-existence differed?
For example, regardless of the actual content of individuals’ experiences, we might expect natural selection to produce people who would prefer to exist. So if we think there is any kind of experience that would be a net negative, but which could have been produced by natural selection, then we might run into a conflict.
I don’t want to get into a long discussion here, but my own best guess is that if someone would prefer existing to not existing (knowing all of the facts, reflecting appropriately, and so on), then all else equal I would prefer that person to exist. These intuitions become weaker as we move further away from the kinds of minds I am familiar with, and aren’t that strong to begin with, but I’ll postpone a longer discussion for some future day.
Utilitarianism
Incidentally, though I am a committed consequentialist I am not a hedonistic utilitarian. This perspective sometimes puts me at odds with a few acquaintances. I haven’t listed this view above because I think that in the world at large, hedonistic utilitarianism is extremely unusual; however, it is another moral issue where I feel like my view is driven by the golden rule. I have preferences for things other than my own experiences of pleasure and pain, and even if I did not I can recognize that if I had other preferences, I would prefer that others respect those preferences.
So what version of utilitarianism do you subscribe to?
Nothing concise.
When you consider a mind’s preference to exist rather than not exist you are assigning moral weight to the preferences of minds that do not exist and may never exist. This seems like a dangerous policy as mindspace consists almost exclusively of things that will never exist and have preferences entirely orthogonal to human preferences. It seems like a poor idea to dedicate oneself to serving the interests of such theoretical entities.
I agree there are many problems lurking here. Despite what I’ve said in this post, I don’t have much respect for preferences *per se*. I also don’t think that I should respect someone’s preferences more because they exist, since that both feels capricious and it’s not what I want for myself (I’m just as happy getting my way in worlds where I exist as worlds where I don’t exist).
The thing I was pointing to here was using a reasoner’s preferences about its own mental states as one input into understanding which of those mental states are valuable and which are not valuable. I still endorse maximizing goodness, and have an aggregative view of goodness (i.e. maximizing the total goodness of all of the stuff according to some scheme for aggregating; see my response to Jeremy). And I definitely (1) don’t think preference satisfaction is in itself good, (2) don’t think that we should take some kind of majority vote amongst preferences to determine what is good. Instead I accept that I have some potentially idiosyncratic view of what is good, though one shaped by intuitions about impartiality.
The comment at the very end of the post gestures towards an even bigger expanse of confusion; I would expect my views to shift in these areas. The prior sections in the post feel more settled internally, though of course they could also shift.
Part of the issue is the distinction between generalizing at the meta level (“Of course the thing I would prefer that other people do unto me is the thing that I prefer”), the slightly less meta level (“The thing I prefer that other people do unto me is to help me enjoy a rich and happy life”), and a range of more subtle and more plausible generalizations. I don’t have strong views on these issues, and I definitely don’t endorse a precise enough conception of the golden rule that I could use it as an ethical framework on its own; it’s just an intuition that shapes some of my views.
OK. I’m a little confused now. If you are not optimizing preferences, how do you judge how much good you are doing for something that does not exist? It has nothing to say about the world; it has only theoretical preferences.
Also, I realize that versions of utilitarianism tend to have problems with being overly demanding, so maybe this isn’t so much of a potential objection, but… how many children do you intend to have? I mean, if you really think that there is a moral obligation to produce more happy people.
Finally, you are not actually as happy getting your way in worlds where you do not exist. In a world where you do not exist, you are not happy at all, because you do not exist.
I don’t normally think about what is good for things that don’t exist. When I bring someone into being, I’m happy to have brought something good into being, and I use their judgment to inform my view of how good it is for them to exist.
I don’t think that the direct effect of me having kids (via the existence of some extra people) is very meaningful compared to the indirect effects (via changing the overall fibre of society, and thereby the prospects for our future). These indirect effects seem good but not comparable to other things I could do.
The basic issue is that having kids now doesn’t seem to do much to increase the total number of people who get to live, which is mostly driven by the availability of astronomical resources in the future.
But even if I just wanted to boost aggregate welfare over the coming decades, me having kids doesn’t seem super cost-effective compared to other approaches to making people’s lives better (or boosting population). I’m happy to grant that under appropriate conditions having children would be a virtuous thing to do, by giving those children a chance to live.
How can you say that you don’t care about the good of things that do not exist? Your entire system is predicated on the idea that there is an obligation to assist entities that do not currently exist (and quite possibly will never exist) by acting in a way that will cause them to come into existence.
You say “I would be happy to make my life significantly worse in exchange for a higher probability of existing”. This seems to clearly indicate an appeal to the preferences of a potentially non-existent platonic you-as-a-computation.
I am a bit confused about how to make mathematical sense of your preferences. Suppose there are an infinite number of people distributed across space-time. I run into many classic problems with infinity if I try to spread my utility measure equally between them.
The most mathematically (if far from the most ethically) natural way to solve this problem is by caring less about people far from myself. This includes far in space, far in time, and (as related to Daniel Kane’s comment) far in mindspace.
Jeremy, I wasn’t actually proposing that we discount based on distance in mindspace, merely that we discount based on non-existence. Also, the infinities aren’t really a problem unless you assume not only that there are infinitely many minds, but that there are infinitely many minds that your actions will affect in predictable ways, which is a much stronger assumption. On the other hand, I wouldn’t like the theoretical underpinnings of my moral theory to have to depend on an assumption that is not necessarily true.
Yes I understood your original objection and proposed solution. I meant only that accounting for distance in mindspace addresses the same objection with a different solution. It is presumably reasonable to follow Paul’s original intent of valuing non-existent “people” as long as those “people” value similar things.
I am not convinced that there are no problems with infinities. I am wholly unconvinced that there will ever be a finite number of people I can honestly split off as “predictably affected.” Surely I will have some probabilistic beliefs about the great mass of people that I am not 100% sure I will affect.
Let me try to understand your “discounting based on non-existence” proposal concretely. Say I have a finite set of actions in front of me, and each of those actions leads different beings to either exist or not exist in the future. Is the idea simply that I should not consider the preferences of any of those possibly existent beings when I rank my actions? What if there is a being that exists in all futures but does not exist yet? Is it legal to consider its preferences?
I do not claim to have a consistent working theory that handles these cases. I do know that considering beings that may not exist to have equal moral weight as those that do leads to conclusions that run very contrary to my moral intuitions. On the other hand, I have not yet figured out how to formalize these intuitions in ways that do not lead to other pathologies (for example behaving in an inconsistent manner over time as your moral goals change every time a new mind is brought into existence).
In this post I tried to focus on distinctions between my view and ordinary views, but I do have a more complicated and more coherent view behind the scenes. It’s still just a current best guess, one that plays the same role as “sum up over all people” and is equally likely to be overturned on further reflection.
Basically, in the same way that I think that (something like) Solomonoff induction describes my idealized anticipations about what will happen to me, I use (something like) a universal measure to distribute moral weight across valuable stuff in the universe (e.g. observer-moments).
That is, I consider all of the possible ways of picking out things of possible value (e.g. observer-moments), perhaps represented by computer programs in some language or else in some other equally flexible but hopefully more natural way. Then I mix them all together, using something like simplicity (though anything sensible would do just as well at matching my intuitions). This ensures that I’m only spreading around 1 unit of caring.
On this view, there isn’t an ontologically fundamental universe that we occupy; instead the “universe” is just a correlated part of the explanations for many different things of moral value. That is, you can point to my experience by saying “Consider the solution to these laws of physics, and look [here]” and you can point to your experience by saying “Consider the solution to these laws of physics, and look [there],” and both of our experiences acquire moral significance by virtue of these descriptions. Across different descriptions we roughly weight by 2^{-complexity} (and a person’s moral weight, like their anticipations under SI, will be dominated by the simplest explanation that points to them). For people around today, [here] and [there] are roughly equally complicated. So I end up caring about our experiences equally for the same reason that a priori I would think that I have an equal probability of being either one of us.
In very big universes, there is some divergence. For example, the Boltzmann brains get almost no moral significance under this view; specifying a Boltzmann brain by saying “Look 238723948712049812398734.23987234912098 years in the future” (and specifying the coordinates, etc.) has just as much complexity as directly specifying their state. And this is again precisely analogous to the observation that an agent using Solomonoff induction doesn’t expect themselves to be a Boltzmann brain. Similarly, interpreting a waterfall as an agent doesn’t give it any moral significance, because the interpretation is so complicated.
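Here is a toy sketch of that weighting scheme. Description length in bits stands in for the program length a real universal-prior construction would use, and the particular descriptions and bit counts are invented for illustration:

```python
# Toy version of "weight each description by 2^{-complexity}, then normalize
# so that only one unit of caring is spread around". Bit counts are invented.

descriptions = {
    "physics + point at person [here]":    40,
    "physics + point at person [there]":   41,
    "physics + locate a Boltzmann brain": 140,  # the pointer itself is enormous
}

raw = {name: 2.0 ** -bits for name, bits in descriptions.items()}
total = sum(raw.values())
moral_weight = {name: w / total for name, w in raw.items()}

# The two ordinary people get nearly equal weight; the Boltzmann brain's
# share is astronomically small, mirroring the anthropic judgment that an
# agent using Solomonoff induction doesn't expect to be a Boltzmann brain.
for name, weight in moral_weight.items():
    print(f"{name}: {weight:.3e}")
```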
I think there are a lot of problematic cases this view avoids, though it has some counterintuitive consequences of its own. I think the main justification goes through the golden rule: there are some arguments that convince me that I should use something like SI to define anticipations, and I think that if I use SI for anticipations then that suggests that it is the right thing to use for moral aggregation.
I’m confused about how this is supposed to work. You seem to be assigning value to the outputs of simple computer programs. How is this even remotely supposed to add up to anything remotely resembling common moral intuition?
The objects of moral value are the same, the question is how you aggregate across them. In conditions where all of the people involved relate to the universe in roughly the same way, this just reduces to counting measure. In other cases, it reproduces intuitions like “I’m probably not a Boltzmann brain / a waterfall brain / a simulation / etc.” in the form of “I don’t care that much about the Boltzmann brains / waterfalls / simulations in aggregate.”
Not caring (much) about a contorted interpretation of a waterfall as being a mind makes sense, but I don’t understand why we wouldn’t care about Boltzmann brains or simulations, at least for the few steps during which they exist before disappearing.
Paul, specifying a person is notably more difficult than you claim, given that physics is not classical. You would also need to specify which part of the wave function you are considering, or, more intuitively, specify the outcomes of an extraordinary number of observations.
Furthermore, if you use a complexity measure like this, I suspect that all universes will quite possibly have roughly the same total mass of people, thus making the number of people irrelevant in the final accounting.
[This would be the case if the most efficient way to specify a general person was to specify a Turing machine that enumerates all people in the universe in some order and then specifies which numbered person is being considered]
I understand that identifying observers is non-trivial in a messy universe. I think the judgments are still sensible to roughly the same extent as Solomonoff induction, which I’m pretty OK with. The messiness of the universe also tends to make me happier about this view, as compared to alternatives which take “which people exist” as primitive.
Clearly universes won’t have the same total mass of people, since there are lots of ways to point to things in universes, and there is variety in what fraction of them point to people. The resulting account looks kind of like a mix of total (normalized by size of universe) and average views, along with a bunch of other theories. We can talk about what the mix ends up being (I think I expect a much higher share for the total view than you do, in part from having considered this idiosyncratic proposal at more length). One way of looking at this is that Carl’s recent point (http://reflectivedisequilibrium.blogspot.com/2014/08/population-ethics-and-inaccessible.html) applies at many scales: across history and across different locations in a large universe, as he points out, but also across larger ensembles than we normally think of as physically existing. Though if you care about aggregate welfare for this reason then you might end up with a very high critical level (related to the average quality of life of inaccessible populations), which is a view that I consider plausible though not particularly intuitive.
Unsurprisingly, this ends up being roughly the same question as what you think using Solomonoff induction implies for anthropic reasoning. If you are unhappy with the anthropic reasoning implicitly used by Solomonoff induction, then you may want to adjust Solomonoff induction as a principle for defining your experiences, and likewise you might want to adjust your moral theory in a similar way.
(I concede that these issues loom larger for the universal prior as an allocator of moral value than as a predictive theory. But I do think that to the extent that we dislike Solomonoff induction as an allocator of moral value we should also reject it as a prior over experiences; it’s just that in the latter case it may be a more acceptable approximation.)
“Clearly universes won’t have the same total mass of people”. Would you mind providing further justification for this? As I said, if it turns out that the most efficient way to specify a person is to specify the universe (simply) and specify what it means to be a person (also relatively simply) and then specify which numbered person you are referring to, the total mass of people will depend only on the complexity of the universe (which comes out as your Kolmogorov prior) times a constant depending on the complexity of your description of what constitutes a person. You seem to have built a moral theory based on the premise that this must be false.
It’s only clear that they don’t have exactly the same mass; I agree that it’s quite plausible that they have almost exactly the same mass (and it’s also very plausible that the answer is sensitive to details of the specification. In particular, do you have to include an experience-decoder in this program anyway? If so, maybe it is cheaper to write a program that scans for experiences). This is just because there are many programs in the mixture, and I’m intending to add up their contributions rather than taking the max.
If I learned that this account put most of its weight on averaging, it would make me simultaneously somewhat more sympathetic to average-like views and much less sympathetic to this account. Though also note the other issues raised in the previous post. At a minimum, this is another source of large inaccessible populations. So the biggest practical difference would be the introduction of a (potentially surprisingly high) critical level.
Well unless we are the near-unique intelligences in a vast universe…
In any case, why are you so convinced that this is not the case? Major parts of your theory seem predicated on it being wrong.
@Paul: “But I do think that to the extent that we dislike Solomonoff induction as an allocator of moral value we should also reject it as a prior over experiences”
Could you explain why?
@Daniel: I don’t understand the argument for a constant mass of people. Two universes could have the same K-complexity, but one universe contains tons of people and the other contains none (except Boltzmann brains, etc.). Maybe you mean that the total *value* of all people is constant because, e.g., if there are twice as many people, you need an extra bit to specify any given one, so each one’s moral value is halved?
Yeah. By “mass” I meant value-mass not actual mass.
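A small arithmetic sketch of that value-mass scenario, under the assumption that singling out one person among N costs log2(N) index bits on top of a shared description of the universe (the 30-bit base cost is an arbitrary placeholder):

```python
# Why doubling the population can leave total "value-mass" unchanged,
# assuming each person costs log2(N) index bits to single out.
from math import log2

BASE_BITS = 30  # assumed cost of specifying the universe plus "what a person is"

def per_person_weight(num_people):
    return 2.0 ** -(BASE_BITS + log2(num_people))

for n in (1_000, 2_000):
    w = per_person_weight(n)
    print(n, w, n * w)  # n * w stays constant: twice the people, half the weight each
```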
You may have persuaded me of two points. One is that the level of preferences accounts better for entities that do not exist than the level of happiness does, and that the fact that I Value Identity and Complexity as well as Happiness doesn’t imply I should call myself a VICHutilitarian. I value the preferences of counterfactual Diegos with different preferences than mine, so I’m just the same old preference utilitarian, apparently.
The second point depends on your response to Daniel Kane’s question. There’s an infinitude of mental architectures, structures and experiences which will never exist. The total quantity of mindspace structure that will not be accessed is infinite. No matter how many preferences you satisfy by bringing people into existence, the preferences of those who didn’t get to exist will not change by subtracting a finite number from an infinite one. Even if mindspace is finite, infinitely many tokens of each mind type will not have their preferences satisfied.
How do you deal with that?
If non-existence isn’t bad, then, using your intuition pump, the choice given to these minds is “a 0% probability of existing versus a 0% probability of existing”, and there is no reason to go one way or the other.
Thus it seems to me that only once you’ve made the sublime cut down to the infinitesimal sliver of existent entities do the judgements you make about your counterfactuals start making sense (you have finitely many relevant counterfactual Pauls to ask about 1% or 2% probabilities of existing).
You could say that minds are not points in mindspace, and that only types, not tokens, matter. Minds, you’d add, are instead ranges, and then the probability of existing would not be 0 for any range. For any range/mind, though, there might be a subset or superset of that range which is still a mind with slightly different preferences. Which range gets to make the probability choices? It becomes a matter of reference-class politics. If each point gets a vote on which mind it wants to “belong” to, different alliances would have different probabilities of coinciding with something that in fact comes into existence. And we are back to playing reference class tennis.
I’m happy to say there are only countably many distinguishable and morally relevant entities, and I’m also fine enforcing an independence assumption and scaling down to infinitesimal probabilities or taking limits or whatever. I’m also only using this instrument as an intuition pump, and am not very troubled by a technical obstruction unless it represents a conceptually important obstruction as well.
Carl makes this point as well in the linked blog post. I find that point somewhat persuasive, but per my response to Daniel I view an agent’s preferences mostly as info about what facts about that agent would be good to be true rather than about the goodness of states not concerning that agent.
“One common objection to this view is ‘if no one exists, no one will be sad about their non-existence.’ I consider this objection very weak.”
I suspect a nontrivial fraction of the world would agree with the statement you find weak. I personally agree with it. Almost everyone outside EA circles has the intuition that it’s not regrettable when someone fails to create a happy person.
I think the claim is fine (indeed no one will be sad, after we are all extinct, about our extinction), but it seems like a non sequitur.
If no one is sad for X not to happen, but someone is happy for X to happen, I think that means X is good. Certainly in conventional cases (e.g. I am considering providing someone with a pleasant surprise) I think this is widely accepted. And if I am choosing for myself between having an awesome year and then being erased from history + disappearing, or just being erased from history + disappearing, it’s obvious which one I prefer even though I won’t regret the choice either way.
We may debate whether not doing X is blameworthy/bad/whatever; I don’t much care, but I do think that doing X is praiseworthy/good/whatever, and certainly I’m inclined to do X in such cases.
If you wanted to make this objection feel more substantive to me, you would need to do more to distinguish it from superficially similar arguments in simple cases.
Fair enough. 🙂 I agree that your reply accounts for how most people feel about the situation.
I personally feel that creating good things is vastly less important than preventing bad things, but this view is less widely believed by the world in general.
I can see where you are coming from on this view, though I don’t share it, and hopefully overall the world will reach a compromise where we are very careful to avoid really bad things happening. But it does seem substantively different from the “no one will regret it” view.
How do you avoid the repugnant conclusion?
I feel comfortable biting that bullet in many cases. That is, I grant that there are conditions under which I would exchange the existence of a large number of people living excellent lives for a sufficiently great number of lower animals living lives with small joys and few harms.
That said, I may be using “exchange” in an unusually broad sense, and my answer is (weakly) sensitive to parts of the situation that are typically elided in the repugnant conclusion thought experiment. See also my response to Jeremy’s question.
It is worth remembering that the repugnant conclusion is only a theoretical objection.
There is no practical reason why an extremely resource-efficient population couldn’t have extremely high average subjective well-being (SWB), e.g. through various forms of hedonic enhancement or wireheading.
If you are willing to use actually efficient SWB technology, the repugnant conclusion is luckily avoided.
Is your decided non-utilitarianism a recent development? I remember you describing yourself as leaning towards utilitarianism, though maybe you didn’t mean specifically the hedonistic kind.
I mostly take the utilitarian line on most questions, and would usually describe myself either as utilitarian or mostly utilitarian depending on how much nuance the context called for.