by paulfchristiano

Some disasters (catastrophic climate change, high-energy physics surprises) are so serious that even a small probability (say 1%) of such a disaster would have significant policy implications. Unfortunately, making predictions about such unlikely events is extremely unreliable. This makes it difficult to formally justify assigning such disasters probabilities low enough to be compatible with an intuitive policy response. So we must either reconsider our formal analyses or reconsider our intuitive responses.

Intuitively, even if we don’t have an explicit model for a system, we can reason about it inductively, relying on generalizations from historical data. Indeed, this is necessary for virtually all everyday reasoning. But we need to be much more careful about inductive reasoning if we want to use it to obtain 99%+ confidence. In practice such reasoning tends to hinge on questions like “How much should we trust the historical trend X, given that we today face unprecedented condition Y?”

For example, we might wonder: “should we confidently expect historically amiable conditions on Earth to continue in the face of historically unprecedented environmental disruptions?” Human activity is clearly unprecedented in a certain sense, but so is every event. The climate of Earth has changed many times, and has probably never undergone the kind of catastrophic climate shift that could destroy human civilization. Should we infer that one more change, at the hands of humans, is similarly unlikely to cause catastrophe? Are human effects so different that we can’t reason inductively, and must resort to our formal models?

Even when we aren’t aiming for confident judgments, it can be tricky to apply induction to very unfamiliar domains. Arguments hinging on inductive reasoning often end up mired in disputes about what “reference class” is appropriate in a given setting. Should we put a future war in the (huge) reference class “wars” and confidently predict a low probability of extinction? Should we put a future war in a very small reference class of “wars between nuclear powers” and rely on analytic reasoning to understand whether extinction is likely or not? Both of these approaches seem problematic: clearly a war today is much more dangerous than a war in 1500, but at the same time historical wars do provide important evidence for reasoning about future wars. Which properties of future war should we expect to be contiguous with historical experience? It is easy to talk at length about this question, but it’s not clear what constitutes a compelling argument or what evidence is relevant.

In this post I want to take a stab at this problem. Looking over this post in retrospect, I fear I haven’t introduced anything new and the reader will come away disappointed. Nevertheless, I very often see confused discussions of induction, not only in popular discourse (which seems inevitable) but also amongst relatively clever altruists. So: if you think that choosing reference classes is straightforward and uncontroversial then please skip this post. Otherwise, it might be worth noting–before hindsight kicks in–that there is a confusion to be resolved. I’m also going to end up covering some basic facts about statistical reasoning; sorry about that. 

(If you want you can skip to the ‘Reasoning’ section below to see an example of what I’m justifying, before I justify it.)

Applying Occam’s Razor

I’ll start from the principle that simple generalizations are more likely to be correct—whether they are causal explanations (positing the physics governing atoms to explain our observation of atomic spectra) or logical generalizations (positing the generalization that game of life configurations typically break down into still life, oscillators, and gliders). This is true even though each instance of a logical generalization could, given infinite resources, be predicted in advance from first principles. Such a generalization can nevertheless be useful to an agent without infinite resources, and so logical generalizations should be proposed, considered, and accepted by such agents. I’ll call generalizations of both types hypotheses.

So given some data to be explained, I suggest (non-controversially) that we survey the space of possible hypotheses, and select a simple set that explains our observations (assigns them high probability) within the limits of our reasoning power. By “within the limits of our reasoning power” I mean that we treat something as uncertain whenever we can’t figure it out, even if we could predict it in principle. I also suggest that we accept all of the logical implications of hypotheses we accept, and rule out those hypotheses which are internally incoherent or inconsistent with our observations.

We face a tradeoff between the complexity of hypotheses and their explanatory power. This is a standard problem, which is resolved by Bayesian reasoning or any other contemporary statistical paradigm. The obvious approach is to choose a prior probability for each hypothesis, and then to accept a hypothesis which maximizes the product of its prior probability and its likelihood (the probability it assigns to our observations). A natural prior is to give a hypothesis of complexity K a prior probability of exp(-K). This basically corresponds to the probability that a monkey would type that hypothesis by chance alone in some particular reference language. This prior probability depends on the language used to define complexity (the language in which the monkey types).
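As a rough illustration of this selection rule, here is a minimal Python sketch. The hypothesis names, complexities, and likelihoods are made-up numbers chosen only for illustration, and complexity is assumed to be measured in nats so that the prior is exactly exp(-K).

```python
import math

def log_score(complexity, log_likelihood):
    """Log of (prior * likelihood), where the prior is exp(-complexity)."""
    return -complexity + log_likelihood

def best_hypothesis(hypotheses):
    """Return the name of the hypothesis that maximizes prior * likelihood."""
    return max(hypotheses, key=lambda name: log_score(*hypotheses[name]))

# Hypothetical entries: name -> (complexity K in nats, log-likelihood of the data)
hypotheses = {
    "simple_but_loose": (5.0, math.log(0.01)),
    "complex_but_tight": (12.0, math.log(0.9)),
}

winner = best_hypothesis(hypotheses)
```

With these particular numbers the simpler hypothesis wins even though it fits the data less tightly, because the extra 7 nats of complexity cost more than the improvement in likelihood buys back. Working in log space avoids underflow when complexities are large.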

So given some data, to determine the relative probability of two competing hypotheses, we start from the ratio of their prior probabilities, and then multiply by the ratio of their likelihoods. If we restrict to hypotheses which make predictions “within our means”—if we treat the result of a computation as uncertain when we can’t actually compute it—then this calculation is tractable for any particular pair of hypotheses.
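Written out, the comparison looks like this. The symbols are my own labels rather than notation from elsewhere in the post: H_1 and H_2 are the two hypotheses, K_1 and K_2 their complexities, and D the data.

```latex
\[
\frac{P(H_1 \mid D)}{P(H_2 \mid D)}
  = \frac{P(H_1)}{P(H_2)} \cdot \frac{P(D \mid H_1)}{P(D \mid H_2)}
  = e^{-(K_1 - K_2)} \cdot \frac{P(D \mid H_1)}{P(D \mid H_2)}
\]
```

So each extra unit of complexity has to be paid for by a factor of e in likelihood.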


The above section described how the probability of two proposed hypotheses might be compared. That leaves only the problem of identifying the most likely hypotheses. Here I would like to sidestep that problem by talking about frameworks for arguing, rather than frameworks for reasoning.

In fact the world is already full of humans, whose creativity much exceeds any mechanical procedure I could specify (for now). What I am interested in are techniques for using those reasoning powers as an “untrusted” black box to determine what is true. I can trust my brain to come up with some good hypotheses, but I don’t trust my brain to make sensible judgments about which of those hypotheses are actually true. When we consider groups rather than individuals this situation becomes more extreme: I often trust someone to come up with a good hypothesis, but I don’t have any plausible approach for combining everyone’s opinion to come up with a reasonable judgment about what is actually true. So, while it’s not as awesome as a mechanical method for determining what is true, I’m happy to settle for a mechanical method for determining who is right in an argument (or at least righter).

Language dependence

The procedure I specified above is nearly mechanical, but is also language-dependent—it will give different answers when applied with different languages, in which different hypotheses look natural. It is plausible that, for example, the climate skeptic and the environmentalist disagree in part because they think about the world in terms of a different set of concepts. A hypothesis that is natural to one of them might be quite foreign to the other, and might be assigned a correspondingly lower prior probability.

Casually, humans articulate hypotheses in a language that contains simple (and relatively uncontroversial) logical/mathematical structure together with a very rich set of concepts. Those concepts come from a combination of biologically enshrined intuitions, individual experiences and learning, and cultural accumulation. People seem to mostly agree about logic and mathematics, and about the abstract reasoning that their concepts “live in.”

One (unworkable) approach to eliminating this language dependence is to work with that simple abstract language without any uniquely human concepts. We can then build up more complicated concepts as part of the hypotheses that depend on those concepts. Given enough data about the world, the extra complexity necessary to accommodate such concepts is much more than counterbalanced by their predictive power. For example, even if we start out with a language that doesn’t contain “want,” the notion of preferences pulls a lot of predictive weight compared to how complicated it is.

The reason this approach is unworkable is that the network of concepts we use is too complicated for us to make explicit or manipulate formally, and the data we are drawing on (and the logical relationships amongst those data) are too complicated to exhaustively characterize. If a serious argument begins by trying to establish that “human” is a natural category to use when talking about the world, it is probably doomed. In light of this, I recommend a more ad hoc approach.

When two people disagree about the relative complexity of two hypotheses, it must be because one of those hypotheses is simpler in one of their languages than in the other. In light of the above characterization, this disagreement can be attributed to the appearance of at least one concept which one of them thinks is simple but the other thinks is complex. In such cases, it seems appropriate to pass to the meta level and engage in discussion about the complexity of that concept.

In this meta-level argument, the idealized framework—in which we resort to a language of simple connectives and logical operations, uninformed by human experience, and accept a concept when the explanatory power of that concept exceeds its complexity—can serve as a guideline. We can discuss what this idealized process would recommend and accept that recommendation, even though the idealized process is too complicated to actually carry out.

(Of course, even passing to a simple abstract language does not totally eliminate the problem of language dependence. However it does minimize it, since the divergence between the prior probabilities assigned using two different languages is bounded by the complexity of the procedure for translating between them. For simple languages, in an appropriate sense, the associated translation procedures are not complicated. Therefore the divergence in the associated prior probability judgments is small. We only obtain large divergences when we pass to informal human language, which has accumulated an impressive stock of complexity. There is also the empirical observation that people mostly accept common abstract languages. It is possible that someone could end an argument by declaring “yes, if we accept your formalization of logic then your conclusion follows, but I don’t.” But I’ve yet to see that failure mode between two serious thinkers.)
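The bound in the parenthetical above can be stated roughly as follows. The notation is an assumption of mine: K_A(h) and K_B(h) are the complexities of hypothesis h in languages A and B, P_A and P_B are the corresponding exp(-K) priors, and c_{A→B} is the cost of expressing language A inside language B (with c_{B→A} the reverse).

```latex
\[
K_B(h) \le K_A(h) + c_{A \to B}
\quad\text{and}\quad
K_A(h) \le K_B(h) + c_{B \to A}
\]
\[
\Longrightarrow\quad
e^{-c_{A \to B}} \;\le\; \frac{P_B(h)}{P_A(h)} \;\le\; e^{\,c_{B \to A}}
\]
```

When both translation costs are small, the two priors can disagree about any single hypothesis by at most a modest constant factor, which a moderate amount of evidence can overcome.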


Reasoning

I’ve described a framework for using induction in arguments; now I’d like to look at a few (very similar) examples to try and illustrate the kind of reasoning that is entailed by this framework.

Will project X succeed?

Suppose that X is an ambitious project whose success would cause some historically unprecedented event E (e.g. the resolution of some new technical problem, perhaps human-level machine intelligence). The skeptic wants to argue “no one has done this before; why are you so special?” What does that argument look like, in the framework I’ve proposed?

The skeptic cites the observation that E has not happened historically, and proposes the hypothesis “E will never happen,” which explains the otherwise surprising data and is fairly simple (it gets all of its complexity from the detailed description of exactly what never happens—if E has a natural explanation then it will not be complicated).

The optimist then has a few possible responses:

  1. The optimist can “explain away” this supporting evidence by providing a more probable explanation for the observation that E hasn’t yet happened. This explanation is unlikely to be simpler than the flat-out denial “E will never happen,” but it might nevertheless be more probable if it is supported by its own evidence. For example, the optimist might suggest “no one in the past has wanted to do E,” together with “E is unlikely to happen unless someone tries to make it happen.” Or the optimist might argue “E was technologically impossible for almost all of history.”
  2. The optimist can provide a sufficiently strong argument for their own success that they overwhelm the prior probability gap between “E will never happen” and “E will first happen in 2013” (or one of the other hypotheses the optimist suggested in [1]).
  3. The optimist can argue that E is very likely to happen, so that “E will never happen” is very improbable. This will push the skeptic to propose a different hypothesis like “E is very unlikely each year” or “E won’t happen for a while.” If the optimist can undermine these hypotheses then the ball is in the skeptic’s court again. (But note that the skeptic can’t say “you haven’t given any reason why E is unlikely.”)
  4. The optimist can argue that “E will never happen” is actually a fairly complex hypothesis, because E itself is a complex event (or its apparent simplicity is illusory). The skeptic would then reply by either defending the simplicity of E or offering an alternative generalization, for example by showing that E is a special case of a simpler event E’ which has also never occurred.
  5. Note: the optimist cannot simply say “Project X has novel characteristic C, and characteristic C seems like it should be useful;” this does not itself weaken the inductive argument, at least not if we accept the framework given in this post. The optimist would have to fit this argument into one of the above frameworks, for example by arguing that “E won’t happen unless there is a project with characteristic C” as an alternative explanation for the historical record of non-E.
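To make response [1] a bit more concrete, here is a toy Python sketch. It compares “E will never happen” with a rival hypothesis on which E occurs independently with some fixed probability in each year in which it is feasible. The per-year model, the function name, and all parameters are assumptions made for illustration; the complexities are again in nats.

```python
import math

def log_odds_never_vs_rate(years_observed, p_per_year, k_never, k_rate,
                           feasible_years=None):
    """Log posterior odds of 'E will never happen' against the rival
    'E happens with probability p_per_year in each feasible year',
    given a record of years_observed years in which E did not occur."""
    if feasible_years is None:
        # Without an explaining-away story, every observed year counts.
        feasible_years = years_observed
    log_prior_ratio = -(k_never - k_rate)
    # 'Never' assigns probability 1 to the record of no E; the rival
    # assigns (1 - p_per_year) ** feasible_years.
    log_likelihood_ratio = -feasible_years * math.log1p(-p_per_year)
    return log_prior_ratio + log_likelihood_ratio

# Equal complexities, 1000 years without E, 1% chance per year under the rival.
naive = log_odds_never_vs_rate(1000, 0.01, k_never=10.0, k_rate=10.0)

# Same record, but E was only feasible in the last 10 years.
explained = log_odds_never_vs_rate(1000, 0.01, k_never=10.0, k_rate=10.0,
                                   feasible_years=10)
```

In this toy setup the long record strongly favors “never” when every year counts, but gives it almost no advantage once most of the record is explained by infeasibility. In a real argument the explaining-away hypothesis would usually carry some extra complexity, which would enter through `k_rate`.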

Of course, even if the optimist successfully disarms the inductive argument against project X’s success, there will still be many object level considerations to wade through.

Will the development of technology X lead to unprecedented catastrophe Z?

Suppose that I am concerned about the development of technology X because of the apparent risk of catastrophe Z, which would cause unprecedented damage. In light of that concern I suggest that technology X be developed cautiously. A skeptic might say “society has survived for many years without catastrophe Z. Why should it happen now?” This argument is structurally very similar to the argument above, but I want to go through another example to make it more clear.

The skeptic can point to the fact that Z hasn’t happened so far, and argue that the generalization “Z is unlikely to happen in any particular year” explains these observations, shifting the burden of proof to the doomsayer. The doomsayer may retort “the advent of technology X is the first time that catastrophe Z has been technologically feasible” (as above), thereby attempting to explain away the skeptic’s evidence. This fits into category [1] above. Now the argument can go in a few directions:

  1. Suppose it is clear ex ante that no previous technology could have caused catastrophe Z, but only because we looked exhaustively at each previous technology and saw that it couldn’t have caused catastrophe Z. Then the generalization “Z is unlikely” still makes predictions, namely about the properties that technologies have had. So the doomsayer is not in the clear yet, but may be able to suggest some more likely explanations, e.g. “no previous technologies have created high energy densities” + “without high energy densities catastrophe Z is impossible.” This explains all of the observations equally well, and it may be that “no previous technologies have created high energy densities” is more likely a priori (because it follows from other facts about historical technologies which are necessary to explain other observations).
  2. If many possible technologies haven’t actually been developed (though they have been imagined), then “Z is unlikely” is also making predictions. Namely, it is predicting that some imagined technologies haven’t been developed. The doomsayer must therefore explain not only why past technologies have not caused catastrophe Z, but why past imagined technologies that could have caused catastrophe Z did not come to pass. How hard this is depends on how many other technologies looked like they might have been developed but did not. (If no other Z-causing technologies have ever looked like they might be developed soon, then that observation also requires explanation. Once we’ve explained that, the hypothesis “Z is unlikely” is not doing extra predictive work.)
  3. In response to [1] or [2], the skeptic can reject the doomsayer’s argument that technology X might cause catastrophe Z, but historical technologies couldn’t have. In fact the skeptic doesn’t have to completely discredit those arguments, he just needs to show that they are sufficiently uncertain that “Z is unlikely” is making a useful prediction (namely that those uncertain arguments actually worked every time, and Z never happened).
  4. In response to the skeptic’s objections, the doomsayer could also give up on arguing that there have been no historical points where catastrophe Z might have ensued, and instead argue that there are only a few historical points where Z might have happened, and thus only a few assumptions necessary to explain how Z never happened historically.
  5. Note: the doomsayer cannot simply say “Technology X has novel characteristic C, and characteristic C might precipitate disaster Z;” this in itself does not weaken the inductive argument. (The doomsayer can make this argument, but has to overcome the inductive argument, however strong it was.)