Alignment and The Messiah Bias

Sep 19, 2023 • Desmond Grealy • 4 min

There is a cognitive bias threatening the field of AI Alignment.

It is often found where high estimates of X-risk occur.

The Messiah Bias

When estimating X-risk or AGI timelines, there is a bias toward making estimates that are high enough to allow one to play grandiose roles. Specifically, the estimator is attracted to roles like:

- Having "crucial" knowledge about the supposed fate of the world

- Dictating what people should and shouldn't do

- Dictating who should have access to knowledge and resources

- Playing humanity's savior

Identifying the bias

When a researcher or organization is making extreme X-risk estimates and also moving to realize the positions listed above, it's a strong indication that some bias is present. We should be asking, "who benefits?"

An irrational bias will be built on poor foundations. If a claim of extreme risk isn't properly justified, attempts to pin down the specific reasoning behind it will be met with hand-waving. Reasoning that is hypothetical at best will often be presented as certain. There will be heavy use of the appeal to probability (i.e., assuming that everything that can go wrong will go wrong), and the conversation will tend to devolve into circular reasoning involving an appeal to authority:

I know X-risk is high and AGI timelines are short because of knowledge only I or people like me have. Take my word for it. If you do have advanced knowledge and estimate things differently, you just aren't understanding it as well as I am.

A statement like this is reminiscent of other material that deals with "cosmic stakes": it resembles the claims of those who say they provide a direct (but of course mysterious) connection to divine knowledge, comprehensible only to believers. Both make extraordinary claims, and both ultimately degrade into an appeal to authority when pressed.

Another circular, frequently invoked argument relies on the extremity of the risk estimate itself:

Imagine the risk of being wrong: it's total annihilation.
You can't afford to not take my stance on X-risk.

This is like the argument that one should believe in Hell because, if it does exist, the price of not believing is too high to pay. Fear of a hypothetical risk is being used to pressure people into believing in it.

These arguments should be transparent to rational people, but the veil of intellectualism in which they are sometimes presented can confuse even the brightest.
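To make the appeal to probability concrete, here is a minimal sketch. It assumes a scenario that only occurs if every step in a chain occurs, and that the steps are independent; both the independence assumption and the step probabilities are hypothetical, chosen purely for illustration and not drawn from any real estimate. The function name `chain_probability` is likewise just an illustrative choice.

```python
from math import prod


def chain_probability(step_probabilities):
    """Probability that every step in a chain occurs.

    Assumes the steps are independent (an illustrative simplification),
    so the joint probability is the product of the individual ones.
    """
    return prod(step_probabilities)


# Hypothetical numbers: four steps, each a coin flip.
steps = [0.5, 0.5, 0.5, 0.5]

# The appeal to probability treats "each step can happen" as "each step
# will happen", which amounts to setting every probability to 1.
assumed_certain = chain_probability([1.0] * len(steps))

# Multiplying the stated probabilities gives a much smaller number.
joint = chain_probability(steps)

print(assumed_certain)  # 1.0
print(joint)            # 0.0625
```

The point is only about the shape of the reasoning: a long chain of "could happen" steps is not the same as a likely outcome, and the gap grows with every step added.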

An Example

Late-stage Messiah Bias.

- Claiming we're all going to die

- Dictating what people should do

- Dictating who should have access to knowledge and resources

- Advocating to create an elite class with special privileges and access to power

The video above covers the listed items within a few minutes of the timestamp. I also invite you to watch it in its entirety for context.

Beware of "Availability Cascades"

There's a paper often cited in the study of public policy called Availability Cascades and Risk Regulation (Kuran & Sunstein 1999). In that work, economist Timur Kuran and legal scholar Cass Sunstein discuss irrational risk assessments that become self-reinforcing by way of viral hysteria. They call this process an "availability cascade". Essentially, the mere discussion of a given risk makes the idea of it more available in the public's mind, and effectively more plausible by way of another cognitive bias, the "availability heuristic" (Tversky & Kahneman 1973). This adds more fear and more discussion, and the process feeds itself.

Image Source: (Kuran & Sunstein 1999). Annotated.

Availability Cascade defined.

X-risk is especially susceptible to this process because "availability" is also informed by the ease with which the public can imagine a risk. Much of the groundwork has already been laid by decades of popular sci-fi, so "experts" need to do very little stoking to accelerate this availability into a mass belief.

Importantly, there are also "availability entrepreneurs": those who seek to bring about certain regulations or shifts in public opinion. They campaign for the cascade as a vehicle for their agenda, profit, or reputation. Availability entrepreneurs interact with the news media, who are often in the same category and are always eager to discuss and latch on to popular trends. The Messiah Bias and the goals associated with it are particularly well suited to the role of an entrepreneur of fear.

Availability cascades are most problematic when they reach the attention of regulators, who often make hasty decisions supported by popular fears. This is something the AI community should pay attention to. It may sometimes seem of little importance if the public is dramatically overestimating X-risks, but when that leads to overly restrictive or burdensome regulations and limits access to technology, it becomes an issue for everyone.

Aligning against superstition

Alignment is the most important thing we will do with AI. Its real benefit won't be just making sure that hypothetical AI doesn't fail, harm, or supplant us. It will be making the technology able to propel us, and making us able to propel it, to new heights.

This will require an unbiased appraisal of X-risk. We should grow a collective immunity against factors which jeopardize rational thought with fear, stifle progress, or seek to assign power to only an "elite" class.

Citation

Grealy, Desmond. (Sep 2023). AI Alignment and The Messiah Bias. AI Breakout. https://www.ai-breakout.com/post/ai-alignment-and-the-messiah-bias

@article{grealy2023messiahbias,
  title   = "AI Alignment and The Messiah Bias",
  author  = "Grealy, Desmond",
  journal = "ai-breakout.com",
  year    = "2023",
  month   = "Sep",
  url     = "https://www.ai-breakout.com/post/ai-alignment-and-the-messiah-bias"
}

References

[1] "AGI Safety | Connor Leahy | FLI Interpretability Conference, MIT | March 2023" YouTube, uploaded by Conjecture, Aug 25, 2023.

[2] Timur Kuran & Cass R. Sunstein, "Availability Cascades and Risk Regulation," 51 Stanford Law Review 683 (1999).

[3] Amos Tversky & Daniel Kahneman. "Availability: A heuristic for judging frequency and probability". Cognitive Psychology. 5 (2): 207–232 (1973).