Note: these are all rough numbers; I'd expect to shift substantially on all of this with further debate.
Suppose we made humanity completely robust to biorisk, i.e., we did enough preparation that the risk of biocatastrophe (including AI-mediated biocatastrophe) was basically 0.[1] How much would this reduce total x-risk?
The basic story for any specific takeover path not mattering much is that the AIs, conditional on wanting to take over, will self-improve until they find the next easiest takeover path and do that instead. I find this persuasive but not fully convincing, for two reasons:
1. AIs need to worry about their own alignment problem, meaning that they may not be able to self-improve in an unconstrained fashion. We can break down the possibilities into (i) the AIs are aligned with their successors (either by default or via alignment being pretty easy), (ii) the AIs are misaligned with their successors but they execute a values handshake, or (iii) the AIs are misaligned with their successors (and they don't solve this problem or do a values handshake). At the point of full automation of the AI R&D process (which I currently think of as the point at which AIs become more useful than humans for making AI progress, i.e., if we remove all the AIs, progress slows by more than if we remove all the humans), conditional on the AIs being misaligned, I currently think the relative likelihoods of (i), (ii), and (iii) are 1:1:3 (the implied probabilities are spelled out just after this list), and the probability flows from (iii) into (ii) and (i) as the AIs get smarter.[2]
2. Raising the capability threshold that the AIs need to take over gives humanity more time. During this time, we might make alignment progress, or notice that the AIs are misaligned and slow down / pause / add mitigations.
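To make the 1:1:3 odds from point 1 concrete (this is just unpacking the stated ratio, not an additional claim), the implied probabilities, conditional on the AIs being misaligned at full automation of AI R&D, are:

$$
P(\text{i}) = \frac{1}{1+1+3} = 0.2, \qquad P(\text{ii}) = \frac{1}{5} = 0.2, \qquad P(\text{iii}) = \frac{3}{5} = 0.6,
$$

i.e., most of the mass (0.6) is on AIs that haven't solved their own successor-alignment problem, with that mass flowing toward (i) and (ii) as the AIs get smarter.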
So it might be important for misaligned AIs to attempt a takeover early in the intelligence explosion. Specifically, we can ask "how much x-risk is averted if the probability of misaligned AI takeover before TED AI goes to 0?", whi