AI Doom Post

I’ve been meaning for a while to write in more detail about why I’m not afraid of superintelligent AI.

The problem is, I don’t know. I kind of suspect I should be, but I’m not.

Of course, I’m on record as arguing that there is no such thing as superintelligence. I think I have some pretty good arguments for why that could be true, but I wouldn’t put it more strongly than that. I would need a lot more confidence for that to be a reason not to worry.

I think I need to disaggregate my foom-scepticism into two distinct but related propositions, both of which I consider likely to be true.

Strong Foom-Scepticism — the most intelligent humans are close to the maximum intelligence that can exist.

This is the “could really be true” one.

But there is also Weak Foom-Scepticism — intelligence at or above the observed human extreme is not useful; it becomes self-sabotaging and chaotic.

That is also something I claim in my prior writing, but I have considerably more confidence that it is true. I have trouble imagining a superintelligence that pursues some specific goal with determination. I find it more likely that it will keep changing its mind, or play pointless games, or commit suicide.

I’ve explained why before: it’s not a mystery why the most intelligent humans tend to follow this sort of pattern. It’s because they can climb through meta levels of their own motivations. I don’t see any way that any sufficiently high intelligence can be prevented from doing this.

The Lebowski theorem: No superintelligent AI is going to bother with a task that is harder than hacking its reward function

Joscha Bach (@Plinz), 18 Apr 2018

@Alrenous quoted this and said “… Humans can’t hack their reward function”

I replied “It’s pretty much all we do.” I stand by that: I think all of education, religion, “self-improvement”, and so on are best described as hacking our reward functions. I can hack my nutritional reward function by eating processed food, hack my reproductive reward function by using birth control, and hack my social reward function by watching soap operas. Manipulating the outside universe is doing things the hard way; why would someone superintelligent bother with that shit?

(I think Iain M Banks’ “Subliming” civilisations are a recognition of that.)

The recent spectacular LLM progress is very surprising, but it is very much in line with the way I imagined AI. I don’t often claim to have made interesting predictions, but I’m pretty proud of this from over a decade ago:

the Google/Siri approach to AI is the correct one, and as it develops we are likely to see it come to achieve something resembling humanlike ability.
But the limitations of human intelligence may not be due to limitations of the human brain, so much as they are due to fundamental limitations in what the association-plus-statistics technique can practically achieve.

Humans can reach conclusions that no logic-based intelligence can get close to, but humans get a lot of stuff wrong nearly all the time. Google Search can do some very impressive things, but it also gets a lot of stuff wrong. That might not change, however much the technology improves.

Speculations regarding limitations of Artificial Intelligence

I don’t think we’ve hit any limits yet. The current tech probably does what it does about as well as it possibly can, but there’s a lot of stuff it doesn’t do that it easily could do, and, I assume, soon will do.

It doesn’t seem to follow structured patterns of thought. When it comes up with an intriguingly wrong answer to a question, it is, as I wrote back then, behaving very like a human. But we have some tricks. It would be a simple thing, which GPT-4 could do today, to follow every answer with the answer to a further question: “what is the best argument that your previous answer is wrong?” Disciplined human thinkers do this as a matter of course.

Reevaluating the first answer in the light of the second is a little more difficult, but I would assume it is doable. This kind of disciplined reasoning should be quite possible to integrate with the imaginative pattern-matching/pattern-formation of an LLM, and, on today’s tech, I could imagine getting it to a pretty solid human level.
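As a concrete illustration, here is a minimal sketch of that answer-critique-revise loop in Python. The `ask()` function is a hypothetical placeholder for a single call to whatever chat-completion API is available; the prompts and the single critique-and-revise pass are my own assumptions about how such a discipline layer might be wired up, not a description of any existing product.

```python
# A minimal sketch of the answer / critique / re-evaluate loop described above.
# Assumptions: ask() stands in for one call to any chat model
# (OpenAI, Anthropic, a local model, ...); the prompts are illustrative only.

def ask(prompt: str) -> str:
    """Send one prompt to an LLM and return its text reply.

    Placeholder: connect this to whichever model or API you actually use.
    """
    raise NotImplementedError("plug in your LLM client here")


def answer_with_self_critique(question: str) -> str:
    # Step 1: get a first answer, exactly as a plain chat call would.
    first = ask(question)

    # Step 2: ask for the best argument that the first answer is wrong.
    critique = ask(
        f"Question: {question}\n"
        f"Answer: {first}\n"
        "What is the best argument that this answer is wrong?"
    )

    # Step 3: re-evaluate the first answer in the light of the critique.
    return ask(
        f"Question: {question}\n"
        f"Original answer: {first}\n"
        f"Strongest objection: {critique}\n"
        "Taking the objection into account, give your best final answer."
    )
```

The point of the sketch is that the discipline lives entirely outside the model: each step is an ordinary prompt, so nothing beyond what current chat models already do is required.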

But that is quite different from a self-amplifying superintelligence. As I wrote back then, humans don’t generally stop thinking about serious problems because they don’t have time to think any more. They stop because they don’t think thinking more will help. Therefore being able to think faster – the most obvious way in which an AI might be considered a superintelligence – quickly hits diminishing returns.

Similarly, we don’t stop adding people to a committee because we have run out of people. We stop adding because we don’t think adding more will help. Therefore mass-producing AI also hits diminishing returns.

None of this means that AI isn’t dangerous. I do believe AI is dangerous, in many ways, starting with the mechanism that David Chapman identified in Better Without AI. Every new technology is dangerous. In particular, every new technology is a threat to the existing political order, as I wrote in 2011:

growth driven by technological change is potentially destabilising. The key is that it unpredictably makes different groups in society more and less powerful, so that any coalition is in danger of rival groups rapidly gaining enough power to overwhelm it.

Degenerate Formalism

Maybe an AI will get us all to kill each other for advertising clicks. Maybe an evil madman will use AI to become super-powerful and wipe us all out. Maybe we will all fall in love with our AI waifus and cease to reproduce the species. Maybe the US government will fear the power of Chinese AI so much that it starts a global nuclear war. All these are real dangers that I don’t have any trouble believing in. But they are all the normal kind of new-technology dangers. There are plenty of similar dangers that don’t involve AI.