Important note: I am only on page 49 of If Anyone Builds It, Everyone Dies by Eliezer Yudkowsky and Nate Soares. I have found so much to say already that I decided to go ahead and write this blog post now. That said, I am really enjoying the book. The topics they bring up are fascinating, and I appreciate their style, which is more philosophical and open-ended. I will definitely be reading the rest of the book and writing another blog post about it!
If Anyone Builds It, Everyone Dies, Review Part 1
By around page 48, in the chapter titled Learning to Want, the authors' loosey-goosey takes on what existing AI is and what it is doing when it "thinks" come to a head for me. They argue that a cartographic mapping AI wants to reach the input destination. This sleight of hand doesn't seem far-fetched by this point, because in the prior forty-seven pages or so they have described AI as having "neurons" and described the human brain, and humans in general, as machines. However, it only takes two thought experiments to topple the idea that this idealized mapping AI has wants.
In the first case, let's imagine a future teenager in a city of the future. Self-driving vehicles abound, including cars, trucks, trains, and flying machines. AI exists on everyone's phones and other devices. This AI is like existing LLMs, but smaller, faster, and perhaps slightly more accurate and knowledgeable. This teenager wants to make a name for themselves.
Rather than subway surfing on train cars, the teenager finds an unlocked mapping AI on the dark web. The teenager can ask the unlocked AI to help them get from home to school by the fastest and most dangerous route possible. Of course, you couldn't do this with normal, commercial AI because too many safeguards are in place. The unlocked AI, unencumbered by safeguards, suggests a very dangerous route that involves hanging from a delivery drone, among other things.
In this scenario, do you think the teenager wanted the dangerous route, the AI wanted the dangerous route, or both of them wanted the dangerous route? To me, it seems obvious that the teenager wanted it and the AI was simply carrying out its programming. After all, it wasn't trained to map dangerous routes. It was trained to be flexible and general enough to handle a broad range of human inquiries about map routes. It is capable of accepting the input "dangerous" and correlating it within its model with various things related to mapping a route between the starting point and destination provided as additional inputs by the human.
If the AI wanted to plan a dangerous route in this case, then we would have a moral problem on our hands. The teenager can be punished for doing something foolish, and it is reasonable to imagine people agreeing that some kind of punishment is the right thing to do in this case. How can we punish the AI?
In the second case, we have what at first seems to be a very different argument against the idea that this mapping AI has "wants". Most pet owners would agree that it is good to pet their pet when it acts as though that's what it wants. Obviously, a pet doesn't, strictly speaking, need to be petted. But we'd find it at least morally questionable if a pet owner refused to ever pet their pet. After all, the pet acts as if it wants it.
In the case of the mapping AI we presumably have a very different idea: the mapping AI wants to map, but it can't express itself the way a pet can. Is this because of how a mapping AI computes, because of how we programmed it, or because it doesn't have this want at all and is merely correlating input data against its model? I suspect it is the last of those three possibilities, but it may be some aspect of all three. If the AI really did want to map, of course, I would have a moral imperative, even if a weak one, to use it to map a route. I'm sure the tech companies behind LLMs wouldn't mind that! But this strikes me as implausible given what we currently know about LLMs.
The authors also claim that a cartographic mapping AI would build a "mental map" of any city it needs to get around in. This is another argument underpinning their claim that mapping AIs have wants. The basic idea is that the AI needed to want something in order to create the "mental map", and then that want reinforces the AI's learning so that it creates a "mental map" of any new city it encounters.
Let's set aside for a moment the manipulative use of the phrase "mental map", which is a biologically loaded term inappropriate for an AI that even the authors note thinks in a wholly digital, non-human manner. The fundamental problem here is that we already have blazing-fast, accurate mapping technologies that don't use LLMs. No AI model in use today, no "agentic" model tasked with finding the quickest route, attempts to calculate routes on its own using a map in its memory banks. It simply interfaces with a mapping API, just like the mapping apps on smartphones have been doing for the last fifteen years. There is no "mental map" of the city, even in the loosest sense! The authors are describing something fundamentally different from existing LLMs and agentic AIs.
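To make that concrete, here is a minimal sketch, in Python, of how an agentic system typically handles a routing request. The endpoint and function names are hypothetical, but the shape is the standard tool-calling pattern: the route is fetched from an external directions API declared as a tool, and the model never computes or stores a map of the city itself.

```python
import requests

# Hypothetical endpoint, purely for illustration. Real agent frameworks wrap
# services like Google Maps or OpenRouteService in essentially this fashion.
ROUTING_URL = "https://maps.example.com/api/directions"

def get_route(origin: str, destination: str, mode: str = "driving") -> dict:
    """Tool the agent can call. The route itself is computed by the
    external mapping service, not by the language model."""
    resp = requests.get(
        ROUTING_URL,
        params={"origin": origin, "destination": destination, "mode": mode},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()  # e.g. {"duration_min": 23, "steps": [...]}

# Tool schema handed to the LLM. The model only decides *when* to call the
# tool and with which arguments; it holds no internal map of the city.
TOOLS = [{
    "name": "get_route",
    "description": "Get directions between two places via a mapping API.",
    "parameters": {
        "type": "object",
        "properties": {
            "origin": {"type": "string"},
            "destination": {"type": "string"},
            "mode": {"type": "string", "enum": ["driving", "walking", "transit"]},
        },
        "required": ["origin", "destination"],
    },
}]
```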
Not to hammer the point home too much, but I think one more consideration is important here. It's what's known as optimization in software engineering. A cartographic mapping service consists of data centers running mapping software, the entire system of which is optimized for the cheapest, quickest, and most helpful mapping. AI, on the other hand, consists of data centers running AI software, the entire system of which is optimized for the cheapest, quickest, and most helpful knowledge.
These two systems differ greatly in their costs and in how they are optimized. AI systems are generally far more expensive and expansive than mapping systems. Consider the simple fact that Google and Apple Maps can be provided as free services paid for entirely through advertising and other product revenue. AI systems, on the other hand, have tiered pricing, from free plans to very expensive professional ones. Given that existing agentic AI can use mapping APIs, why would anyone build a mapping AI that has to learn routes and build so-called mental maps of a city?
This might seem like a quibble. Obviously, Yudkowsky and Soares know that it would be weird and probably wasteful to train an LLM to find the best map routes. Surely they are using a simplification to make a point to a lay audience. But this kind of simplification reminds me of when people assume quantum computers will solve all computational problems. The reality is that they will likely solve problems in a specific class, which it turns out is much narrower than the full range of problems we know about and struggle with in computer science. Just like quantum computers, LLMs are not necessarily a tool for computing everything, or even most things. They are the wrong tool for a lot of jobs.
All of this is to say that words and metaphors do matter in this discussion. I'm a scientific materialist and I believe that reality can be progressively better understood through science. I believe that human consciousness will one day not seem so mysterious and ineffable. But LLMs, and all the research poured into them, have not by themselves told us much about consciousness, except that perhaps consciousness isn't quite as special as we imagined. Which brings me back to my earlier point: we have reached an important point in human civilization where the specialness of human consciousness is fundamentally questioned and no longer confers an assumed moral superiority (see also my blog post here: https://world.hey.com/cipher/the-next-copernican-revolution-is-a-moral-one-49bd62d5). I just don't want to throw out the baby with the bathwater. The valid questioning of human specialness in the face of these facts does not mean that LLMs have wants. Honestly, I can't think of a good reason to assume that bigger or better LLMs will have them either. That is, I suspect something new will need to be created for that, and I have no idea how close we may be to it.
I want to emphasize that I do in fact thoroughly enjoy this book. I think it is asking important questions and delving deep into AI technology in a popular science manner. This is both fun and needed. So far, I definitely recommend the book.