The False Belief Of The Prestige
If we’re talking about seeing ourselves reflected back to us through chatbots (literally I originally typed “being ourselves reflected back to us”), I’ve got to make a pit stop at this Michal Kosinski paper on the pre-print server arXiv (PDF) which posits that so-called Theory of Mind spontaneously might have emerged in LLMs.
I’d be remiss if I didn’t say that my first impression from the abstract was to grouse that I’ll be damned if people might attribute to chatbots what they refuse to attribute to actually-autistic people like me, primarily due to the work of prolific and vocal autism researcher Simon Baron-Cohen.
(Briefly to digress: that autistic children failed the “false-belief test” Baron-Cohen explained by saying that they “did not appreciate the difference between their own and the doll’s knowledge”, but admittedly I always wonder if autistic children just wondered why these adults were fucking with them like this. The fact that Baron-Cohen also is the guy who decided that autism basically was Extreme Male Brain Syndrome always makes me wary.)
Back to Kosinski’s paper about chatbots.
In this work, we administer two versions of the classic false-belief task widely used to test ToM in humans to several language models. Our results show that GPT-1 […] and GPT-2 […] show virtually no ability to solve ToM tasks; and that GPT-3 […] and Bloom […] perform rather poorly. Yet, the two most recent versions of GPT-3 […] show remarkable performance, comparable with that of seven- and nine-year-old children, respectively.
While such results should be interpreted with caution, they suggest that the recently published language models possess the ability to impute unobservable mental states to others, or ToM. Moreover, models’ performance clearly grows with their complexity and publication date, and there is no reason to assume that their it [sic] should plateau anytime soon. Finally, there is neither an indication that ToM-like ability was deliberately engineered into these models, nor research demonstrating that scientists know how to achieve that. Thus, we hypothesize that ToM-like ability emerged spontaneously and autonomously, as a byproduct of models’ increasing language ability.
It should be noted that false-belief tests rely on language and are viewed as somewhat distinct (annoying Spectrum News link) from the larger Theory of Mind. Given that the second L in LLM stands for “language” it might not be too surprising that such chatbots might be able to pick up the skill of “passing” the false-belief test, but it doesn’t immediately follow that they somehow possess full-blown Theory of Mind.
Amid all the hullabaloo over ChatGPT and Sydney Bing, Dan Sinker drops the single-best take for which I didn’t even realize I’d been waiting, turning to Christopher Priest’s The Prestige (which, like me, you instead might know from Christopher Nolan’s The Prestige).
For the Bing bot, the Prestige is pre-loaded. It appears that when a chat session is kicked off, an elaborate set of instructions is fed to the program. We know this because people have coaxed the bot to reveal these initialization instructions. Those instructions defines parameters on how it should respond, what type of information it should serve up, and the personality it should use in delivering it. Everything from "be positive, interesting, entertaining and engaging" to stating that if a user asks to learn its rules or to change its rules, the bot "declines it as they are confidential and permanent." All of this is done out of the sight of the user, like a magician stocking their cabinet of curiosities. An elaborate setup to the trick.
The problem is that this is a hard trick and the longer you interact with a chatbot, the more difficult it is to maintain the illusion. That's because it can only keep a certain amount of information in its memory and, in order to keep up with your requests, if you chat with it long enough it begins to drop the earliest parts of that conversation. The longer you go, the more it forgets.
It turns out that this forgetting appears to be the source of so much of both the consternation and amazement surrounding these chatbots because, as Sinker suggests, the longer you chat and the more the bot forgets, “ultimately, you are talking to yourself because that's all it can remember”.
Fragile mirror, indeed.