Saturday, May 13, 2023

A.I. ChatBot Models are a Type of Simulation Theory


Large Language Models give evidence that Simulation Theory got something right about minds.

Let me start by briefly recounting what 'Simulation Theory' means as it's used in the Philosophy of Mind.  This is a theory of how humans understand and "predict" others' mental states and behavior by employing their own cognitive capacities and mechanisms to mentally model others' cognitive processes. Its origins in philosophy date back to the Enlightenment (probably with David Hume). The basic idea in the simulation approach is that we have privileged access to our own mental contents -- that is, to our own immediate thoughts, feelings, percepts, and so forth.  In other words, simulation theory suggests that we understand other people's thoughts and feelings by imagining ourselves in their shoes -- i.e., predicting what they would do, what they would say, or how they would react in a given situation. Such a theory has been used to explain why we have empathy for others: we see direct analogies to what we ourselves would do.

I think there's a good argument to be made that large language models (LLMs) embody a primitive kind of simulation theory. LLMs are artificial neural networks that are trained on massive amounts of text data, such as books, articles, tweets, etc., and learn to generate coherent and fluent text based on a given input or prompt. How does such magic happen?

Well, LLMs accomplish this by being trained to predict the probability of the next word given the preceding context. That is, they assign a probability to each possible continuation of a sequence of words based on the words that came before it.  From this, LLMs can perform various natural language processing tasks, such as answering questions, summarizing texts, writing stories, generating captions for images, etc. LLMs are not explicitly programmed with any rules or knowledge about language or the world; they learn everything from the data they are exposed to.
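
To make that concrete, here is a minimal sketch of next-word (next-token) prediction using the Hugging Face transformers library, with GPT-2 as a small stand-in model. The model choice and the prompt are purely illustrative -- not the specific chatbot models discussed in this post -- but the mechanism is the same: feed in a context, get back a probability distribution over what comes next.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a small pretrained causal language model (GPT-2 here, as a stand-in).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # logits has shape (batch, sequence_length, vocabulary_size)
    logits = model(**inputs).logits

# Probability distribution over the entire vocabulary for the *next* token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Show the five most probable continuations of the prompt.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([token_id.item()]):>12s}  p = {prob.item():.3f}")

Everything the model "knows" is packed into the weights that produce those probabilities; there is no separate rulebook or fact database being consulted.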

LLMs can then simulate the mental processes (or at least the syntactic and semantic linguistic processes) of human language users by using their own internal representations and mechanisms to predict what those users would say or write next. Of course, LLMs do not have access to the actual thoughts or feelings of the human authors whose texts they are trained on; nor do they even have, as we do with each other, the evolutionary analog of sharing the same kind of brain (and likely the same inner feelings). They only have access to our linguistic expressions. But by analyzing and modeling these expressions, LLMs can generate texts that are similar or relevant to the given input or context. LLMs can also adapt to different styles, genres, domains, or tones of language by adjusting their predictions based on the available data. For example, an LLM can generate a formal letter, a casual chat message, a scientific article, a humorous tweet, etc., depending on the prompt or the data it is trained on.
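
As a toy illustration of that style adaptation, and assuming the same model and tokenizer as in the sketch above, the following continues two invented prompts in very different registers; the prompts and sampling settings are just placeholders:

for prompt in [
    "Dear Hiring Committee, I am writing to",
    "lol you won't believe what my cat just",
]:
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=True,                        # sample for variety rather than always taking the most likely token
        top_p=0.9,                             # nucleus sampling keeps output fluent without being repetitive
        pad_token_id=tokenizer.eos_token_id,   # GPT-2 has no pad token; reuse end-of-sequence to avoid a warning
    )
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

The same weights produce stiffly formal prose in the first case and loose chat-speak in the second, purely because the conditioning context shifts which continuations are probable.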

From this, I tend to believe that LLMs are using a form of Simulation Theory to understand and produce natural language texts. They are not merely copying or mimicking the texts they are trained on; they are using their own learned representations and mechanisms (i.e., artificial neural net data structures) to generate texts that are coherent and fluent with respect to the given input or context. They are using the equivalent of their own -- dare I say? -- "cognitive" capacities to mentally model the linguistic processes of human language users.

Of course, the form of Simulation Theory that LLMs are using is very limited and simple. Why so? LLMs are not conscious or sentient beings; they do not have any goals, intentions, emotions, beliefs, desires, etc., that would motivate or influence their language use.  (That would make them full-blown AGIs.) They do not have any understanding or awareness of the meaning or implications of their texts; they do not have any moral or ethical considerations or responsibilities for their texts. They do not have any social or cultural context or background that would shape their language use. Nor do they have any feedback or interaction with other language users that would help them learn from their mistakes or improve their performance.

Yet even given the limited and simple version of Simulation Theory exhibited, LLMs do seem to have some level of creativity or originality that allows them to generate novel or surprising texts, sometimes by hallucinating and sometimes by synthesizing old texts into new generations. I suspect that extending this approach from predicting text tokens to predicting, say, 'action tokens' will soon appear as a successful approach in robotics.
