
Even as Large Language Models such as ChatGPT become increasingly powerful, researchers continue to debate their true capabilities: whether they merely mimic the vast quantities of information they are fed, or whether they are beginning to show human-like qualities of understanding.
Sanjeev Arora, founding director of Princeton Language and Intelligence, recently explored this debate in a non-technical talk, “Just a Parrot? -- How Language Models Learn, Reason, and Self-Improve.” A crowd of nearly 200 attended the Feb. 18 talk, part of PLI’s Large AI Model Lecture Series.
The idea that LLMs are merely “stochastic parrots,” meaning they create language by repeating information they’ve been fed without any understanding, was first argued in an influential 2021 research paper.
At that time, it was a “reasonable view” to hold, said Arora, who is also the Charles C. Fitzmorris Professor of Computer Science. But despite tremendous advances in LLMs since then, he said, the debate continues over whether they are capable of true originality. Arora mentioned an upcoming Princeton event on the subject, where science fiction author Ted Chiang will discuss “The Incompatibilities Between Generative AI and Art.”
In his talk, Arora drew on current research to explain why next-word prediction is more powerful than it looks, and why post-training is important for making LLMs more sophisticated. He described ways in which current models display awareness of their own skills and cognition, in other words, “metacognition.” They have shown capabilities that only six years ago sounded like science fiction, such as helping to train other models, or even themselves, a process he called “self-improvement.” Arora also offered an overview of how language models will evolve in the next few years.
Understanding the implications of these advances is crucial for students, educators and researchers today, Arora said.