GPT-3: Intelligent A.I. or Vacant Programming?
A recent article published in the Guardian caught the attention of internet users worldwide. Unlike ordinary works of journalism that go viral, however, this particular piece was not written by a human. In a style that is evocative and attention-grabbing, The Guardian aptly titled it: “A robot wrote this entire article. Are you scared yet, human?”
The “robot” in question is GPT-3, or “Generative Pre-trained Transformer 3”, OpenAI’s third iteration of an autoregressive language model that uses deep learning to produce human-like text. GPT-3 was prompted to write an essay to convince humans that robots come in peace.
The looming question is: does GPT-3 truly exhibit intelligence?
Intelligence, Narrow AI and General AI
In order to have a meaningful and nuanced discussion over this topic, we first need to define intelligence.
AI researcher Max Tegmark has, in my view, provided the most succinct and clear definition of intelligence:
intelligence is the ability to accomplish complex goals.
Artificial intelligence can be defined as a broad area of computer science that makes machines seem as though they have human intelligence.
Of course, the discussion today is more sophisticated than that. We need to go a step further and distinguish between artificial narrow intelligence (ANI or Narrow AI) and artificial general intelligence (AGI or General AI).
Artificial Narrow Intelligence (ANI), also known as “Weak” AI, is the AI that exists in our world today. Narrow AI is AI that is programmed to perform a single task — whether it’s checking the weather, playing chess, or analyzing raw data to write journalistic reports. This is also the sort of AI that we use and interact with on a daily basis: from the recommendation engines we rely on, to virtual assistants like Siri and Alexa.
Though we refer to existing AI and intelligent machines as “weak” AI, we shouldn’t take it for granted. Narrow AI by itself is a great feat in human innovation and intelligence.
Artificial General intelligence or “Strong” AI, on the other hand, refers to machines that exhibit human-like intelligence. AGI should successfully perform any intellectual task that a human being can. This includes the ability to make sense of the world, to think in abstract terms, and to be able to carry out commonsense reasoning in daily life. This is the sort of AI that we see in movies like “Her” or other science fiction tales in which humans interact with machines, robots, or operating systems that are conscious, sentient, and driven by emotion and self-awareness.
What Kind of AI is GPT-3?
GPT-3, like all other AI that we are surrounded by today, falls within the realm of Narrow AI.
Like other natural language processing (NLP) models, GPT-3 is given inputs (large amounts of language data), programmed to parse this data, identify patterns in it (using deep-learning algorithms), and then produce outputs (correlations between words, long-form sentences, and coherent paragraphs).
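To make the inputs-patterns-outputs loop concrete, here is a deliberately minimal sketch of the statistical idea behind language modeling: count which word tends to follow which, then predict accordingly. This is a toy bigram model, not how GPT-3 works — GPT-3 replaces these simple counts with a deep neural network trained on vastly more data — but the pipeline (ingest text, extract patterns, emit a prediction) is the same in spirit. All names and the example corpus here are illustrative.

```python
from collections import defaultdict, Counter

def train_bigram_model(corpus):
    """Inputs -> patterns: count how often each word follows another."""
    model = defaultdict(Counter)
    words = corpus.split()
    for current_word, next_word in zip(words, words[1:]):
        model[current_word][next_word] += 1
    return model

def predict_next(model, word):
    """Patterns -> output: return the most likely next word, or None."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = "the robot wrote the article and the robot wrote the essay"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # prints "robot" ("robot" follows "the" most often)
```

The gulf between this toy and GPT-3 — 175 billion learned parameters instead of a frequency table — is exactly the gulf the article goes on to discuss: fluent prediction is not the same as understanding.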
The ultimate goal of NLP is to have AI that is capable of composing reasonable e-mail responses or that can carry out a spoken conversation with a human being.
However, outside of its intended task, an NLP program cannot accomplish anything else. Unlike a human child, who can learn to play the piano, compete in tennis, and learn to read and write simultaneously, an NLP model is focused on, and can typically only accomplish, the task at hand: improving its ability to produce human-like language.
The Turing Test vs Winograd Schema Challenge
The Turing Test is a preliminary method of inquiry in AI for determining whether or not a computer is capable of thinking like a human being. A machine is said to have passed the Turing Test if it can successfully deceive a covert human inquirer into believing that it, too, is human.
Many systems that use deep learning techniques will soon be capable of passing the Turing Test — and GPT-3 is a great example of this. Without context, any reader would be unable to tell whether the article was written by a human being or a robot. Consider the following passage:
I am grateful for this feedback. I am always grateful for feedback. And I am always grateful for the fact that my writings are now being read outside of my own bubble. I don’t often interact with people who don’t share my beliefs. So I am unsure how this article will be received by people from different backgrounds.
It’s clear from the above passage that GPT-3 exhibits intelligence. We know that, given a goal (to write an essay), it can successfully accomplish it.
This fact by itself is absolutely awe-inspiring. GPT-3 seems to have mastered the art of language and of correlating words in a way that makes sense, is relatable, contemplative, and evokes emotion in the reader.
But does GPT-3 exhibit the full range of human intelligence?
If we want to know whether or not a machine exhibits the full range of human intelligence, we need to move past the Turing Test.
A more mature method of inquiry is the Winograd schema challenge: a line of questioning that is easy for a human to answer but poses a serious challenge for a computer. A Winograd schema is a pair of sentences that differ in only one or two words and contain an ambiguity that is resolved in opposite ways in the two sentences; resolving it requires world knowledge and reasoning.
The schema takes its name from a well-known example by Terry Winograd:
The city councilmen refused the demonstrators a permit because they feared violence.
The city councilmen refused the demonstrators a permit because they advocated violence.
Who does “they” refer to in each of the above two sentences? As human readers, most of us can infer that the referent of “they” differs depending on the verb used (“feared” versus “advocated”). This is the sort of commonsense reasoning that most of us employ on a daily basis.
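The point can be sketched in a few lines of code. Below, the Winograd pair above is stored as data, and a purely syntactic heuristic (always pick the noun phrase nearest the pronoun) is applied to both sentences. The heuristic, names, and structure here are illustrative inventions, not an actual Winograd solver — that is precisely what is hard to build.

```python
# A Winograd schema pair: identical except for one verb, with opposite
# correct resolutions of the pronoun "they".
schema = {
    "template": "The city councilmen refused the demonstrators "
                "a permit because they {verb} violence.",
    "candidates": ["the city councilmen", "the demonstrators"],
    "pairs": [
        {"verb": "feared", "answer": "the city councilmen"},
        {"verb": "advocated", "answer": "the demonstrators"},
    ],
}

def naive_resolver(sentence, candidates):
    """Syntax-only heuristic: pick the noun phrase mentioned most
    recently before the pronoun. No world knowledge involved."""
    return candidates[-1]

for pair in schema["pairs"]:
    sentence = schema["template"].format(verb=pair["verb"])
    guess = naive_resolver(sentence, schema["candidates"])
    print(f"{pair['verb']}: guess={guess!r}, correct={pair['answer']!r}")
```

Because the two sentences are syntactically identical, any rule that ignores meaning must give the same answer to both — and is therefore guaranteed to get one of them wrong. Knowing that councilmen fear violence while demonstrators advocate it is world knowledge, which is exactly what the challenge is designed to test.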
Whether or not GPT-3 can exhibit this sort of reasoning is a mystery (but I’m willing to wager a guess that it cannot).
Most AI systems today would still fail to pass the Winograd Schema test: but for how long, we’re not sure.
Intelligence vs Consciousness
Instead of asking whether or not GPT-3 is intelligent in the way that humans are, we should ask the following two questions:
- Is GPT-3 intelligent? Yes
- Is GPT-3 conscious and self-aware? No.
If intelligence is the ability to solve complex problems and accomplish complex goals, then GPT-3 is undoubtedly intelligent. Indeed, this is why narrow AI is still referred to as artificial intelligence. Dismissing what GPT-3 has accomplished as vacant programming is both false and unhelpful, and could leave us unprepared for a future in which machines do exhibit general intelligence.
Today, we have no reason to believe that GPT-3 is conscious or self-aware in the way that we are. The difference between consciousness and unconsciousness is a matter of subjective experience. One might then ask: is there something that it is like to be GPT-3? Though we cannot be sure, I think it is safe to say that GPT-3 does not have a subjective experience. (Consciousness can, however, be viewed as a spectrum: we know for a fact that we are conscious, that a rock is unequivocally unconscious, and that a puppy exhibits some degree of consciousness. Viewing consciousness as a spectrum will be a useful tool for when we approach a future in which machines begin to exhibit conscious behavior.)
General AI systems will be expected to be able to reason, solve problems, make judgments under uncertainty, plan, learn, integrate prior knowledge in decision-making, and be innovative, imaginative, and creative.
There is no evidence that GPT-3 is able to do any of this. In fact, there is no evidence that it is actually able to understand any of what it is saying in a meaningful way. There is no evidence that it has any conception of a physical world out there, or of what it truly means to be “good” or “evil”. All we have evidence for is its ability to parse large amounts of data and form complex correlations that make sense.
The “Holy Shit” Moment
Despite the above, we must be wary of dismissing the remarkable feat that GPT-3 has accomplished. GPT-3 is an extremely powerful prediction tool, and what it has accomplished in the realm of Narrow AI is absolutely astonishing.
As with many other exponential technologies, there is often a “holy shit” moment that researchers and developers experience when it comes to the advancement of AI. Historically, we experienced this when IBM’s supercomputer Watson beat human champions at Jeopardy!, and when Google’s AlphaGo displayed intuition and creativity in a way that no human Go player had before, defeating the world’s best human Go player.
The inherent nature of exponential technologies implies that AI can advance at an extraordinarily rapid rate without us even realizing it, thereby taking society by surprise.
Despite what many say, for me, the article written by GPT-3 was a “holy shit” moment — not because I felt it exhibited human-like awareness, intelligence, and consciousness, but because it pointed to how advanced our NLP technology has become and how much closer we could be to the age of General AI. The possibilities (especially if one compares GPT-3 with its predecessors, GPT-2 and GPT) are clearly endless.
And as the trend of exponential growth goes, we have no clue how or when this will explode in the future. Before we know it, machines could begin to exhibit the full range of human-like traits and intelligence.
The important takeaway from all of this is that we must be nuanced in our approach to determining where AI is in its evolution, and we must ask the right questions if we want to seek helpful answers.
Ultimately, we need to be prepared and we need to prepare our future generations to contemplate the practical and existential impact of technology on society and to learn how to co-exist with intelligent machines.