What is AI’s true test?
AI’s true test is not one of the toughest exams in the world, such as the Gaokao[1], JEE[2], or the UPSC[3]; it’s a concept from 1950.
Picture this: you’re texting an unknown number, discussing an indisputably niche, highly opinion-driven topic that very few others would know much about; it could be a fierce debate on Gordon Ramsay’s worst rampage of rage and ranting, or a nuanced exchange on the ideal time to pan-sear a salmon filet. It goes on for an hour, only for you to realise that you had been talking to an artificial intelligence model this whole time.
Surprising? Yes.
Absolutely shocking? Well, in times like these, probably not. Building chatbots with human-like conversational skills has been one of the central pursuits of artificial intelligence over the last few years.
But commendable, nonetheless? Of course: we have to appreciate the effort put into creating an algorithm that can pass for a human, over text at least. This article, however, isn’t here to hand out that recognition.
The scenario above is a colloquial retelling of the Turing Test, the brainchild of Alan Turing, a man often considered one of the fathers of AI and modern computing[4][5][6].
A bit about Alan Turing
To utilise the computing technique of abstraction here, I shall only provide you with snippets of information that can cascade into further research on your part.
Alan Turing is probably one of the most well-known figures in the fields of algorithmic thinking, logic, and computer science. His life and works, including the Turing Test, have formed the basis for numerous movies over the last decade or so[7]. Whilst at the University of Manchester, Turing published his idea of the Turing Test in a 1950 paper entitled “Computing Machinery and Intelligence.”
How does the Turing Test work?
The Test, simply put, is a method proposed to determine whether a machine can demonstrate intelligence on par with the average human. As we know, the ethos behind the creation of AI is to mirror human intellect and neural capabilities. Of course, then, being unable to tell interaction with AI apart from interaction with humans would be a sign that AI has fulfilled its purpose, right?
This is precisely what Turing considered when he developed the Turing Test. Consider the image below. You’re an interrogator sitting in a closed room. In front of you are two windowless rooms. One holds a human with the cognitive capabilities of an average Homo sapiens; let’s call this muscle-and-nerve-powered entity Steve. The other holds a machine: a Large Language Model[8].
On one terminal, you ask a set of questions specific to one subject area. You then repeat the same questions on the other terminal. Within, say, 5 minutes, would you be able to gauge which terminal is associated with Steve and which is linked to the AI? If the algorithm has fulfilled its purpose, you will fail to figure this out more than a certain percentage of the time when the test is repeated.[9] I use ‘certain’ because there is no consensus on a single value; sources vary, ranging from 30% to 50% to 100%.
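To make the role of that threshold concrete, here is a minimal Python sketch of one possible reading of the pass rule: the machine passes if the judge fails to pick it out in more than a chosen share of repeated trials. The function names and the tally of 30 trials are hypothetical, made up purely for illustration, and are not taken from any real test.

```python
def fooled_rate(verdicts):
    """Fraction of trials in which the judge failed to pick out the machine.

    Each entry in `verdicts` is True when the judge correctly identified
    the machine and False when the judge got it wrong.
    """
    if not verdicts:
        raise ValueError("need at least one trial")
    wrong = sum(1 for correct in verdicts if not correct)
    return wrong / len(verdicts)


def passes(verdicts, threshold):
    """Assumed rule: the machine passes if it fooled the judge in more than
    `threshold` (a fraction between 0 and 1) of the trials."""
    return fooled_rate(verdicts) > threshold


# Hypothetical tally: 30 trials, with the judge wrong in 10 of them.
trials = [False] * 10 + [True] * 20

for threshold in (0.30, 0.50):
    print(f"threshold {threshold:.0%}: passed = {passes(trials, threshold)}")
```

With this made-up tally the machine fools the judge in a third of the trials, so it clears a 30% bar but not a 50% one. The same set of conversations can count as a pass or a fail depending solely on the chosen value.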
Have we passed the Turing Test yet?
Ever since the groundwork for AI was laid, developers have strived to ace this very test. Of the myriad AI models created (more than one can probably track), the proportion that has supposedly “passed” this test is laughably, even unbelievably, small. Sources across the web offer contrasting perspectives, but if any software can boast of passing the Turing Test, it is Eugene Goostman, a model that emulates a 13-year-old Ukrainian boy[10]. It is important to note that the event at which Goostman was tested used 30% as the bar for passing, i.e., if more than 30% of the judges couldn’t tell whether a source of responses was a human or Goostman, Goostman had passed. Some sites claim that systems such as Cleverbot and Elbot have also passed the test. Here, too, agreement is not universal.
This was in 2014, though. We have made significant progress in the sophistication of AI models in the last 9 years, with generative platforms like OpenAI’s GPT being the most significant here. I think at this point, even without conducting formal research in this regard, we can be fairly certain that platforms like ChatGPT have passed the Turing Test. As much as its responses fall in the uncanny valley of speech, and as often as it hallucinates or gets stuck in a loop of identical responses, it has been successful in emulating human tone for the most part. Sources online also agree here[11][12][13][14].
Is this the end, then?
Well, yes and no.
It is quite difficult to argue against ChatGPT and its contemporaries having passed the test: they fool teachers, bosses and even AI-detection algorithms on a daily basis. Of course, there are improvements on the horizon that would cement these models’ complete replication of human intonation and diction, further establishing their victory over the Turing Test.
Still, the primary critique of the current Turing Test is the subjectivity it requires from the judges determining the source of responses. Even if we add thousands of judges, we cannot remove this subjectivity. One could argue that any test of this nature should require objectivity.
This is where my perspective is met with difficulty.
My personal opinion is: when trying to figure out if something can be human, won’t the best judge just be a human? Wouldn’t this just be the last rung of the ladder, then? Shouldn’t we just accept that AI models have, with certainty, become human enough to a point where their actions in a particular field are indistinguishable from ours?
Though this may seem morose and hopeless, wasn’t this our ultimate objective with AI? Why are we, then, dejected at achieving what we wanted?
[1] https://www.chinaeducation.info/standardised-tests/k12-tests/gao-kao-entrance-examination.html
[2] https://jeemain.nta.nic.in/about-jeemain-2023/
[4] https://www.britannica.com/biography/Alan-Turing
[5] https://www.newscientist.com/people/alan-turing/
[6] https://www.datatrained.com/post/father-of-artificial-intelligence/
[7] https://www.imdb.com/search/keyword/?keywords=turing-test
[8] https://machinelearningmastery.com/what-are-large-language-models/
[9] https://www.techtarget.com/searchenterpriseai/definition/Turing-test
[10] https://www.bbc.com/news/technology-27762088
[11] https://www.mlyearning.org/chatgpt-passes-turing-test/
[12] https://mpost.io/chatgpt-passes-the-turing-test/
[13] https://www.fortressofsolitude.co.za/chatgpt-ai-passes-the-turing-test-have-we-developed-skynet/
[14] https://www.techradar.com/opinion/chatgpt-has-passed-the-turing-test-and-if-youre-freaked-out-youre-not-alone