GPT-4 and why the war on digital work might already be lost
OpenAI has finally released its latest and most powerful Large Language Model (LLM). ChatGPT-4 is the newest step towards a man-made intelligent machine. In recent months it has seemed to drown out much of the other technology and AI news. ChatGPT is the fastest-growing app ever, with just over 100 million users in under 8 weeks.
While the majority of tech news sources seem to be echoing the 'AI is smarter than your lawyer' headline as a result of the 40% jump in performance on the bar exam, the nuance of this release, or more importantly the speed at which it was made given the level of progress achieved, is the detail that should be eye-popping to all.
What we might just be witnessing over the last 2 years is a move on the AI front from exponential progress to 'hyper-exponential progress'. ChatGPT-3.5 was released to the public on November 30, 2022. Just shy of 4 months later, ChatGPT-4 was released with staggering differences in performance on key intelligence-based tasks and tests.
Let's go through the speed of progress in a few areas and why they matter to the average person. At the same time, we'll compare these scores to human-level performance at Ivy League or other top-rated institutions to give us an idea of how 'intelligent' or, more precisely, how capable AI really is.
AI dominates core Evidence-Based Reading and Writing skill sets.
The SAT (Scholastic Assessment Test) consists of two main sections: the Math section and the Evidence-Based Reading and Writing (EBRW) section. Each section is scored out of 800 points, for a maximum total score of 1600.
SAT Mathematics 2022
* SAT undertaken by humans
Algebra, geometry, and trigonometry questions make up the standard SAT Math section. These skills underpin everything from the hard sciences and engineering disciplines to logic-based thinking in general. Being able to tackle these problems, and gaining the skills needed to do so, is really important to humanity. We now have machines that can do this better than the average college student nationally. Think of the step from a calculator to a next-generation 'smart' calculator, where AI does the maths simply by observing your actions and thoughts while you do the creative work.
Verdict: GPT-4 is better than the average college student in a variety of problem-solving disciplines, scoring around 22% above the national average.
SAT Reading and Writing (EBRW) 2022
* SAT undertaken by humans
Evidence-based reading and writing are effectively two of the most critical skill sets humanity has possessed for the last few thousand years. Whether you're reading a news article or breaking down a research paper, both skill sets aren't just required; they're invaluable to doing these tasks proficiently. So much so that testing these abilities is one of the cornerstones of evaluating an AI's capability to reason, comprehend inputs, and respond to humans or other sensory and informational inputs in a meaningful way.
It's easy to see why companies are already using ChatGPT and other models to do the writing, inevitably replacing writing jobs in almost all sectors, for both fiction and non-fiction, with this technology!
Verdict: GPT-4 still lags behind the cream of the crop, although it is better than the average college student at evidence-based reading and writing tasks.
SAT Total Scores (Averaged) 2022
* SAT undertaken by humans
Just as with humans, more time on a specific job means learning along the way, and these models can become Subject Matter Experts (SMEs) in a field or set of tasks faster and more efficiently than a human. Adaptation is no longer the defining difference either, as these models become ever more generally intelligent.
Verdict: GPT-4 still lags behind the cream of the crop, although it is better than the average college student across all the major skill sets you would need to kickstart a role requiring more brains than brawn.
What about specialization?
If you look at the above SAT data in isolation, you could say 'AI isn't there yet'. But add a little more context, like the fact that in everyday life the majority of top engineers, writers, and artists were rarely 'above average' students, and you can see AI already well on the path to competing with humanity on an ever-growing list of tasks. We've seen that potential and IQ aren't the only determinants of success in a specific area. 'Average' works, and AI models are quite clearly already better than this.
Programming is similar to the SAT in part because you need good critical thinking and problem-solving skills to reach a competent level in both disciplines. Programming also adds an extra dynamic to the equation: solving a particular problem no longer has one and only one answer. In effect, part of the solving effort requires a decent level of creativity. For example, building a website or, more simply, sorting a list of numbers can be done in a variety of ways using a variety of tools and languages.
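To make the 'more than one answer' point concrete, here is a minimal sketch in Python of three different ways to sort the same list of numbers. The function names and the sample list are our own illustrative choices, not taken from any benchmark or from the GPT-4 evaluation.

```python
from typing import List


def sort_builtin(numbers: List[int]) -> List[int]:
    """Lean on the language: Python's built-in sorted() returns a new sorted list."""
    return sorted(numbers)


def sort_insertion(numbers: List[int]) -> List[int]:
    """Insertion sort: place each number into its correct spot in a growing result."""
    result: List[int] = []
    for n in numbers:
        i = len(result)
        # Walk left past every element that is larger than n.
        while i > 0 and result[i - 1] > n:
            i -= 1
        result.insert(i, n)
    return result


def sort_merge(numbers: List[int]) -> List[int]:
    """Merge sort: split the list in half, sort each half, then merge the halves."""
    if len(numbers) <= 1:
        return list(numbers)
    mid = len(numbers) // 2
    left = sort_merge(numbers[:mid])
    right = sort_merge(numbers[mid:])
    merged: List[int] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; append whatever remains of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


if __name__ == "__main__":
    sample = [5, 2, 8, 1, 3]  # hypothetical input, for illustration only
    for solver in (sort_builtin, sort_insertion, sort_merge):
        print(solver.__name__, solver(sample))
```

All three functions return the same sorted list, yet each reflects a different design choice: reuse what the language offers, keep the logic simple, or pick an approach that scales better to large inputs. Choosing between them is the kind of judgment call that separates programming from a test with a single correct answer.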
* Human contestant averages for similar coding competitions
Here the task for these Large Language Models is as difficult as achieving 'convincing comprehension' was a few years ago. To gauge this, GPT-4 was tested on 'codeforces.com' problems. Codeforces is effectively a website holding a collection of algorithm and programming problems of varying difficulty.
These types of problems have become the gold standard for evaluating whether a person is technically competent enough to work at the likes of Google, Apple, Microsoft, and other blue-chip organizations leading the way in the tech and software space.
If you're following along, the key here is progress. The above is where we are right now; remember, going from ChatGPT-3.5 to ChatGPT-4 took about 4 months! Another important point is that GPT-4's training data and approach didn't focus on specialization. Take a look at DeepMind's AlphaCode, an AI trained with a specific specialization in mind. It is already better than your average programmer. 'Better than' is a tricky label, though: although AlphaCode is better than the average programmer at competitive programming, you would still need a competent programmer to integrate that code into an existing ecosystem to deliver value.
An assumption often made at this point is that specialization probably isn't going to keep your career intact. This might be accurate, but specialization will more than likely buy you some time, though not much, we think, and OpenAI agrees. The single greatest power humanity has is adaptation over time. We're intelligent, and that allows us to change direction and survive. What happens when this is no longer the defining characteristic of humanity?
Good questions rarely make catchy titles. The title of this piece should be: When agriculture was automated everyone moved to the city. Where are you headed once the same happens to intelligence?