Elon Musk filed a lawsuit in San Francisco’s Superior Court accusing OpenAI and its CEO, Sam Altman, of betraying the startup’s initial commitment to openness, the betterment of society, and lack of profit as a motive. Among other things, Musk’s 35-page complaint argues that OpenAI has violated its original deal to share its GPT large language models with Microsoft, which stated that the software giant would lose access to new LLMs once OpenAI had achieved AGI. According to the complaint, OpenAI reached that epoch-shifting moment a year ago with GPT-4, its most powerful model to date.
Musk—who cofounded OpenAI but left in 2018—is at least as entitled as anyone to come up with his own definition of AGI. His complaint describes it as “a general purpose artificial intelligence system—a machine having intelligence for a wide variety of tasks like a human.” That does sound like GPT-4 as I, a mere layperson, experience it in ChatGPT Plus.
But Musk’s declaration that the AGI era is already upon us is hardly the consensus among AI scientists. Even those who think it’s not far off predict arrival dates that are at least a few years away. And GPT-4 falls well short of meeting OpenAI’s own explanation of the term: “A highly autonomous system that outperforms humans at most economically valuable work.”
Consider the evidence:
- GPT-4 isn’t remotely autonomous; indeed, it does its best work when humans provide plenty of hand-holding in the form of detailed prompts.
- The world is still in the process of figuring out what tasks GPT-4 can do, and we frequently overrate its competence.
- That’s not even getting into the fact that OpenAI’s reference to “most economically valuable work” suggests that true AGI may involve not just software but also sophisticated robotics that don’t exist yet.
To guess when OpenAI (or a rival such as Google, Anthropic, Meta, Mistral, or Perplexity) might reach AGI, as OpenAI defines it, is to expect that it’ll be an obvious moment in time. But OpenAI’s definition, like all the others, is squishy and difficult to put to a conclusive test. To riff on Supreme Court Justice Potter Stewart’s famous comment about pornography, maybe we’ll know it when we see it. At the moment, however, I’m convinced that obsessing over AGI’s existence or nonexistence is counterproductive.
The whole notion of AGI is predicated on the assumption that AI started out dumber than a human but could someday match or exceed our level of thinking. Already, though, generative AI is different than human intelligence—far closer to omniscient than any individual flesh-and-blood thinker, yet also preternaturally gullible and prone to blurring fact and fiction in ways that don’t map to common human frailties. That’s because it’s a predictive engine, trained to string together words without truly understanding them. If its present trajectory of simulated brilliance mixed with boneheadedness continues, it might wander off in a direction far afield from most definitions of AGI.
Even if the world lands on a new, more inclusive definition of AGI, it may be hard to prove whether a particular LLM has attained it. Musk’s lawsuit cites proof points of GPT-4’s reasoning power, such as its scoring in the 90th percentile on the Uniform Bar Exam for lawyers and the 99th percentile on the GRE Verbal Assessment. That it can do so is astounding. But acing tests is not synonymous with performing useful work. And even if it were, who gets to decide how many tests an LLM must pass before it’s achieved AGI rather than just bobbled somewhere in its vicinity?
For decades, the Turing Test (which a computer would pass by fooling a human into thinking that it, too, was human) was computer science’s beloved thought experiment for determining when AI had gotten real. Strangely enough, it’s useless as a tool for assessing today’s LLM-based chatbots: not because they know too little to fake humanity convincingly, or can’t express it glibly enough, but because they betray their artificiality by being so good at churning out endless wordage on more topics than any human knows. AGI could end up in a similar predicament: a benchmark, devised by humans, that’s rendered obsolete by the technology it was meant to measure.
DID YOU HEAR THE ONE ABOUT THE “MAC CAR?”
Last week, Apple’s long, expensive quest to build an autonomous EV entered its rearview-mirror phase, a sad fate my colleague Jared Newman blamed on the company’s sometimes counterproductive pursuit of perfection. Wondering what an Apple car would be like has been an obsession for techies since 2012, when news broke that Steve Jobs had toyed with getting into the automobile business even before there was an iPhone. Or maybe it started in 2008, when reports of a meeting between Steve Jobs and Volkswagen’s CEO led to wild speculation about an “iCar.”
Or how about 1998? According to Snopes, that’s when a joke involving cars designed by software companies began spreading like crabgrass across the internet, eventually evolving into an urban legend involving a Bill Gates keynote and a General Motors press release. Along with a Microsoft car that crashed twice a day and occasionally needed its engine replaced for no apparent reason, it mentioned a “Mac car” that “was powered by the sun, was reliable, five times as fast, twice as easy to drive—but would only run on 5% of the roads.”
If everybody says ‘the earth is a square’, these A.I.s will say ‘the earth is a square’: there is no intelligence, it’s just a summary of what everybody says. If one day a machine is able to say ‘you’re wrong, the earth is a sphere, and here is the reason…’, then I will say ‘ok, you’re a real A.I.’
In a way, human intelligence is like that.
People used to think the earth was the centre of the universe, because everybody said so. Would you say that only Nicolaus Copernicus was intelligent?
Other people had the capability to do what Copernicus did, but lacked desire/resources. An LLM will never have the capability for a novel idea.
They also would lack the “desire” and resources to do so.
They can’t act of their own volition without input, and they can’t access systems they were not designed to interface with and data that they were not trained on or given through the input.
I think it’s preferable that way, given the immense overhyping of this technology that is occurring, and the existing cases of misuse.
An LLM may not. Will an AI?
If by AI, you mean the things people are making today and calling AI, no, they’re all basically powerful regression algorithms. They can be strong tools for people to use to solve complex problems. Anything a program does will be based on what it was programmed to do; at best it will find novel things based on being programmed to look for novel things randomly, and people will test and confirm those guesses. They already kind of do this for some medical purposes. Is trying an uncountably large number of randomized guesses and giving a probability for success based on historical data intelligent?
Could a true AI exist like we see in SciFi, maybe?
How do you get AI to change its answer when one researcher discovers what was generally accepted as fact is no longer true?
They have to update the training data with the latest findings. Some AI models may use external sources to fetch the most current information.
But the most current information does not mean it is the most correct information.
I could publish 100 papers on arXiv claiming the Earth is, in fact, a cube - but that doesn’t make it true even though it is more recent than the sphere claims.
Some mechanism must decide what is true and send that information to train the model - that act of deciding is where the actual intelligence in this process lives. Today that decision is made by humans, who curate the datasets used to train the model.
There’s no intelligence in these current models.
Flat earthers have access to all the information yet they still decide that flat earth is true.
I am not saying that current AI is intelligent. I am just seeing similarity between how human and AI process information.
Humans are intelligent animals, but humans are not only intelligent animals. We do not make decisions and choose which beliefs to hold based solely on sober analysis of facts.
That doesn’t change the general point that a model given the vast corpus of human knowledge will prefer the most oft-repeated bits to the true bits, whereas we humans have muddled our way through to some modicum of understanding of the world around us by not doing that.
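For what it’s worth, the “prefers the most oft-repeated bits” point can be shown with a toy. This is only a sketch of a word-pair frequency counter, not how any real LLM is built or trained, and the names (`train_bigram_counts`, `most_likely_next`) and the four-sentence corpus are made up for illustration:

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical corpus: the false claim is simply repeated more often.
corpus = ["the earth is flat"] * 3 + ["the earth is round"]
model = train_bigram_counts(corpus)
prediction = most_likely_next(model, "is")
print(prediction)
```

Nothing in the counter checks which claim is true; it only reflects which continuation appeared most often, so changing the ratio of sentences in the corpus changes the answer.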
Human behavior is like that: we repeat what we heard, so A.I. is just copying human behavior; it’s not proof of ‘intelligence’ for me. (I’m in no way intelligent.) But maybe these A.I.s will be able to have a ‘relationship intelligence’: knowing how to manipulate human behavior is probably a kind of intelligence.
That’s a problem with the technological comparison model of intelligence. We have dealt with it since the inception of calculators. Humans are not machines. Machines can emulate behaviors, but they are not and will never be like human entities. We have used them as metaphors for everything from mathematical thinking to memory and visual processing. But the truth of the matter is that, both neurologically and phenomenologically speaking, we don’t think like machines, and we are not anything like them.
Humans don’t just repeat what we hear, just like artists don’t just mix and mash all the art they have seen in their lives the way Stable Diffusion and other image generators do. There are a lot of things happening in our brains, underlying the superficial process of stringing words together or composing a drawing or painting, that machines cannot do.
Which AI would it be: emotional, logical, spatial, etc.? Because there is not one intelligence in the human mind, but several.
What will be truly amazing will be an AM, an artificial mind.
It seems to me that the long experiment playing out may include simply waiting to see if there is a critical mass threshold to be reached (ie, of this LLM “simply repeating what everyone agrees on” idea) that allows the process to evolve into something closer to “thinking.”
I’m sure I don’t know enough about LLMs, but as others here are pointing out, this seeming regurgitation of the already-known does seem to provide the foundation or potential for generating hypotheses and/or “new” ideas.
> critical mass threshold to be reached
There isn’t. No amount of computational accumulation can result spontaneously in a mind. There’s not enough flexibility and malleability in the underlying processes (algorithms) that run LLMs. The process never changes; therefore it cannot evolve into something other than what it already is. It’s like adding pixels to a monitor: no amount of pixels will ever spontaneously morph into reality. Switching from a 2D representation of a 3D world to the real thing is not something that is possible.
Funny how goals have evolved. From making a machine that is like a human to making one that is not.
Huh? I think you may have misinterpreted his comment. He’s looking for a machine that is like a human (capable of reason).
If everybody says ‘the earth is a square’, he wants AI to come to a different conclusion.
Indeed. Because what everybody says is the data it’s trained on and has nothing to do with what people actually know/understand. Kinda like how I can say “the sky is orange”… crazy, right?
The idea is that a “real” AI should be able to detect implied inconsistencies in the training data and point them out?
Indeed. It should be able to reason. Like a human. Not a hard idea to grasp tbh.
What does “reason like a human” mean to you?
I heard this is where Q* learning comes into play; this algorithm will allow it to reason.
It’s all bullshit marketing hype until we actually see it. There’s no reason to believe AI will advance better than linearly in the next 5-10 years.
yes but the algorithm is real though
Ah, irony. It’s common for people to say “AI art generators are just collage machines, copying and pasting bits of existing images together, unable to generate anything novel.” I guess there’s no intelligence there either, they’re just parroting each other.
This just fundamentally doesn’t understand what artificial general intelligence means. It’s not a fancy way of saying “human but smarter”. That’s just wrong. It’s an artificial intelligence that’s good at a lot of different things. You know. General. Someday, if we live long enough, we will create an AI capable of figuring things out that it wasn’t designed for and we didn’t teach it. Maybe that will be tomorrow. Maybe it’s still centuries away. It’s actually really hard to figure out how long it will take us. Making a really good text generation algorithm doesn’t make the concept of learning more than one thing obsolete though. And predicting what text goes in a bar exam after being given a massive sample of bar exams isn’t the same thing as learning to be a lawyer.
Tech bros with more money than sense suing each other over terms they don’t understand is not a rational system to base research off of, and we should ignore them.
Yeah absolutely. Even AI as a term has become a crock of shit because it’s been latched on to by companies to market their products in the AI equivalent of the dotcom boom.
Artificial Intelligence was once a sufficient term for Artificial General Intelligence. Now any old algorithm is being labelled AI to sell it.
But the terms don’t matter - the concept is sound but it’s further away than we probably expect because so much crap is being sold to make a quick buck.
ChatGPT is basically beta software and it is practically useless because it’s inaccurate. You can’t use a tool in business, government or healthcare when it can be wrong and, worse, so confidently wrong. It’s an impressive tool but they still haven’t got that working well, let alone any further “advances”.
And blindly throwing data at LLMs and hoping to stumble on AGI is not going to work - crudely, that is the approach of many of the cowboy outfits out there claiming to be innovating in AI. That includes big tech companies who have jumped on the bandwagon over the last 18 months.
The term “Artificial Intelligence” is an umbrella term for a wide range of algorithms and techniques that has been in use by the scientific and engineering communities for over half a century. The term was brought into use by the Dartmouth workshop in 1956. It’s perfectly applicable to LLMs and other similar generative algorithms being used today, and many less sophisticated ones as well. “Artificial general intelligence” is a subset of AI.
The article title makes no sense, and neither does the article itself.
No one is saying GPT has achieved AGI. Is it a strawman argument?
> AGI could end up in a similar predicament: a benchmark, devised by humans, that’s rendered obsolete by the technology it was meant to measure.
Just because we can’t measure it doesn’t mean it is obsolete.
We are approaching another AI Winter. AI goes through hype cycles:
- Some flashy new capability captures everyone’s imaginations.
- Companies and researchers exaggerate the possibilities and people are led to believe that full AI is right around the corner.
- Then with familiarity people realize the new capability isn’t really that revolutionary, and the term “AI” is distrusted again, for another decade or two.
It’s been that way since the 1950s. Read more about it: https://en.wikipedia.org/wiki/AI_winter
> “was powered by the sun, was reliable, five times as fast, twice as easy to drive—but would only run on 5% of the roads.”
Which ones? The ones that go downhill??
There are ethical implications to GAI as well.
I’m so confused by “it’s already here” “what even defines agi” etc. “what do we mean”
Like, is it really that hard to understand the difference between a function you call that returns something and an autonomous entity? Is that so hard to distinguish?