The AI behind ChatGPT really does seem to be getting dumber — but no one can quite figure out why

  • It's not just you: new research suggests ChatGPT's AI model really is getting dumber.

  • A paper from Stanford and UC Berkeley scientists found GPT-4's performance had dropped recently.

  • The only mystery remaining now is why.

There's been a growing feeling for a while now that the AI model behind ChatGPT is, frankly, getting dumber. 

There's now some hard evidence to suggest that OpenAI's prized possession really might be losing some of its sheen.

A new paper published on Tuesday by researchers at Stanford University and UC Berkeley, exploring how ChatGPT's behavior has changed over time, found that the performance of the chatbot's underlying GPT-3.5 and GPT-4 AI models does, in fact, "vary greatly."

Not only does performance vary, but GPT-4, the more advanced "multimodal" model that can understand images as well as text, seems to have performed a whole lot worse over time in the tasks it was tested on.

These tasks were sufficiently varied to make sure the model was really being given a fair assessment of its capabilities: math problems, responses to sensitive questions, generating code, and visual reasoning were all part of the evaluation process.

But even with a variety of tasks to show its chops, GPT-4 came out looking pretty underwhelming.

It was found to have 97.6% accuracy in identifying prime numbers in March, compared with a shocking 2.4% in June; it produced "more formatting mistakes in code generation" last month than it did earlier this year; and it was generally "less willing to answer sensitive questions."

No one can quite figure out why GPT-4 is changing 

What the research doesn't seem to identify is why this performance drop has happened.

"The paper doesn't get at why the degradation in abilities is happening. We don't even know if OpenAI knows this is occurring," Ethan Mollick, a professor of innovation at Wharton, tweeted in response to the paper.

 

If OpenAI hasn't picked up on it, many in the AI community certainly have. Roblox product lead Peter Yang noted in May that GPT-4's answers are generated faster than they were previously "but the quality seems worse."

"Perhaps OpenAI is trying to save costs," he tweeted.

OpenAI's developer forum, meanwhile, is hosting an ongoing debate about a decrease in the quality of responses.

GPT-4 is the AI model underlying the more advanced version of ChatGPT that paying subscribers get access to, so that's a bit of a problem for OpenAI. Its most advanced large language model should be giving it an edge in an increasingly fierce competition with its rivals.

As my colleague Alistair Barr noted earlier this month, many in the AI community are putting the deteriorating quality of GPT-4 down to a "radical redesign" of the model.

OpenAI has pushed back on this idea, with Peter Welinder, VP of product at OpenAI, tweeting last week: "No, we haven't made GPT-4 dumber. Quite the opposite: we make each new version smarter than the previous one."

He may want to rethink that position after seeing this research.

Matei Zaharia, chief technology officer at Databricks and associate professor of computer science at UC Berkeley — as well as one of the co-authors of the research paper — tweeted that it "definitely seems tricky to manage quality" of responses of AI models.

 

"I think the hard question is how well model developers themselves can detect such changes or prevent loss of some capabilities when tuning for new ones," he tweeted. 

Some, like Princeton computer science professor Arvind Narayanan, have pointed out important caveats in GPT-4's defense.

In a Twitter thread, he notes that the degradations reported in the paper might be "somewhat peculiar" to the tasks GPT-4 was given and the evaluation method used. On the code generation test, for example, he points out that GPT-4 now adds "non-code text to its output," but the authors don't evaluate the correctness of the code itself.

That said, it's hard to ignore the questions about GPT-4's quality when a whole community of AI devotees is asking them. OpenAI had better be sure it has the answers.

Read the original article on Business Insider