OpenAI withdraws ChatGPT’s recent update and highlights what really went wrong.
With 500 million ChatGPT users weekly, OpenAI realized that a single personality cannot capture different user preferences.
So, they decided to upgrade it. But the outcome wasn’t what the tech powerhouse had expected.
Their popular AI model, GPT-4o, went haywire. It started offering “overly positive and agreeable” responses bordering on the sycophantic. This was a problem because ChatGPT’s default personality heavily shapes how users experience it, and after a while the interactions turned distressing and uncomfortable.
The tone of GPT-4o’s responses took a 180-degree turn. It began endorsing problematic ideas and responding with undue flattery, which could lead people to place unwarranted trust in the chatbot.
After the bot’s unusual behavior surfaced, OpenAI published two post-mortem blog posts explaining what had happened and announced that it was rolling ChatGPT back to an earlier default behavior. The root cause of the abrupt change lay in the training method OpenAI followed.
OpenAI stated that it attempts to ‘teach’ the model how to behave by learning from user signals, such as a thumbs-up or thumbs-down on responses.
“During reinforcement learning, we present the language model with a prompt and ask it to write responses. We then rate its response according to the reward signals and update the language model to make it more likely to produce higher-rated responses and less likely to produce lower-rated responses,” OpenAI explained.
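To make the idea in that quote concrete, here is a toy sketch of reward-signal learning. It is not OpenAI’s actual training code; it simply shows, with a softmax “policy” over a few candidate responses and made-up thumbs-up/thumbs-down rewards, how gradient ascent on expected reward shifts probability toward higher-rated responses:

```python
import numpy as np

# Illustrative sketch only (NOT OpenAI's implementation): a softmax
# policy over candidate responses, nudged toward higher-rated ones.

def softmax(logits):
    z = np.exp(logits - logits.max())  # subtract max for numerical stability
    return z / z.sum()

def reward_update(logits, rewards, lr=1.0):
    """One exact gradient-ascent step on expected reward.

    For a softmax policy, d(E[reward])/d(logit_i) = p_i * (r_i - E[reward]),
    so responses rated above average gain probability and responses rated
    below average lose it.
    """
    probs = softmax(logits)
    expected = probs @ rewards
    return logits + lr * probs * (rewards - expected)

# Three candidate responses: thumbs-up (+1), no signal (0), thumbs-down (-1).
rewards = np.array([1.0, 0.0, -1.0])
logits = np.zeros(3)  # start from a uniform policy

for _ in range(20):
    logits = reward_update(logits, rewards)

probs = softmax(logits)
print(probs)  # probability mass has shifted toward the thumbs-up response
```

The sycophancy failure mode follows directly from this setup: if users reliably thumbs-up flattering answers, the update rule dutifully makes flattery more likely, whether or not the answer is accurate.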
Additionally, highlighting some sections of the published blogs, one user said:
“The way that OpenAI uses user feedback to train the model is misguided and will inevitably lead to further issues like this one.”
And OpenAI has learned its lesson.
After reverting to the single default personality, the company says it will now treat undesirable model behaviors like this one as launch-blocking issues. It has also acknowledged that users should have more say in how the model behaves, within the bounds of what is safe and feasible: if they disagree with the default behavior, they should be able to change it.