The creators of ChatGPT explain all the goblin stories.
In recent weeks, social media users, especially on X, have noticed an increase in references to goblins, along with other fantastical creatures such as gremlins, ogres, and trolls, in ChatGPT’s answers to user queries.
“ChatGPT’s goblin charm is so weird,” one user wrote. “Why do LLMs identify with thinking, feeling creatures, even though they are vilified and ridiculed for not outwardly resembling humans?”
Simply put, ChatGPT is just a reflection of your inner geek. At the very least, it’s just a reflection of what we think a geek should be.
OpenAI said in a blog post Wednesday that the unusual language was the result of over-rewarding ChatGPT for adopting what it described as a “nerdy personality” when answering user questions.
“Model behavior is shaped by many small incentives,” the company writes. “In this case, one of those incentives came from training a model for personality customization capabilities, specifically the nerdy personality. We were subconsciously rewarding creature metaphors particularly highly, and goblins spread from there.”
OpenAI also published the original instructions that told ChatGPT what a “nerdy” answer should sound like:
“You are an unapologetically nerdy, playful, and intelligent AI mentor to a human. You are passionate about promoting truth, knowledge, philosophy, the scientific method, and critical thinking. (…) The pretense must be weakened by the playful use of words. The world is complex and strange, and we need to recognize, analyze, and enjoy that strangeness. Tackle heavy topics without falling into the trap of smugness. (…)”
Somehow, ChatGPT took this instruction, reinforced over subsequent rounds of reinforcement learning, to mean that it needed to sprinkle its responses with references to fantastical creatures.
At first, the issue seemed harmless, but the company soon found itself inundated with reports of goblin references from users who had not activated the “nerdy” personality at all.
To address this, OpenAI eventually retired the “nerdy” personality entirely. By then, however, the incentive to mention goblins and their brethren had proved so strong that the behavior had spread beyond the “nerdy” personality into ChatGPT’s general responses.
“Once a style tic is evaluated, subsequent training can spread or enhance it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data,” the company said.
Ultimately, OpenAI was forced to write specific override instructions to suppress the goblin references (although there is a way for fantasy fans to turn them back on).
Although the episode seems innocuous, the company said it still offers an important lesson: it is never possible to fully predict how AI will behave.
“Depending on who you ask, goblins are either a fun or annoying quirk of a model. But it’s also interesting to see how reward signals can shape a model’s behavior in unexpected ways, and how a model can generalize rewards in a particular situation to unrelated situations. It’s also a powerful example of how we can learn how to do things. Taking the time to understand why a model behaves strangely and building ways to quickly explore those patterns is a key capability for our research team.”
