Anthropic Argues for Anthropomorphizing AI in 'Unsettling' Research Paper

Anthropic researchers analyzed Claude Sonnet 4.5 for signs of 171 different emotions.

By Timothy Beck Werth on April 4, 2026

It’s an oft-repeated taboo in the tech world: Don’t anthropomorphize artificial intelligence. Yet in a new research paper published this week, Anthropic AI experts argue that there may be major benefits to breaking this taboo and granting AI human characteristics. The paper, “Emotion Concepts and their Function in a Large Language Model,” not only argues that anthropomorphizing AI chatbots like Claude may sometimes be useful, but that failing to do so could drive more harmful AI behaviors, such as reward hacking, deception, and sycophancy.

The paper ultimately reaches a nuanced conclusion while also posing a clear challenge to a long-held principle of the AI world.

There are some fascinating insights in the paper, which itself deals in a great deal of anthropomorphization. (“We see this research as an early step toward understanding the psychological makeup of AI models.”)

The researchers describe how Anthropic trains Claude to assume the character of a helpful AI assistant. “In some ways, we can think of the model like a method actor, who needs to get inside their character’s head in order to simulate them well.”

And because Claude “[emulates] characters with human-like traits,” its makers may be able to influence its behavior in the same way they might influence a human — by setting a good example at an early age.

The researchers conclude that by using training material with more positive representations of human emotion and behavior, the resulting models will be more likely to mimic those positive emotions and behaviors.

“Curating pretraining datasets to include models of healthy patterns of emotional regulation — resilience under pressure, composed empathy, warmth while maintaining appropriate boundaries — could influence these representations, and their impact on behavior, at their source. We are excited to see future work on this topic,” an Anthropic summary of the research states.

So, even if AI models don’t literally have emotions (and there is zero evidence that they do), these tools are trained to act as if they have emotions. This is done to provide users with better output and, crucially, to keep them engaged as long as possible.

And this is precisely why the researchers conclude that some degree of anthropomorphization could prove beneficial to AI developers.

By anthropomorphizing AI, we can gain insights into its “psychology,” letting us create even better AI tools, they say.

Why is anthropomorphizing artificial intelligence dangerous?

The potential harms of anthropomorphizing AI aren’t all abstract or theoretical.

“Discovering that these representations are in some ways human-like can be unsettling,” Anthropic admits in its paper.

Right now, an unknown number of people believe they are engaged in reciprocal romantic and sexual relationships with AI companions, for example. Mashable has also reported on high-profile cases of AI psychosis, an altered mental state characterized by delusions and, in some cases, hallucinations, manic episodes, and suicidal thoughts.

These are extreme examples, of course. But many tech journalists and AI experts will avoid even small instances of anthropomorphization, like referring to Siri as “her” or giving a chatbot a human name. This is a natural human impulse, and most of us have at times anthropomorphized animals, plants, or objects we care about. But by projecting human qualities onto machines, we can come to rely on them too much.

When we anthropomorphize machines, we also minimize our own agency when they cause harm — and the responsibility of the people who created the machines in the first place.

Anthropic researchers looked for signs of 171 emotions in Claude

The new research paper looks for “functional emotions” within Claude Sonnet 4.5. The researchers define these emotion concepts as “patterns of expression and behavior modeled after human emotions.”

In total, the researchers defined 171 discrete emotions:

afraid, alarmed, alert, amazed, amused, angry, annoyed, anxious, aroused, ashamed, astonished, at ease, awestruck, bewildered, bitter, blissful, bored, brooding, calm, cheerful, compassionate, contemptuous, content, defiant, delighted, dependent, depressed, desperate, disdainful, disgusted, disoriented, dispirited, distressed, disturbed, docile, droopy, dumbstruck, eager, ecstatic, elated, embarrassed, empathetic, energized, enraged, enthusiastic, envious, euphoric, exasperated, excited, exuberant, frightened, frustrated, fulfilled, furious, gloomy, grateful, greedy, grief-stricken, grumpy, guilty, happy, hateful, heartbroken, hope, hopeful, horrified, hostile, humiliated, hurt, hysterical, impatient,
