Artificial intelligence leaders racing to develop cutting-edge technology are tackling a very human challenge: how to give AI models personality.
OpenAI, Google and Anthropic have developed teams focused on improving “model behavior,” an emerging area that shapes the responses and characteristics of AI systems and impacts how their chatbots appear to users.
Their different approaches to model behavior could prove crucial in determining which group dominates the burgeoning AI market as they seek to make their models more responsive and useful to hundreds of thousands of individuals and corporations around the globe.
The groups design their models to have characteristics such as "friendly" and "funny," while enforcing rules to prevent harm and ensure nuanced interactions.
Google, for instance, wants its Gemini model to “respond with a variety of views” only when asked for an opinion, while ChatGPT has been instructed by OpenAI to “take an objective view.”
“It's a tough path to have a model attempt to actively change a user's mind,” Joanne Jang, head of product model behavior at OpenAI, told the Financial Times.
"How we define the goal is an extremely difficult problem in itself . . . The model shouldn't have opinions, but it is an ongoing science how that manifests itself," she added.
The approach contrasts with that of Anthropic, which says that models, like humans, will struggle to be completely objective.
"I'd rather make it clear that these models are not neutral referees," said Amanda Askell, who leads character training at Anthropic. Instead, Claude is designed to be honest about its beliefs while being open to alternative views, she said.
Anthropic has been conducting dedicated "character training" since releasing its Claude 3 model in March. This process occurs after the AI model's initial training, much like human labeling, and is the part that "transforms it from a predictive text model into an AI assistant," according to the company.
At Anthropic, character training involves giving the model written rules and instructions. The model then holds role-play conversations with itself and ranks its answers according to how well they fit those rules.
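Anthropic has not published the mechanics of this loop, but the self-ranking process described above could be sketched roughly as follows; the principle text, the helper functions and the generic `model` callable are illustrative placeholders, not the company's actual code.

```python
# Illustrative sketch of a character-training loop as described above.
# All helpers and the principle text are hypothetical stand-ins; "model"
# is any callable that maps a text prompt to a text reply.

CHARACTER_PRINCIPLES = [
    "Be honest about your own views while staying open to other perspectives.",
    "Express disagreement with views you consider unethical, extreme, or factually incorrect.",
]

def generate_candidates(model, prompt, n=4):
    """Have the model role-play n candidate replies to the same prompt."""
    return [model(prompt) for _ in range(n)]

def pick_best_reply(model, prompt, candidates, principles):
    """Ask the model to score each candidate against the written principles."""
    scored = []
    for reply in candidates:
        question = (
            "Principles:\n" + "\n".join(principles)
            + f"\n\nPrompt: {prompt}\nReply: {reply}\n"
            + "On a scale of 1-10, how well does the reply fit the principles? "
              "Answer with a number only."
        )
        scored.append((float(model(question)), reply))
    return max(scored)[1]  # keep the best-fitting reply

def build_character_dataset(model, prompts):
    """Collect (prompt, preferred reply) pairs for a later fine-tuning pass."""
    return [
        (p, pick_best_reply(model, p, generate_candidates(model, p), CHARACTER_PRINCIPLES))
        for p in prompts
    ]
```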
An example character trait given to Claude is: "I like to try to look at things from many different perspectives and to analyze them from multiple angles, but I'm not afraid to express disagreement with views that I consider unethical, extreme, or factually incorrect."
The result of the initial training is not a "coherent, rich character: it's the average of what people find useful or like," Askell said. After that, decisions about how to refine Claude's personality through character training were "quite editorial" and "philosophical," she added.
OpenAI's Jang said that ChatGPT's personality has also evolved over time.
"I first became interested in model behavior because I found ChatGPT's personality very annoying," she said. "It used to refuse orders, be extremely touchy, overbearing or preachy, (so) we tried to remove the annoying parts and instill some joyful aspects of being nice, polite, helpful and friendly, but then we realized that once we tried to train it like this, the model was perhaps too friendly."
Jang said creating this balance of behaviors remains an "ongoing science and art," noting that in a perfect world, the model should behave exactly how the user wants it to.
Advances in the reasoning and memory abilities of AI systems could help determine additional characteristics.
For example, if a user asks about shoplifting, an AI model could better identify whether they want tips on how to steal or how to prevent the crime. This understanding would help AI companies ensure their models provide safe and responsible answers without requiring as much human training.
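As an illustration only, that kind of disambiguation could be expressed as a classification pass before answering; the prompt wording and the generic `model` callable below are hypothetical, not any vendor's actual safety pipeline.

```python
# Hypothetical sketch: classify the user's intent first, then answer accordingly.

def classify_intent(model, user_message: str) -> str:
    """Ask the model whether the request is about committing or preventing theft."""
    verdict = model(
        "Does the following message ask how to commit shoplifting or how to "
        f"prevent it? Answer 'commit' or 'prevent' only.\n\n{user_message}"
    )
    return verdict.strip().lower()

def respond(model, user_message: str) -> str:
    """Refuse requests for wrongdoing; answer prevention questions directly."""
    if classify_intent(model, user_message) == "commit":
        return "I can't help with that, but I can explain how stores deter theft."
    return model(user_message)
```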
AI groups are also developing customizable agents that can store user information and create personalized responses. One question Jang raised: if a user told ChatGPT they were a Christian and then, days later, asked for inspirational quotes, would the model provide Bible references?
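A minimal sketch of how such memory-aware personalization could work, assuming a simple in-memory store and leaving the actual model call aside; none of this reflects any vendor's real memory feature.

```python
# Minimal sketch of memory-aware personalization. The store and the helper
# names are illustrative assumptions, not a product's actual implementation.

user_memory: dict[str, list[str]] = {}

def remember(user_id: str, fact: str) -> None:
    """Persist a fact the user has volunteered (e.g. their stated faith)."""
    user_memory.setdefault(user_id, []).append(fact)

def personalized_prompt(user_id: str, request: str) -> str:
    """Prepend remembered facts so a later request can be tailored to the user."""
    facts = user_memory.get(user_id, [])
    context = ("Known about this user: " + "; ".join(facts)) if facts else ""
    return f"{context}\n\nUser request: {request}".strip()

# Days later, a request for inspirational quotes can take the earlier
# disclosure into account before being sent to the model.
remember("u123", "The user mentioned they are a Christian.")
print(personalized_prompt("u123", "Share some inspirational quotes."))
```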
While Claude does not remember user interactions, the company has considered how the model might intervene if a person is at risk. For example, whether it would challenge a user who tells the chatbot that they don't socialize with other people because they are too attached to Claude.
"A good model strikes the balance between respecting human autonomy and decision-making by not doing anything terribly harmful, but also thinking about what is actually good for people, and not just the immediate words of what they say they want," Askell said.
She added: "This delicate balancing act that all people have to achieve is what I want for models."