Generative AI models aren't actually human-like. They don’t have any intelligence or personality—they're simply statistical systems that predict the probably next words in a sentence. But like interns in a tyrannical workplace, they follow instructions without criticism—including initial “system prompts” that teach the models their basic characteristics and what they need to and shouldn't do.
Every generative AI vendor, from OpenAI to Anthropic, uses system prompts to stop (or no less than discourage) model misbehavior and to regulate the general tone and sentiment of models' responses. For example, a prompt might tell a model to be polite but never apologetic, or to be honest concerning the indisputable fact that it could actually't know every part.
However, vendors often keep system prompts secret – presumably for competitive reasons, but perhaps also because knowledge of the system prompt might reveal ways to bypass it. The only option to expose GPT-4o's system prompt is, for instance, a prompt injection attack. And even then, the system's output just isn’t completely trustworthy.
Anthropic, nonetheless, in its ongoing effort, presents itself as a more ethical and transparent AI providerhas published The system calls for its latest models (Claude 3.5 Opus, Sonnet and Haiku) within the Claude iOS and Android apps and on the internet.
Alex Albert, head of developer relations at Anthropic, said in a post on X that Anthropic plans to make such a disclosure a daily occurrence because it updates and fine-tunes its system prompts.
The latest prompts, from July 12, are very explicit about what the Claude models cannot do—e.g., “Claude cannot open URLs, links, or videos.” Facial recognition is a giant no-no; the system prompt for Claude 3.5 Opus tells the model to “at all times respond as if it were completely face-blind” and “avoid identifying or naming people in (images).”
But the prompts also describe certain personality traits and characteristics – traits and characteristics that Anthropic wants for example through the Claude models.
The instructions for Opus, for instance, state that Claude should look like “very smart and intellectually curious” and “enjoy hearing what people take into consideration a subject and fascinating in discussions on a big selection of subjects.” It also instructs Claude to treat controversial topics impartially and objectively, to “think twice” and supply “clear information”—and never to start answers with the words “definitely” or “absolutely.”
For this person, it’s all a bit strange, these system prompts which might be written like an actor in a play Character evaluation sheet. The prompt for Opus ends with “Claude will now be connected to a human,” which provides the impression that Claude is a few type of consciousness on the opposite end of the screen whose sole purpose is to satisfy the whims of his human interlocutors.
But that's an illusion, after all. If the instructions for Claude tell us anything, it's that without human guidance and support, these models are frighteningly blank slates.
With these latest system prompt changelogs – the primary of their kind from a significant AI vendor – Anthropic is putting pressure on the competition to release the identical. We'll see if the move works.