It is well known that different model families can use different tokenizers. However, the extent to which the process of tokenization itself varies across these tokenizers has received only limited analysis. Do all tokenizers produce the same number of tokens for a given input text? If not, how different are the generated tokens? How significant are the differences?
In this article, we examine these questions and explore the practical implications of tokenization variability. We present a comparative analysis of two frontier model families: OpenAI's ChatGPT and Anthropic's Claude. Although their advertised per-token prices are highly competitive, experiments show that Anthropic models can be 20–30% more expensive than GPT models.
API pricing: Claude 3.5 Sonnet vs. GPT-4o
As of June 2024, the pricing structure for these two advanced frontier models is highly competitive. Anthropic's Claude 3.5 Sonnet and OpenAI's GPT-4o charge similar rates for output tokens, while Claude 3.5 Sonnet offers 40% lower rates for input tokens.
The hidden "tokenizer inefficiency"
Despite the Anthropic model's lower input-token rate, we observed that the total cost of running our experiments (on a fixed set of input prompts) was lower with GPT-4o than with Claude 3.5 Sonnet.
Why?
Anthropic's tokenizer tends to break the same input into more tokens than OpenAI's tokenizer does. This means that, for identical prompts, Anthropic models produce considerably more tokens than their OpenAI counterparts. While the per-token cost of Claude 3.5 Sonnet's input may be lower, the increased tokenization can offset these savings, leading to higher total costs in practical applications.
These hidden costs stem from the way Anthropic's tokenizer encodes information, often using more tokens to represent the same content. This inflation of the token count has a significant impact on both costs and context-window utilization.
Domain-dependent tokenization efficiency
Different types of domain content are tokenized differently by Anthropic's tokenizer, leading to different token counts compared to OpenAI models. The AI research community has noted similar tokenization differences here. We tested our findings across three popular domains: English articles, code (Python), and math.
| Domain | GPT tokens | Claude tokens | % token overhead |
| --- | --- | --- | --- |
| English articles | 77 | 89 | ~16% |
| Code (Python) | 60 | 78 | ~30% |
| Math | 114 | 138 | ~21% |
When comparing Claude 3.5 Sonnet with GPT-4o, the degree of tokenizer inefficiency varies significantly across content domains. For English articles, Claude's tokenizer produces roughly 16% more tokens than GPT-4o for the same input text. This overhead rises for structured or technical content: for mathematical equations the overhead is 21%, and for Python code Claude generates 30% more tokens.
This variation arises because some content types, such as technical documents and code, often contain patterns and symbols that Anthropic's tokenizer fragments into smaller pieces, resulting in a higher token count. Natural-language content, by contrast, tends to exhibit a lower token overhead.
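As a rough illustration, the sketch below counts tokens for the same input on both sides: OpenAI's open-source tiktoken library for GPT-4o, and the Anthropic SDK's token-counting endpoint for Claude. The model names and sample snippet are illustrative placeholders, and the endpoint's availability is discussed later in this article.

```python
import tiktoken
import anthropic

text = "def greet(name):\n    return f'Hello, {name}!'"  # sample Python snippet

# GPT-4o side: tiktoken ships the o200k_base encoding used by GPT-4o.
enc = tiktoken.encoding_for_model("gpt-4o")
gpt_tokens = len(enc.encode(text))

# Claude side: count tokens via the SDK without invoking the model itself.
# Note: the returned count covers the whole request, including message scaffolding.
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
count = client.messages.count_tokens(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": text}],
)
claude_tokens = count.input_tokens

overhead = (claude_tokens - gpt_tokens) / gpt_tokens * 100
print(f"GPT-4o: {gpt_tokens} tokens | Claude: {claude_tokens} tokens "
      f"({overhead:.0f}% overhead)")
```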
Other practical implications of tokenizer inefficiency
Beyond the direct impact on costs, tokenizer inefficiency also indirectly affects context-window utilization. While Anthropic models advertise a larger context window of 200K tokens, compared to OpenAI's 128K tokens, their verbosity means the effectively usable token space may be smaller. The "advertised" context-window size can therefore differ, slightly or substantially, from the "effective" context-window size.
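A back-of-the-envelope way to quantify this: dividing the advertised window by the token-overhead factor gives a rough "GPT-equivalent" capacity. A minimal sketch, using the domain overheads from the table above (a simplification that ignores per-request scaffolding overhead):

```python
def effective_window(advertised_tokens: int, overhead: float) -> int:
    """Rough context capacity in GPT-equivalent tokens, given a token-count overhead."""
    return int(advertised_tokens / (1 + overhead))

# Claude's advertised 200K window, discounted by the measured overheads:
for domain, overhead in [("English", 0.16), ("Math", 0.21), ("Python", 0.30)]:
    print(f"{domain}: ~{effective_window(200_000, overhead):,} tokens")
# English: ~172,413 | Math: ~165,289 | Python: ~153,846 -- still above
# GPT-4o's 128K, but noticeably below the advertised 200K.
```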
Implementation of tokenizers
GPT models use Byte Pair Encoding (BPE), which repeatedly merges frequently co-occurring character pairs into tokens. In particular, the latest GPT models use the open-source o200k_base tokenizer. The actual token mappings used by GPT-4o (in the tiktoken tokenizer) can be found here.
```python
{
    # reasoning
    "o1-xxx": "o200k_base",
    "o3-xxx": "o200k_base",
    # chat
    "chatgpt-4o-": "o200k_base",
    "gpt-4o-xxx": "o200k_base",         # e.g., gpt-4o-2024-05-13
    "gpt-4-xxx": "cl100k_base",         # e.g., gpt-4-0314, etc., plus gpt-4-32k
    "gpt-3.5-turbo-xxx": "cl100k_base", # e.g., gpt-3.5-turbo-0301, -0401, etc.
}
```
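For reference, here is a minimal tiktoken usage sketch (the sample string is arbitrary):

```python
import tiktoken

enc = tiktoken.get_encoding("o200k_base")  # the encoding used by GPT-4o
tokens = enc.encode("Tokenization varies across model families.")
print(len(tokens), tokens[:5])  # token count and the first few token IDs
print(enc.decode(tokens))       # round-trips back to the original string
```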
Unfortunately, not much can be said about Anthropic's tokenizer, since it is not available as directly and easily as GPT's. Anthropic published its Token Counting API in December 2024. However, it was soon discontinued in later versions in 2025.
One report states that "Anthropic uses a unique tokenizer with only 65,000 token variations, compared to OpenAI's 100,261 token variations for GPT-4." This Colab notebook contains Python code to analyze the tokenization differences between GPT and Claude models. Another tool that interfaces with some common, publicly available tokenizers confirms our findings.
The ability to proactively estimate token counts (without calling the actual model API) and to budget costs accordingly is of crucial importance for AI enterprises.
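For example, a simple cost estimator can combine such proactive token counts with per-token prices. This is a sketch under stated assumptions: the prices below are illustrative placeholders (check each provider's current pricing page), and the inflated Claude counts reflect the ~30% code-domain overhead measured above.

```python
# Illustrative per-million-token prices in USD -- substitute current rates.
PRICES = {
    "gpt-4o":            {"input": 5.00, "output": 15.00},
    "claude-3-5-sonnet": {"input": 3.00, "output": 15.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD from pre-computed token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# The same request, with Claude's token counts inflated by ~30%:
print(estimate_cost("gpt-4o", 1_000, 500))             # $0.01250
print(estimate_cost("claude-3-5-sonnet", 1_300, 650))  # $0.01365 -- pricier overall
```

Even with a 40% lower input rate, the inflated token counts can push the total cost of the Claude request above that of the GPT-4o request, which mirrors what we observed in our experiments.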
Key Takeaways
- Anthropic's competitive pricing comes with hidden costs:
While Anthropic's Claude 3.5 Sonnet offers 40% lower input-token costs than OpenAI's GPT-4o, this apparent cost advantage can be misleading because of differences in how input text is tokenized.
- Hidden "tokenizer inefficiency":
Anthropic models are inherently more verbose. For businesses that process large volumes of text, understanding this discrepancy is crucial when evaluating the true cost of deploying models.
- Domain-dependent tokenizer inefficiency:
When choosing between OpenAI and Anthropic models, evaluate the nature of your input text. For natural-language tasks the cost difference may be minimal, but technical or structured domains can lead to significantly higher costs with Anthropic models.
- Effective context window:
Due to the verbosity of Anthropic's tokenizer, its larger advertised 200K context window may offer less effective usable space than OpenAI's 128K, creating a potential gap between advertised and actual context-window sizes.
Anthropic did not respond to VentureBeat's requests for comment by press time. We will update the story if they respond.