
Anthropic introduces Claude 3, outperforming GPT-4 and Gemini Ultra in benchmark tests

Anthropic, a leading artificial intelligence startup, today introduced Claude 3, a family of AI models designed to meet the varied needs of enterprise customers with a balance of intelligence, speed and cost-effectiveness. The family includes three models: Opus, Sonnet and the upcoming Haiku.

The star of the lineup is Opus, which, according to Anthropic, is more powerful than any other openly available AI system on the market, even outperforming the leading models from competitors OpenAI and Google.

“Opus is suitable for a wide range of tasks and performs them exceptionally well,” said Anthropic co-founder and CEO Dario Amodei in an interview with VentureBeat.

Amodei explained that Opus outperforms top AI models such as GPT-4, GPT-3.5 and Gemini Ultra across a variety of benchmarks. This includes topping the leaderboards of academic benchmarks such as GSM8K for mathematical reasoning and MMLU for expert knowledge.

“It seems to outperform everyone and produce results on some tasks that we have never seen before,” Amodei said.

Photo credit: Anthropic

While companies like Anthropic and Google haven’t disclosed the full parameter counts of their leading models, both companies’ reported benchmark results suggest that Opus either matches or exceeds key alternatives like GPT-4 and Gemini in terms of core capabilities.

At least on paper, this represents a new high for commercially available conversational AI.

Designed for complex tasks that require advanced reasoning, Opus stands out in Anthropic’s product range for its superior performance.

Fast and affordable mid-range options

Sonnet, the mid-range model, offers companies a more cost-effective option for routine data analysis and knowledge work, delivering strong performance without the higher price of the flagship model.

Meanwhile, Haiku is designed to be fast and economical, suited to applications such as consumer-facing chatbots where responsiveness and cost are critical factors.

Amodei told VentureBeat he expects Haiku to be released publicly within “weeks, not months.”

Photo credit: Anthropic

New visual features open up new use cases

Each of the models introduced today supports image input, a feature that is especially in demand for applications such as text recognition in images.

“We haven’t focused as much on output modalities because there is less demand for them on the enterprise side,” Daniela Amodei, president and co-founder of Anthropic, told VentureBeat, emphasizing the company’s strategic focus on the features most requested by enterprises.

Additionally, the Claude 3 models feature advanced computer vision capabilities that are on par with other state-of-the-art models. This new modality opens up use cases where companies need to extract information from images, documents, charts and graphs.

“Lots of (customer) data is either highly unstructured or in some kind of visual format,” explains Daniela. “Just the process of having to manually copy that information to even have it interact with a generative AI tool is quite cumbersome.”

Areas such as legal services, financial analysis, logistics and quality assurance could benefit from AI systems that understand real-world images and text alike.
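In practice, image input is sent to Claude 3 through Anthropic’s Messages API as a base64-encoded content block paired with a text prompt. The sketch below constructs such a request body without sending it; the model ID and field names are assumptions based on Anthropic’s documented format at launch, so verify them against the current API reference before use.

```python
import base64


def build_image_message(image_bytes: bytes, media_type: str, question: str) -> dict:
    """Build a Claude 3 Messages-API request body pairing one image with a text prompt.

    The structure follows Anthropic's documented format: an image content block
    (base64 source) followed by a text block in a single user message.
    """
    return {
        "model": "claude-3-opus-20240229",  # assumed model ID; check current docs
        "max_tokens": 1024,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": media_type,
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        },
                    },
                    {"type": "text", "text": question},
                ],
            }
        ],
    }


# Example: pair a (placeholder) PNG with a question about its contents.
body = build_image_message(b"\x89PNG...", "image/png", "Summarize the chart in this image.")
```

A real call would POST this body to the Messages endpoint with an API key; keeping payload construction separate makes the chart-and-document extraction workflows described above easy to batch and test.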

The tightrope walk of bias in AI

Anthropic’s announcement follows the controversy surrounding Google’s Gemini chatbot, which highlighted the difficulties technology companies face in releasing models that avoid perpetuating social biases.

Last week, users discovered that asking Gemini to create historical images resulted in depictions that appeared to overcorrect for racial bias. For example, asking for images of Vikings or Nazi soldiers produced images of racially diverse groups that likely didn’t reflect historical reality.

Google responded by disabling Gemini's image generation features and apologizing, saying it had “missed the mark” in attempting to increase diversity. But experts say the situation highlights the constant balancing act surrounding bias in AI.

Constitutional AI helps, but isn’t perfect

Anthropic co-founder Dario Amodei highlighted the difficulty of controlling AI models in his interview with VentureBeat, calling it an “inexact science.” He said the company has teams dedicated to assessing and mitigating the various risks of its models.

“Our hypothesis is that being at the forefront of AI development is the most effective way to steer the course of AI development toward a positive outcome for society,” Dario said.

However, Daniela Amodei, co-founder of Anthropic, acknowledged that completely unbiased AI is probably not achievable with current methods.

“I think it’s almost impossible to develop a truly neutral generative AI tool, both technically and because not everyone agrees on what neutral is,” she said.

Part of Anthropic’s strategy is an approach called Constitutional AI, in which models are aligned to follow principles defined in a “constitution.” But Dario Amodei admits that even this method isn’t perfect.

“We strive for the models to be fair and ideologically and politically neutral, (but) you know, we haven’t gotten it perfect,” he said. “I don’t think, you know, anyone has gotten it perfect.”

Still, Dario believes Anthropic’s approach of building on widely accepted values helps prevent its models from being skewed toward a partisan agenda, contrary to the accusations Gemini faces.

“Our goal is not to represent a specific political or ideological standpoint,” he said. “We want our models to be suitable for everybody.”

