Meta releases Llama 3.1 models and sticks to the open strategy

July 24, 2024

Meta has released its updated Llama 3.1 models in 8B, 70B and 405B versions, committing to Mark Zuckerberg's open source vision for the future of AI.

The latest additions to Meta's Llama model family have an extended context length of 128K tokens and support eight languages.

Meta says its highly anticipated 405B model “delivers unmatched flexibility, control, and cutting-edge capabilities that rival the very best closed-source models.” It also claims that Llama 3.1 405B is “the world's largest and strongest freely available base model.”

Given the enormous compute costs involved in training ever larger models, there was much speculation that Meta's flagship 405B model might be its first paid model.

Llama 3.1 405B was trained on over 15 trillion tokens using 16,000 NVIDIA H100 GPUs, likely costing hundreds of millions of dollars.
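To give a sense of scale, the training figures above can be turned into a back-of-the-envelope compute estimate. The sketch below uses the common rule of thumb of roughly 6 floating-point operations per parameter per training token. That rule, the per-GPU peak throughput, and the utilization figure are all assumptions made for illustration and are not numbers from Meta.

```python
# Back-of-the-envelope training-compute estimate.
# Assumption: the "6 * N * D" rule of thumb for dense transformer training.
# Assumption: H100 peak throughput and utilization below are illustrative only.

def estimate_training_flops(num_params: float, num_tokens: float) -> float:
    """Approximate total training FLOPs as 6 * parameters * tokens."""
    return 6.0 * num_params * num_tokens


def estimate_training_days(total_flops: float, num_gpus: int,
                           peak_flops_per_gpu: float, utilization: float) -> float:
    """Convert total FLOPs into wall-clock days for a given cluster."""
    effective_flops_per_second = num_gpus * peak_flops_per_gpu * utilization
    return total_flops / effective_flops_per_second / 86_400  # seconds per day


PARAMS = 405e9            # 405B parameters (from the article)
TOKENS = 15e12            # "over 15 trillion tokens" (from the article), so a lower bound
NUM_GPUS = 16_000         # from the article
H100_PEAK_FLOPS = 989e12  # assumed dense 16-bit peak per GPU
UTILIZATION = 0.40        # assumed fraction of peak actually achieved

total_flops = estimate_training_flops(PARAMS, TOKENS)
days = estimate_training_days(total_flops, NUM_GPUS, H100_PEAK_FLOPS, UTILIZATION)

print(f"Estimated training compute: {total_flops:.3e} FLOPs")
print(f"Estimated wall-clock time:  {days:.0f} days")
```

Under these assumptions the estimate comes out on the order of 10^25 FLOPs and a couple of months of cluster time. Changing the assumed utilization shifts the duration proportionally, so the result is only a rough guide.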

In a blog post, Meta CEO Mark Zuckerberg reiterated the company's view that open source AI is the way forward and that the release of Llama 3.1 is the next step toward “open source AI becoming the industry standard.”

The Llama 3.1 models can be downloaded free of charge and modified or fine-tuned using a range of services from Amazon, Databricks and NVIDIA.

The models are also available from cloud service providers such as AWS, Azure, Google and Oracle.

Starting today, open source leads the way. We're introducing Llama 3.1: our strongest models yet.

Today we’re releasing a collection of new Llama 3.1 models, including our long-awaited 405B. These models offer improved reasoning capabilities, a bigger 128K token context… pic.twitter.com/1iKpBJuReD

— AI at Meta (@AIatMeta) July 23, 2024

Performance

Meta says it has tested its models on over 150 benchmark datasets and published results for the more common benchmarks to show how its latest models compare with other leading models.

On these benchmarks, there is little separating Llama 3.1 405B from GPT-4o and Claude 3.5 Sonnet. Here are the numbers for the 405B model and the smaller 8B and 70B versions.

Benchmark comparison of Llama 3.1 405B with other leading models. Source: Meta

Meta also conducted “extensive human evaluations comparing Llama 3.1 to competing models in real-world scenarios.”

These figures reflect whether users preferred the response from one model or the other.

The human evaluation of Llama 3.1 405B reflects a similar parity to that shown by the benchmark numbers.

Llama 3.1 405B human evaluation results compared with GPT-4, GPT-4o and Claude 3.5 Sonnet. Source: Meta
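A pairwise human evaluation of this kind boils down to counting how often raters prefer one model's response, prefer the other's, or call it a tie. The sketch below shows one way such a tally could be computed. The function name, the labels and the sample judgments are hypothetical and are not Meta's data or methodology.

```python
from collections import Counter

# Hypothetical labels for one rater judgment of model A versus model B.
LABELS = ("win", "tie", "loss")


def preference_rates(judgments):
    """Return the share of wins, ties and losses for model A."""
    counts = Counter(judgments)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no judgments supplied")
    return {label: counts[label] / total for label in LABELS}


# Made-up judgments for illustration only.
sample_judgments = ["win", "tie", "loss", "win", "tie", "tie", "loss", "win"]
rates = preference_rates(sample_judgments)

for label in LABELS:
    print(f"{label}: {rates[label]:.1%}")
```

Win and loss shares that are close to each other, with a sizeable share of ties, are what "parity" between two models looks like in this kind of evaluation.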

Meta says its model is truly open, as the Llama 3.1 model weights are available for download, although the training data has not been released. The company has also changed its license to allow Llama models to be used to improve other AI models.

The freedom to fine-tune, modify, and use Llama models without restrictions will set alarm bells ringing among critics of open-source AI.

Zuckerberg argues that an open-source approach is the best way to avoid unintended harm. If an AI model is open to scrutiny, it is less likely to develop dangerous emergent behavior that we might otherwise miss in closed models.

Regarding the potential for intentional harm, Zuckerberg says: “As long as everyone has access to similar generations of models – which open source encourages – governments and institutions with more computing resources will be able to check malicious actors with less computing power.”

Zuckerberg addresses the risk that rival states like China could gain access to Meta's models and says that any efforts to keep them out of China's hands will fail.

“Our adversaries are masters of espionage. Stealing models that fit on a USB stick is comparatively easy, and most technology companies are far from making this more difficult,” he explained.

The excitement about an open-source AI model like Llama 3.1 405B taking on the big closed models is justified.

But with rumors of GPT-5 and Claude 3.5 Opus already in the pipeline, these benchmark results won’t age particularly well.
