
Meta's Llama AI models now also support images

Benjamin Franklin once wrote that nothing is certain except death and taxes. Let me amend that sentence for the current AI gold rush: nothing is certain except death, taxes, and new AI models, with the last of those three arriving at an ever-increasing rate.

Earlier this week, Google released updated Gemini models, and earlier this month OpenAI unveiled its o1 model. But on Wednesday it was Meta's turn to show off its latest developments at the company's annual Meta Connect 2024 developer conference in Menlo Park.

Llama's multimodality

Meta's multilingual Llama model family has reached version 3.2, and the change from 3.1 means that several Llama models are now multimodal. Llama 3.2 11B – a compact model – and 90B, a larger, more powerful model, can interpret charts and graphs, caption images, and locate objects in images based on a simple description.

For example, given a map of a park, Llama 3.2 11B and 90B could answer questions such as "Where does the terrain become steeper?" and "How far away is this path?" Or, if a chart shows a company's sales over the course of a year, the models could quickly point out the company's best-performing months.

For developers who want to use the models exclusively for text applications, Meta says Llama 3.2 11B and 90B were designed as "drop-in" replacements for 3.1. 11B and 90B can be deployed with or without a new safety tool, Llama Guard Vision, which is designed to detect potentially harmful (i.e., biased or toxic) text and images fed to or generated by the models.

In most parts of the world, the multimodal Llama models can be downloaded and used on a range of cloud platforms, including Hugging Face, Microsoft Azure, Google Cloud and AWS. Meta also hosts them on the official Llama website, Llama.com, and uses them to power its AI assistant, Meta AI, on WhatsApp, Instagram and Facebook.
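To give a concrete sense of what that looks like in practice, here is a minimal sketch of asking the 11B vision model a chart question through Hugging Face Transformers. The model ID, class names and chat format are assumptions based on how Meta's recent releases have typically been published; check the official model card before relying on them.

```python
# Minimal sketch: asking a Llama 3.2 vision model a question about a chart
# via Hugging Face Transformers. The model ID and classes below are assumptions;
# consult the official model card for exact usage.
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # assumed repository name
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("sales_chart.png")  # e.g., a company's monthly sales chart
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Which month had the highest sales?"},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```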

Photo credit: Meta

But Llama 3.2 11B and 90B can't be accessed in Europe. As a result, several Meta AI features available elsewhere, such as image analysis, are disabled for European users. Meta once again blamed the "unpredictable" nature of the EU's regulatory environment.

Meta has expressed concerns about – and declined to sign a voluntary safety pledge related to – the AI Act, the EU law that establishes a legal and regulatory framework for AI. Among other requirements, the AI Act requires companies developing AI in the EU to commit to determining whether their models are likely to be used in "high-risk" situations such as policing. Meta fears that the "open" nature of its models, which gives it little insight into how they are used, could make it difficult to comply with the AI Act's rules.

Meta also takes issue with provisions of the GDPR, the EU's comprehensive data protection law, as they pertain to AI training. Meta trains its models on the public data of Instagram and Facebook users who haven't opted out – data that's subject to GDPR protections in Europe. Earlier this year, EU regulators asked Meta to stop training on European user data while they assessed the company's GDPR compliance.

Meta relented, while at the same time backing an open letter calling for a "modern interpretation" of the GDPR that "doesn't reject progress."

Earlier this month, Meta announced it would resume training on UK user data after "incorporating regulatory feedback" into a revised opt-out process. However, the company has yet to share an update on its training in the rest of the EU.

More compact models

Other new Llama models – models not trained on European user data – are launching in Europe (and globally) on Wednesday.

Llama 3.2 1B and 3B, two lightweight text-only models designed to run on smartphones and other edge devices, can be used for tasks such as summarizing and rewriting paragraphs (e.g., in an email). Optimized for Arm hardware from Qualcomm and MediaTek, the 1B and 3B models can also access tools like calendar apps with a little configuration, Meta said, allowing them to take action autonomously.
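As a rough illustration of the kind of on-device task Meta describes, here is a minimal sketch of summarizing an email with the 1B model via the Transformers pipeline API. The model ID is an assumption, and real edge deployments would more likely use a quantized build through Meta's or the chipmakers' own runtimes rather than plain Transformers.

```python
# Minimal sketch: summarizing an email with a lightweight Llama 3.2 text model.
# The model ID is an assumption; on phones this would run through a quantized,
# vendor-optimized runtime rather than the standard Transformers pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",  # assumed repository name
    device_map="auto",
)

email = (
    "Hi team, the Q3 review has moved to Friday at 10am. "
    "Please update your slides and send them to Dana by Thursday evening."
)
messages = [
    {"role": "user", "content": f"Summarize this email in one sentence:\n\n{email}"}
]

result = generator(messages, max_new_tokens=60)
# With chat-style input, the pipeline returns the full conversation;
# the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```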

There is no successor, multimodal or otherwise, to the flagship Llama 3.1 405B model released in August. Given 405B's massive size – training it took months – this is likely due to limited computing resources. We've asked Meta whether other factors are at play and will update this story if we hear anything.

Meta's new Llama Stack, a suite of Llama-focused development tools, can be used to fine-tune all of the Llama 3.2 models: 1B, 3B, 11B and 90B. Regardless of how they're customized, the models can process up to around 100,000 words at a time, Meta says.

Meta Llama 3.2
Photo credit: Meta

A play for mindshare

Meta CEO Mark Zuckerberg often talks about ensuring everyone has access to the "benefits and opportunities" of AI. Implicit in this rhetoric, however, is the desire for those tools and models to come from Meta.

Spending on models that can then be commercialized forces competitors (e.g., OpenAI, Anthropic) to lower their prices, spreads Meta's version of AI widely, and lets Meta incorporate improvements from the open source community. Meta claims its Llama models have been downloaded over 350 million times and are used by major companies such as Zoom, AT&T and Goldman Sachs.

For many of these developers and companies, it's irrelevant that the Llama models aren't "open" in the strict sense. Meta's license restricts how certain developers can use them: platforms with over 700 million monthly users must apply to Meta for a special license, which the company grants at its sole discretion.

Admittedly, there aren't many platforms of that size without their own in-house models. But Meta isn't particularly transparent about the process. When I asked the company this month whether it had yet approved such a discretionary Llama license for any platform, a spokesperson told me that Meta "has nothing to share on this matter."

Make no mistake: Meta is playing the long game. It's spending millions lobbying regulators to embrace its preferred flavor of "open" AI, and pouring money into servers, data centers and network infrastructure to train future models.

None of the Llama 3.2 models solve the overarching problems with today's AI, such as its tendency to make things up and its reliance on problematic training data (e.g., copyrighted e-books that may have been used without permission, the subject of a class action lawsuit against Meta). But as I've written before, they advance one of Meta's major goals: to become synonymous with AI, and with generative AI in particular.
