All-encompassing, highly generalizable generative AI models were once the be-all and end-all, and arguably still are. But as more cloud providers, large and small, join the generative AI fray, we're seeing a new generation of models focused on the most financially powerful potential customer: the enterprise.
Case in point: Snowflake, the cloud computing company, today introduced Arctic LLM, a generative AI model it describes as "enterprise grade." Available under an Apache 2.0 license, Arctic LLM is optimized for "enterprise workloads," including database code generation, according to Snowflake, and is free for research and commercial use.
"I think this is going to be the foundation that will enable us, Snowflake, and our customers to build enterprise-grade products and actually begin to realize the promise and value of AI," CEO Sridhar Ramaswamy said in a press briefing. "You should think of this as our first, but major, step into the world of generative AI, with many more to come."
An enterprise model
My colleague Devin Coldewey recently wrote about how there's no end in sight to the onslaught of generative AI models. I recommend reading his piece, but the gist is: models are an easy way for vendors to drum up excitement for their R&D, and they also serve as a funnel into their product ecosystems (e.g. model hosting, fine-tuning and so on).
Arctic LLM is no different. The model, which took around three months, 1,000 GPUs and $2 million to train, is the flagship of a family of generative AI models from Snowflake called Arctic. It arrives on the heels of Databricks' DBRX, a generative AI model also marketed as optimized for the enterprise.
Snowflake draws a direct comparison between Arctic LLM and DBRX in its press materials, saying Arctic LLM outperforms DBRX on the two tasks of coding (Snowflake didn't specify which programming languages) and SQL generation. The company said Arctic LLM is also better at those tasks than Meta's Llama 2 70B (though not the newer Llama 3 70B) and Mistral's Mixtral-8x7B.
Snowflake also claims that Arctic LLM achieves "leading performance" on a popular general language understanding benchmark, MMLU. I'd note, however, that while MMLU is meant to evaluate generative models' ability to work through logic problems, it includes tests that can be passed through rote memorization. So take that claim with a grain of salt.
"Arctic LLM addresses specific needs within the enterprise sector," Baris Gultekin, head of AI at Snowflake, told TechCrunch in an interview, "moving away from generic AI applications like composing poetry to focus on enterprise-oriented challenges, such as developing SQL co-pilots and high-quality chatbots."
Arctic LLM, like DBRX and Google's current top-performing generative model, Gemini 1.5 Pro, uses a mixture-of-experts (MoE) architecture. MoE architectures essentially break data processing tasks down into subtasks and then delegate them to smaller, specialized "expert" models. So while Arctic LLM contains 480 billion parameters, it only activates 17 billion at a time, enough to drive the 128 separate expert models. (Parameters essentially define an AI model's skill on a problem, such as analyzing and generating text.)
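To make the idea concrete, here's a minimal sketch of MoE-style routing: a learned router scores the experts for each token, and only the top-scoring few actually run. The sizes and the single-layer setup below are illustrative, not Arctic LLM's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # Arctic LLM reportedly uses 128
TOP_K = 2         # experts activated per token
HIDDEN = 16       # toy hidden dimension

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((HIDDEN, HIDDEN)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((HIDDEN, NUM_EXPERTS))

def moe_layer(x):
    """Route a token vector to its top-k experts and mix their outputs."""
    scores = x @ router                    # one score per expert
    top = np.argsort(scores)[-TOP_K:]      # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()               # softmax over the chosen experts
    # Only the selected experts' parameters are used for this token;
    # the other experts stay idle, which is where the compute savings come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(HIDDEN)
out = moe_layer(token)
print(out.shape)  # (16,)
```

The total parameter count grows with the number of experts, but the per-token compute grows only with `TOP_K`, which is how a 480-billion-parameter model can run with 17 billion active parameters at a time.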
Snowflake claims that this efficient design allowed it to train Arctic LLM on open public web datasets (including RefinedWeb, C4, RedPajama and StarCoder) at "roughly one-eighth the cost of similar models."
Running everywhere
Snowflake is providing resources such as coding templates and a list of training sources alongside Arctic LLM to guide users through the process of getting the model up and running and fine-tuning it for particular use cases. But, recognizing that those are likely to be costly and complex undertakings for most developers (fine-tuning or running Arctic LLM requires around eight GPUs), Snowflake is also pledging to make Arctic LLM available on a range of hosts, including Hugging Face, Microsoft Azure, Together AI's model-hosting service and enterprise generative AI platform Lamini.
But here's the catch: Arctic LLM will be available first on Cortex, Snowflake's platform for building AI- and machine learning-powered apps and services. Unsurprisingly, the company pitches Cortex as the preferred way to run Arctic LLM with "security," "governance" and scalability.
"Our dream here is, within a year, to have an API that our customers can use so that business users can talk directly to data," Ramaswamy said. "It would have been easy for us to say, 'Oh, we'll just wait for some open source model and use it.' Instead, we're making a foundational investment because we believe it's going to unlock more value for our customers."
So I find myself wondering: Who's Arctic LLM really for, besides Snowflake customers?
In a landscape full of "open" generative models that can be fine-tuned for practically any purpose, Arctic LLM doesn't stand out in any obvious way. Its architecture might bring efficiency gains over some of the other options on the market. But I'm not convinced they'll be dramatic enough to sway enterprises away from the countless other well-known and well-supported, business-friendly generative models (e.g. GPT-4).
There's also a point in Arctic LLM's disfavor to consider: its relatively small context window.
In generative AI, the context window refers to the input data (e.g. text) that a model considers before generating output (e.g. additional text). Models with small context windows are prone to forgetting the content of even very recent conversations, while models with larger contexts typically avoid this pitfall.
Arctic LLM's context ranges from around 8,000 to around 24,000 words, depending on the fine-tuning method, far below that of models like Anthropic's Claude 3 Opus and Google's Gemini 1.5 Pro.
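A rough way to picture the limitation: any chat application has to fit the running conversation into the model's fixed window, dropping the oldest turns once the budget is exceeded. This hypothetical sketch uses a word-based budget standing in for tokens, with a tiny window so the effect is visible.

```python
CONTEXT_WINDOW = 25  # words; real models count tokens, and Arctic LLM's
                     # window is roughly 8,000 to 24,000 words

def build_prompt(turns, window=CONTEXT_WINDOW):
    """Keep the most recent turns that fit in the window; older ones are lost."""
    kept, used = [], 0
    for turn in reversed(turns):
        words = len(turn.split())
        if used + words > window:
            break  # everything earlier than this turn is forgotten
        kept.append(turn)
        used += words
    return list(reversed(kept))

history = [
    "User: my order number is 81726",
    "Bot: thanks, noted",
    "User: actually I would like to change the shipping address to something else entirely",
    "User: can you confirm my order number?",
]
prompt = build_prompt(history)
print(prompt)  # the earliest turn, with the order number, no longer fits
```

With a 25-word budget, the first turn is truncated away, so the model literally cannot see the order number it's being asked to confirm; a larger window simply pushes that failure point further back.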
Snowflake doesn't mention it in the marketing, but Arctic LLM almost certainly suffers from the same limitations and shortcomings as other generative AI models, namely hallucinations (i.e. confidently answering requests incorrectly). That's because Arctic LLM, like every other generative AI model in existence, is a statistical probability machine, one that, again, has a small window of context. It guesses, based on vast numbers of examples, which data makes the most "sense" to place where (e.g. the word "go" before "the market" in the sentence "I go to the market"). Inevitably, it will guess wrong, and that's a "hallucination."
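The "statistical probability machine" point can be made concrete with a toy next-word model: count which word follows which in some sample text, then always pick the most frequent continuation. With thin data the most probable guess can still be flatly wrong, which is the basic mechanism behind a hallucination. (Illustrative code only; this is not how any production LLM is implemented.)

```python
from collections import Counter, defaultdict

corpus = "i go to the market . i go to the gym . we went to the park ."
words = corpus.split()

# Count which word follows each word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(words, words[1:]):
    following[prev][nxt] += 1

def guess_next(word):
    """Return the statistically most likely next word."""
    return following[word].most_common(1)[0][0]

print(guess_next("to"))   # 'the' -- seen three times after 'to'
print(guess_next("the"))  # ties between 'market', 'gym' and 'park'; the
                          # model just commits to one, right or wrong
```

A real model's guesses come from billions of parameters rather than a frequency table, but the failure mode is the same: the output is whatever scored highest, not whatever is true.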
As Devin writes in his piece, until the next major technical breakthrough, incremental improvements are all we have to look forward to in the generative AI domain. That won't stop vendors like Snowflake from championing them as great achievements, though, and marketing them for all they're worth.