Competitors of Nvidia, which dominates the AI chip market, have long hoped that a turning point would help them regain lost ground.
That point may have been reached. So far, however, there is little sign of Nvidia giving up its lead – though it remains an open question whether the AI market will evolve in a way that ultimately undermines its dominance.
The key issue is that the focus in AI is shifting from training the big "foundation models" that underlie modern AI systems to the broad use of those models in applications serving large numbers of consumers and businesses.
Thanks to their ability to perform many calculations in parallel, Nvidia's powerful graphics processing units (GPUs) have dominated data-intensive AI training. In contrast, running queries against these trained models – known as inference – is a less demanding activity that could create an opening for makers of less powerful, and cheaper, chips.
Those expecting a quick shift will be disappointed. Nvidia's lead in this newer market already looks formidable. When it announced its latest results on Thursday, it said that more than 40 percent of its data center revenue over the past 12 months was already tied to inference, representing more than $33 billion. That is more than two and a half times the total revenue of Intel's data center division over the same period.
But how the inference market will develop from here is uncertain. Two questions will determine the outcome: whether the AI business will continue to be dominated by a race to develop ever-larger models, and where most inference will take place.
Nvidia's success is closely tied to the race to scale. Chief executive Jensen Huang said this week that training each new generation of large AI models requires "10, 20, 40 times more computing power", guaranteeing huge demand for Nvidia's upcoming Blackwell chips. These new processors will also provide the most efficient way to run inference on such "multi-trillion parameter models," he added.
However, it is not clear whether ever-larger models will continue to dominate the market or whether they will eventually hit a point of diminishing returns. At the same time, smaller models that promise many of the same benefits are coming into vogue, as are less powerful models designed for narrower tasks. Meta, for instance, recently claimed that its new Llama 3.1 can match the performance of advanced models such as OpenAI's GPT-4, despite being significantly smaller.
Improved training techniques, often relying on larger amounts of high-quality data, have helped. Once trained, the largest models can also be "distilled" into smaller versions. Such developments promise to push more of the work of AI inference into smaller, "edge" data centers and onto smartphones and PCs. "AI workloads are moving closer to where the data is or where the users are," says Arun Chandrasekaran, an analyst at Gartner.
The number of competitors eyeing this young market is growing quickly. Mobile chipmaker Qualcomm, for instance, is the first to supply chips powering a new class of AI PCs that conform to a design set by Microsoft – a development that directly challenges Intel, the long-time leader in PC chips.
The data center market, meanwhile, has attracted a wide array of potential competitors, from startups like Cerebras and Groq to tech giants like Meta and Amazon, which have developed their own inference chips.
It is inevitable that Nvidia will lose some market share as AI inference moves to devices where it does not already have a presence and into the data centers of cloud companies that favor their own proprietary chip designs. To defend its turf, however, the company is leaning heavily on the software strategy that has long acted as a protective moat around its hardware, with tools that make it easier for developers to use its chips.
This time, the company is working on a broader range of enterprise software designed to help businesses build applications that take advantage of AI – which would also guarantee demand for its chips. Nvidia said this week that it expects revenue from this software to reach a $2 billion annual run rate by the end of this year. That is a small figure for a company expecting total revenue to exceed $100 billion, but it points to the growing adoption of technologies that should increase the "stickiness" of its products. The AI chip market may be entering a new phase, but Nvidia's grip shows no sign of weakening.