Every week, sometimes every day, a new state-of-the-art AI model is born into the world. Heading into 2025, the pace at which new models are coming to market is dizzying, if not exhausting. The curve of the roller coaster keeps climbing exponentially, and fatigue and wonder have become constant companions. Each release highlights why a particular model is better than all the others, with endless collections of benchmarks and bar charts filling our feeds as we struggle to keep up.
Eighteen months ago, the vast majority of developers and companies were using a single AI model. Today the opposite is true. It is rare for a company of significant size to limit itself to the capabilities of a single model. Companies fear vendor lock-in, especially with a technology that has quickly become a central part of both long-term corporate strategy and short-term revenue. It is becoming increasingly risky for teams to rely exclusively on a single large language model (LLM).
Yet despite this fragmentation, many model providers still maintain that AI will be a winner-take-all market. They claim that the expertise and computing power required to train world-class models are scarce, defensible, and self-reinforcing. From their perspective, the hype bubble surrounding AI model development will eventually burst, leaving behind a single, massive artificial general intelligence (AGI) model that will be used for anything and everything. Being the sole owner of such a model would mean being the most powerful company in the world. The size of this prize has sparked an arms race for more and more GPUs, with a new zero being added to the number of training parameters every few months.
We believe this view is wrong. There will be no single model that rules the universe, neither in the next year nor in the next decade. Instead, the future of AI will be multi-model.
Language models are fuzzy commodities
One common definition describes a commodity as “a standardized good that’s bought and sold on a big scale and whose units are interchangeable.” Language models are commodities in two senses:
- The models themselves are becoming increasingly interchangeable across a wider range of tasks.
- The research expertise needed to create these models is becoming increasingly distributed and accessible, with frontier labs barely outpacing one another and independent researchers in the open-source community hot on their heels.
But as language models become commoditized, they do so unevenly. There is a large functional core for which every model, from GPT-4 to Mistral Small, is perfectly suited. At the same time, the further we move toward the margins and edge cases, the more differentiation we see, with some model providers explicitly specializing in code generation, reasoning, retrieval-augmented generation (RAG), or mathematics. This leads to endless hand-wringing, Reddit searching, evaluating, and fine-tuning to find the right model for each job.
Although language models are commodities, they are more accurately described as fuzzy commodities. For many use cases, AI models will be almost interchangeable, with metrics such as price and latency determining which model to use. But at the edge of performance, the opposite will happen: models will continue to specialize and become more and more differentiated. For example, Deepseek-V2.5 is stronger than GPT-4o when coding in C#, despite being a fraction of the size and 50 times cheaper.
These two dynamics, commoditization and specialization, challenge the idea that a single model will be best suited to every possible use case. Rather, they point to an increasingly fragmented AI landscape.
Multi-model orchestration and routing
There is an apt analogy for the market dynamics of language models: the human brain. The structure of our brains has remained unchanged for 100,000 years, and brains are far more similar than they are different. For most of our time on Earth, most people learned the same things and had similar skills.
But then something changed. We developed the ability to communicate in language, first in speech, then in writing. Communication protocols enable networks, and as people began to network with one another, we also began to specialize more and more. We were freed from the burden of having to be generalists in all areas, of being self-sufficient islands. Paradoxically, the collective wealth created by specialization has also meant that the average person today is a far stronger generalist than any of our ancestors.
On a sufficiently wide input space, the universe always tends toward specialization. This holds at every level, from molecular chemistry to biology to human society. Given sufficient diversity, distributed systems will always be more computationally efficient than monoliths. We believe the same will be true of AI. The more we can leverage the strengths of multiple models rather than relying on just one, the more those models can specialize, expanding the frontier of capabilities.
An increasingly important pattern for leveraging the strengths of different models is routing: dynamically sending each query to the most appropriate model, and using cheaper, faster models whenever doing so does not compromise quality. Routing lets us capture the benefits of specialization (higher accuracy at lower cost and lower latency) without sacrificing the robustness of generalization.
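The routing pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production router: the model names, prices, and per-task quality scores are all hypothetical, and a real system would estimate quality from evaluations or a learned classifier rather than a hand-written table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                   # hypothetical model identifier
    cost_per_1k_tokens: float   # illustrative price, not a real quote
    quality: dict[str, float]   # assumed per-task quality score in [0, 1]


# Hypothetical catalog; every name and number here is made up for illustration.
CATALOG = [
    Model("small-general", 0.10, {"chat": 0.80, "code": 0.55, "math": 0.50}),
    Model("code-specialist", 0.40, {"chat": 0.60, "code": 0.90, "math": 0.65}),
    Model("large-general", 2.00, {"chat": 0.92, "code": 0.85, "math": 0.88}),
]


def route(task: str, min_quality: float, catalog=CATALOG) -> Model:
    """Return the cheapest model whose assumed quality on `task` meets the bar.

    If no model qualifies, fall back to the highest-quality model for the task.
    """
    eligible = [m for m in catalog if m.quality.get(task, 0.0) >= min_quality]
    if eligible:
        return min(eligible, key=lambda m: m.cost_per_1k_tokens)
    return max(catalog, key=lambda m: m.quality.get(task, 0.0))


if __name__ == "__main__":
    for task, bar in [("chat", 0.75), ("code", 0.80), ("math", 0.80)]:
        print(task, "->", route(task, bar).name)
```

The design choice here is to treat quality as a constraint and cost as the objective, which mirrors the idea of using a cheaper model whenever quality is not compromised. Latency could be added as a second constraint in the same way.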
A simple demonstration of the power of routing is the fact that many of the world's top models are routers themselves: they are built on Mixture of Experts architectures that route each next-token generation to a few dozen expert sub-models. If it is true that LLMs are exponentially proliferating fuzzy commodities, then routing must become an essential part of every AI stack.
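To make the Mixture of Experts idea concrete, here is a toy sketch of top-k gating, one common way such routing is done. It is only an illustration under simplifying assumptions: the experts are plain Python functions on a single number, and the gate scores are passed in by hand, whereas in a real model the gate is a learned layer operating on token representations. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_scores, k=2):
    """Select the k experts with the highest gate scores.

    Returns (expert_index, weight) pairs, with weights renormalized
    by a softmax over the selected experts only.
    """
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_output(x, experts, gate_scores, k=2):
    """Weighted sum of the selected experts' outputs; other experts never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_scores, k))


if __name__ == "__main__":
    # Four toy "experts"; only the two with the highest gate scores are evaluated.
    experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 10 * x]
    print(moe_output(1.0, experts, gate_scores=[0.0, 5.0, 5.0, -1.0], k=2))
```

Because only the selected experts are evaluated, the cost per input scales with k rather than with the total number of experts, which is the efficiency argument behind this architecture.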
There is a view that LLMs will plateau as they reach human-level intelligence. On this view, once capabilities are saturated, we will coalesce around a single general model in the same way we did with AWS or the iPhone. Neither of those platforms (nor their competitors) has increased its capabilities tenfold in the past couple of years, so we might as well get comfortable in their ecosystems. However, we believe that AI will not stop at human-level intelligence; it will go far beyond any limits we can imagine. As it does, it will become increasingly fragmented and specialized, like any other natural system.
We cannot emphasize enough that the fragmentation of AI models is a very good thing. Fragmented markets are efficient markets: they give power to buyers, maximize innovation, and minimize costs. And to the extent that we can leverage networks of smaller, more specialized models rather than passing everything through the internals of a single giant model, we are moving toward a much safer, more interpretable, and more controllable future for AI.
The greatest inventions have no owner. Ben Franklin's heirs don't own electricity. Turing's estate doesn't own all the computers. AI is undoubtedly one of humanity's greatest inventions; we believe its future will be, and should be, multi-model.