
Cerebras breaks ground on Condor Galaxy 3, an AI supercomputer capable of 8 exaFLOPs

Cerebras and G42 said they have broken ground on Condor Galaxy 3, an AI supercomputer capable of eight exaFLOPs of performance.

“That’s plenty of power delivered across 58 million AI-optimized cores,” said Andrew Feldman, CEO of Sunnyvale, California-based Cerebras, in an interview with VentureBeat. The machine goes to G42, a national cloud and generative AI enabler based in Abu Dhabi in the United Arab Emirates. It will be one of the largest AI supercomputers in the world, Feldman said.

Equipped with 64 of Cerebras’ newly announced CS-3 systems, all powered by what Feldman says is the industry’s fastest AI chip, the Wafer-Scale Engine 3 (WSE-3), Condor Galaxy 3 will deliver 8 exaFLOPs of performance across 58 million AI-optimized cores.

“We have built large, fast AI supercomputers. We began building clusters, and the clusters got bigger, and then they got even bigger,” Feldman said. “And then we began training huge models with them.”

When it comes to chip design, Cerebras takes a fairly unique approach. Although the company designs its cores small, they are spread across an entire semiconductor wafer, an area typically used to produce hundreds of chips. Keeping everything on the same substrate makes communication faster and processing more efficient: 900,000 cores fit on a single chip, which is essentially a full wafer.

Located in Dallas, Texas, Condor Galaxy 3 is the third installation in the Condor Galaxy network of AI supercomputers. The strategic partnership between Cerebras and G42 has already delivered 8 exaFLOPs of AI supercomputing performance across Condor Galaxy 1 and Condor Galaxy 2, each among the largest AI supercomputers in the world.

With Condor Galaxy 3, the current total of the Condor Galaxy network increases to 16 exaFLOPs. By the end of 2024, Condor Galaxy will deliver more than 55 exaFLOPs of AI computing power. In total, Cerebras will build nine AI supercomputers for G42.
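The quoted totals are easy to sanity-check. A hedged back-of-envelope in Python, using only the peak figures cited in the article (125 petaflops per CS-3 system, 64 systems per site):

```python
# Sanity check of the article's Condor Galaxy numbers (peak figures only).

PETA, EXA = 1e15, 1e18

cs3_peak = 125 * PETA        # quoted peak AI performance per CS-3 system
systems = 64                 # CS-3 systems in Condor Galaxy 3

cg3_flops = cs3_peak * systems
print(f"Condor Galaxy 3: {cg3_flops / EXA:.0f} exaFLOPs")             # 8

# CG1 and CG2 together already deliver 8 exaFLOPs; CG3 doubles that.
network_total = 8 * EXA + cg3_flops
print(f"Network total with CG3: {network_total / EXA:.0f} exaFLOPs")  # 16
```

The arithmetic matches the article: 64 systems at 125 petaflops each is exactly 8 exaFLOPs, and adding that to the existing 8 exaFLOPs gives the stated 16.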

Cerebras CS-3 up close.

“With Condor Galaxy 3, we continue to realize our shared vision of transforming the world’s AI computing power base by developing the world’s largest and fastest AI supercomputers,” G42 Group CTO Kiril Evtimov said in a statement. “The existing Condor Galaxy network has trained some of the industry’s leading open-source models, with hundreds of thousands of downloads, and we look forward to the next wave of innovation that Condor Galaxy supercomputers can enable with twice the performance.”

At the heart of the 64 Cerebras CS-3 systems that make up Condor Galaxy 3 is the new 5-nanometer WSE-3 chip, which delivers twice the performance at the same power consumption and the same cost. Designed specifically to train the industry’s largest AI models, the 4-trillion-transistor WSE-3 delivers an astounding 125 petaflops of peak AI performance with 900,000 AI-optimized cores per chip.

“We are honored that our newly announced CS-3 systems will play a critical role in our groundbreaking strategic partnership with G42,” said Feldman. “Condor Galaxy 3 through Condor Galaxy 9 will each utilize 64 of the new CS-3s, increasing the computing power we will provide from 36 exaFLOPs to more than 55 exaFLOPs. This represents a significant milestone in AI computing and provides unparalleled computing power and efficiency.”

Condor Galaxy has trained generative AI models including Jais-30B, Med42, Crystal-Coder-7B and BTLM-3B-8K. Jais-13B and Jais-30B are the best bilingual Arabic models in the world, now available on the Azure cloud. BTLM-3B-8K is the leading 3B model on Hugging Face, providing 7B-parameter performance in a lightweight 3B-parameter model for inference, the company said.

Med42, developed with M42 and Core42, is a leading clinical LLM that was trained in a weekend on Condor Galaxy 1 and outperforms MedPaLM in terms of performance and accuracy.

Condor Galaxy 3 will be available in the second quarter of 2024.

Wafer Scale Engine 3

Cerebras Condor Galaxy on the Colovore Data Center

In other news, Cerebras talked about the chip that powers the supercomputer. With the introduction of the Wafer Scale Engine 3, the company doubled its own world record for the fastest AI chip.

The WSE-3 delivers twice the performance of the previous record holder, the Cerebras WSE-2, at the same power consumption and the same price. Designed specifically to train the industry’s largest AI models, the 5nm-based, 4-trillion-transistor WSE-3 powers the Cerebras CS-3 AI supercomputer and delivers 125 petaflops of peak AI performance via 900,000 AI-optimized cores.

Feldman said the computer will be delivered on 150 pallets.

“We are announcing our five-nanometer part for our current generation wafer-scale engine. This is the fastest chip in the world. It’s a 46,000-square-millimeter part manufactured at TSMC in the five-nanometer node. There are 4 trillion transistors, 900,000 AI cores and 125 petaflops of AI computing power,” he said.

With a huge memory system of up to 1.2 petabytes, the CS-3 is designed to train next-generation frontier models 10 times larger than GPT-4 and Gemini. Models with 24 trillion parameters can be stored in a single logical memory space without partitioning or refactoring, greatly simplifying training workflows and accelerating developer productivity. Training a trillion-parameter model on the CS-3 is as easy as training a billion-parameter model on GPUs.
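A rough estimate shows why 1.2 petabytes is plausible for a model that size. The 16 bytes/parameter figure below is a common mixed-precision training estimate, not a Cerebras number:

```python
# Back-of-envelope: how much of the CS-3's 1.2 PB memory would a
# 24-trillion-parameter model need? The 16 bytes/parameter figure is a
# common mixed-precision training estimate (fp16 weights and gradients,
# fp32 master weights, two fp32 Adam moments), not a Cerebras figure.

params = 24e12          # 24 trillion parameters
memory = 1.2e15         # 1.2 petabytes, in bytes

budget = memory / params
print(f"Budget: {budget:.0f} bytes per parameter")          # 50

bytes_per_param = 2 + 2 + 4 + 4 + 4   # weights, grads, master, 2 moments
needed = params * bytes_per_param
print(f"Estimated training state: {needed / 1e15:.2f} PB")  # 0.38
```

Under that assumption, the full training state of a 24T-parameter model fits within the 1.2 PB memory system with room to spare.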

The CS-3 is designed for both enterprise and hyperscale needs. Compact configurations with four systems can fine-tune 70B models in a day, while Llama 70B can be fully trained from scratch with 2,048 systems in a single day, an unprecedented achievement for generative AI.

The latest Cerebras software framework provides native support for PyTorch 2.0 and the newest AI models and techniques such as multimodal models, vision transformers, mixture of experts and diffusion. Cerebras remains the only platform that offers native hardware acceleration for dynamic and unstructured sparsity, accelerating training by up to 8x.
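Unstructured sparsity of the kind mentioned here is straightforward to illustrate: zero out individual weights by magnitude, leaving an irregular zero pattern that sparsity-aware hardware can skip. A minimal sketch in plain Python, illustrating the general technique rather than Cerebras’ implementation:

```python
import random

# Illustrative sketch of unstructured (magnitude-based) weight sparsity,
# the kind of irregular zero pattern the article says Cerebras accelerates
# in hardware. Plain Python for illustration, not Cerebras' implementation.

def magnitude_prune(weights: list[float], sparsity: float) -> list[float]:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

random.seed(0)
w = [random.gauss(0.0, 1.0) for _ in range(10_000)]
w_sparse = magnitude_prune(w, 0.9)

# ~90% of the weights are now zero; a sparsity-aware kernel can skip them,
# which is where hardware acceleration for sparse training comes from.
print(f"{sum(v == 0.0 for v in w_sparse) / len(w_sparse):.0%} zeros")
```

Because the zeros land anywhere in the weight matrix rather than in fixed blocks, most GPUs cannot exploit this pattern efficiently, which is the gap the article says Cerebras targets.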

“When we began this journey eight years ago, everyone said wafer-scale processors were a pipe dream. We couldn’t be prouder to introduce the third generation of our groundbreaking wafer-scale AI chip,” said Feldman. “WSE-3 is the world’s fastest AI chip, purpose-built for the latest cutting-edge AI work, from mixture of experts to 24-trillion-parameter models. We are excited to bring WSE-3 and CS-3 to market to help solve today’s biggest AI challenges.”

Because every component is optimized for AI work, the CS-3 offers more computing power in less space and uses less power than any other system. While the power consumption of GPUs doubles from generation to generation, the CS-3 doubles performance while staying within the same power envelope. The CS-3 also offers superior ease of use, requiring 97% less code than GPUs for LLMs, and the ability to train models from 1B to 24T parameters in purely data-parallel mode. A standard implementation of a GPT-3-class model required only 565 lines of code on Cerebras, an industry record.
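Pure data parallelism, as opposed to model partitioning, is what keeps that code short: every worker holds a full copy of the model, computes gradients on its slice of the batch, and the gradients are averaged. A toy one-parameter sketch of the idea (illustrative only, not the Cerebras stack):

```python
# Minimal sketch of pure data parallelism: each worker holds a full copy
# of the model, computes gradients on its shard of the batch, and the
# gradients are averaged. No layer partitioning or refactoring is needed,
# which is the simplification the article attributes to keeping the whole
# model in one logical memory space. Toy 1-parameter linear model.

def grad(weight, batch):
    """d/dw of mean squared error for y = w*x over one batch shard."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)

def data_parallel_step(weight, batch, workers, lr=0.1):
    shards = [batch[i::workers] for i in range(workers)]  # split the batch
    grads = [grad(weight, s) for s in shards]             # full model per worker
    avg = sum(grads) / workers                            # all-reduce (average)
    return weight - lr * avg

data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]  # target weight is 3
w = 0.0
for _ in range(50):
    w = data_parallel_step(w, data, workers=2)
print(round(w, 3))  # converges toward 3.0
```

With equal-sized shards, the averaged gradient equals the full-batch gradient, so adding workers changes throughput but not the training math; that is why scaling in data-parallel-only mode requires no model code changes.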

“We support models with up to 24 trillion parameters,” Feldman said.

Industry partnerships and customer dynamics

Cerebras already has a significant backlog of CS-3 orders across enterprise, government and international cloud customers.

“We have been an early customer of Cerebras solutions from the beginning and have been able to rapidly accelerate our scientific and medical AI research thanks to the 100- to 300-fold performance improvements provided by Cerebras’ wafer-scale technology,” said Rick Stevens, associate laboratory director for computing, environment and life sciences at Argonne National Laboratory, in a statement. “We look forward to seeing what breakthroughs CS-3 will enable with twice the power in the same envelope.”

Qualcomm deal

Cerebras WSE-3

This week, Cerebras also announced a new technical and go-to-market collaboration with Qualcomm to achieve a 10x improvement in AI inference performance through the benefits of Cerebras’ inference-aware training on the CS-3.

“Our technology collaboration with Cerebras allows us to offer our customers the most powerful AI training solution combined with the best perf/TCO$ inference solution. Additionally, customers can receive fully optimized, ready-to-use models, which also dramatically reduces time to ROI,” Rashid Attar, vice president of cloud computing at Qualcomm, said in a statement.

Using Cerebras’ industry-leading CS-3 AI accelerators for training and the Qualcomm Cloud AI 100 Ultra for inference, production-grade deployments can achieve a 10x price/performance improvement.

“We are announcing a global partnership with Qualcomm to train models optimized for their inference engine. This partnership allows us to leverage a range of techniques that are unique to us, some of which are more widely available, to radically reduce the cost of inference,” Feldman said. “So this is a partnership where we train models to speed up inference using several different strategies.”

Cerebras employs more than 400 engineers. “It is hard to deliver huge amounts of computing on time. And I don’t think there is another player in this category, any other startup, that has delivered the amount of computing power we have in the last six months. And together with Qualcomm, we’re reducing inference costs,” said Feldman.

