
Open source tools from Google to support AI model development

In a typical year, Cloud Next – one of Google's two major annual developer conferences, alongside I/O – almost exclusively showcases managed and otherwise closed-source products and services gated behind locked-down APIs. But this year, Google unveiled a series of open source tools aimed primarily at supporting generative AI projects and infrastructure, whether to foster developer goodwill or advance its ecosystem ambitions (or both).

The first, MaxDiffusion, which Google actually quietly released in February, is a set of reference implementations of various diffusion models – models like the Stable Diffusion image generator – that run on XLA devices. “XLA” stands for Accelerated Linear Algebra, an admittedly awkward acronym for a technique that optimizes and accelerates certain kinds of AI workloads, including fine-tuning and serving.

Google's own Tensor Processing Units (TPUs) are XLA devices, as are current Nvidia GPUs.
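For a concrete sense of what XLA does, here is a minimal sketch using JAX, the framework MaxDiffusion and MaxText are built on. `jax.jit` traces a Python function once and hands it to the XLA compiler, which can fuse the separate array operations into a single optimized kernel for whatever backend is present (CPU, GPU, or TPU). The function below is illustrative, not taken from MaxDiffusion.

```python
import jax
import jax.numpy as jnp

def scaled_dot(x, w, scale):
    # Three logically separate ops (multiply, sum, scale) that the XLA
    # compiler is free to fuse into one kernel.
    return jnp.sum(x * w) * scale

# jit-compile via XLA; the first call triggers tracing and compilation.
compiled = jax.jit(scaled_dot)

x = jnp.arange(4.0)  # [0., 1., 2., 3.]
w = jnp.ones(4)
print(float(compiled(x, w, 2.0)))  # → 12.0
```

The same code runs unchanged on a TPU or GPU backend, which is the point: the model definitions stay in Python while XLA handles the device-specific optimization.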

Beyond MaxDiffusion, Google is launching JetStream, a new engine for running generative AI models – specifically text-generating models (so not Stable Diffusion). Currently limited to TPUs, with GPU compatibility supposedly coming in the future, JetStream offers up to three times better “performance per dollar” on models like Google's own Gemma 7B and Meta's Llama 2, according to Google.

“As customers bring their AI workloads to production, there is an increasing demand for a cost-efficient inference stack that delivers high performance,” wrote Mark Lohmeyer, GM of compute and machine learning infrastructure at Google Cloud, in a blog post shared with TechCrunch. “JetStream helps with this need … and includes optimizations for popular open models such as Llama 2 and Gemma.”

Now, “3x” is quite a claim, and it’s not entirely clear how Google arrived at that number. Using which TPU generation? Compared to which baseline engine? And how is “performance” even defined here?
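To see why those questions matter, here is a small sketch of how a “performance per dollar” figure might be computed. The numbers are hypothetical (not Google's actual benchmarks); the point is that the headline multiple depends entirely on which baseline throughput and which hourly price you divide by.

```python
def perf_per_dollar(tokens_per_sec: float, dollars_per_hour: float) -> float:
    """Tokens generated per dollar of accelerator time."""
    return tokens_per_sec * 3600 / dollars_per_hour

# Hypothetical figures for illustration only.
baseline = perf_per_dollar(tokens_per_sec=500, dollars_per_hour=4.0)    # older stack
jetstream = perf_per_dollar(tokens_per_sec=1200, dollars_per_hour=3.2)  # new stack

print(round(jetstream / baseline, 2))  # → 3.0
```

Swap in a newer TPU generation or a cheaper baseline and the same arithmetic yields a very different multiple – which is exactly why the unstated assumptions behind “3x” matter.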

I’ve asked Google all these questions and will update this post if I hear back.

Second to last on Google's list of open source contributions are new additions to MaxText, Google's collection of text-generating AI models targeting TPUs and Nvidia GPUs in the cloud. MaxText now includes Gemma 7B, OpenAI's GPT-3 (the predecessor to GPT-4), Llama 2, and models from AI startup Mistral – all of which can be customized and fine-tuned to developers' needs, according to Google.
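For a rough sense of what that customization looks like, MaxText-style projects are typically driven by plain YAML run configs rather than code changes. A hypothetical fragment in that spirit – the key names below are illustrative, not MaxText's actual schema:

```yaml
# Illustrative run config for a MaxText-style trainer (hypothetical keys).
model_name: gemma-7b        # which reference model implementation to load
per_device_batch_size: 4    # batch size per TPU/GPU chip
learning_rate: 3.0e-4       # fine-tuning learning rate
steps: 1000                 # number of training steps
```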

“We have heavily optimized the performance (of the models) on TPUs and have also worked closely with Nvidia to optimize performance on large GPU clusters,” Lohmeyer said. “These improvements maximize GPU and TPU utilization and lead to greater energy efficiency and cost optimization.”

Finally, Google worked with AI startup Hugging Face to create Optimum TPU, which provides tooling to bring certain AI workloads to TPUs. According to Google, the goal is to lower the barrier to entry for getting generative AI models – especially text-generating models – onto TPU hardware.

But for the moment, Optimum TPU is still a bit bare-bones. The only model it works with is Gemma 7B. And Optimum TPU doesn’t yet support training generative models on TPUs – only running them.

Google is promising improvements across the board.

