Elon Musk's xAI made waves last week with the release of its chatbot Grok-2, a large language model (LLM) available via an $8 monthly subscription on the social network X.
Now, both versions of Grok-2—Grok-2 and Grok-2 mini, the latter of which is said to be less powerful but faster—have increased the speed with which they analyze information and output answers, after two developers at xAI completely rewrote the inference code stack over the past three days.
As xAI developer Igor Babushkin posted this afternoon on X under his handle @ibab:
“Grok 2 mini is now twice as fast as it was yesterday. Over the last three days, @lm_zheng and @MalekiSaeed have been rewriting our inference stack from scratch using SGLang. This also allowed us to deploy the big Grok 2 model, which requires multi-host inference, at a reasonable speed. Both models not only became faster, but also slightly more accurate. Stay tuned for further speed improvements!”
According to Babushkin’s post, the two developers responsible are Lianmin Zheng and Saeed Maleki.
To rewrite the inference stack for Grok-2, they relied on SGLang, an open-source system (released under the Apache 2.0 license) for efficiently executing complex language model programs, which achieves up to 6.4 times higher throughput than existing systems.
SGLang was developed by researchers from Stanford University, the University of California, Berkeley, Texas A&M University, and Shanghai Jiao Tong University, and it integrates a front-end language with a back-end runtime to simplify the programming of language model applications.
The system is flexible and supports many models, including Llama, Mistral, and LLaVA, and it is compatible with both open-weight and API-based models such as OpenAI's GPT-4. SGLang's ability to optimize execution through automatic cache reuse and parallelism within a single program makes it a strong tool for developers working with large-scale language models.
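The automatic cache reuse mentioned above can be pictured with a toy sketch. This is not SGLang's actual implementation (a real system caches transformer key/value tensors; SGLang organizes them in a radix tree): here a dictionary keyed by token prefixes stands in for that cache, so that prompts sharing a common prefix skip recomputation.

```python
# Toy illustration of automatic prefix-cache reuse, the idea behind
# SGLang's cache sharing -- NOT SGLang's actual code. A real runtime
# caches transformer KV tensors; here we cache a dummy "state" per
# token prefix and count how much per-token work gets skipped.

class PrefixCache:
    def __init__(self):
        self.cache = {}           # token-prefix tuple -> computed "state"
        self.computed_tokens = 0  # real (non-cached) token computations

    def _compute(self, prefix):
        # Stand-in for one forward step over the newest token.
        self.computed_tokens += 1
        return hash(prefix)

    def run(self, tokens):
        """Process a prompt left to right, reusing any cached prefix work."""
        state = None
        for i in range(1, len(tokens) + 1):
            prefix = tuple(tokens[:i])
            if prefix not in self.cache:
                self.cache[prefix] = self._compute(prefix)
            state = self.cache[prefix]
        return state

cache = PrefixCache()
system = ["you", "are", "a", "helpful", "assistant"]
cache.run(system + ["what", "is", "sglang"])   # 8 fresh token computations
cache.run(system + ["summarize", "this"])      # shared 5-token prefix reused
print(cache.computed_tokens)                   # 8 + 2 = 10, not 8 + 7 = 15
```

Without reuse, the second prompt would recompute its entire seven-token input; with the shared system prefix cached, only the two new tokens cost anything, which is the effect that matters when many requests share long common prefixes.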
Performance highlights of Grok-2 and Grok-2-mini
In addition, in the most recent update of the third-party Lmsys Chatbot Arena leaderboard, which evaluates the performance of AI models, the main Grok-2 model secured second place with a strong Arena score of 1293 (based on 6686 votes).
This puts Grok-2 in second place among the world's strongest AI models, tied with Google's Gemini-1.5 Pro model and just behind the most recent version of OpenAI's ChatGPT-4o.
Grok-2-mini, which has also benefited from the recent improvements, has climbed to fifth place with an Arena score of 1268 (based on 7266 votes), just behind GPT-4o mini and Claude 3.5 Sonnet.
Both models are proprietary to xAI and reflect the company's commitment to advancing AI technology.
Grok-2 has particularly excelled in math tasks, where it ranks first. The model also holds strong positions in several other categories, including hard prompts, coding, and instruction following, where it consistently ranks near the top.
This performance puts Grok-2 ahead of other prominent models such as OpenAI's GPT-4o (May 2024), which is now ranked 4th.
Future developments
According to a reply Babushkin posted on X, the main advantage of using Grok-2-mini over the full Grok-2 model is its higher speed.
However, Babushkin promised that xAI would further improve Grok-2-mini's processing speed, which could make it an even more attractive option for users seeking high performance with less computational overhead.
The inclusion of Grok-2 and Grok-2-mini in the Chatbot Arena leaderboard, and their strong showing there, has attracted considerable attention within the AI community.
The models' success is a testament to xAI's continued innovation and its commitment to pushing the boundaries of what can be achieved with AI.
As xAI continues to refine its models, further improvements in speed and accuracy are expected, keeping Grok-2 and Grok-2-mini at the forefront of AI development.