Elon Musk announces Grok-1.5, approaching GPT-4-level performance

March 30, 2024

Just weeks after the open source release of Grok-1, Elon Musk's xAI has announced an updated version of its proprietary large language model (LLM): Grok-1.5.

Scheduled for release next week, Grok-1.5 offers improved reasoning and problem-solving capabilities and approaches the performance of well-known open and closed LLMs, including OpenAI's GPT-4 and Anthropic's Claude 3. It can handle long contexts, but it still trails Gemini 1.5 Pro, whose context window extends to as many as 1 million tokens.

Musk noted that Grok-1.5 will power xAI's ChatGPT-challenging chatbot on the X platform, while Grok-2, the successor to the new model, is still in training. He said the next version should be able to “outperform current AI in all metrics,” but did not share details about when it might be available.

What does Grok-1.5 bring with it?

xAI announced Grok-1 last November, saying the AI was modeled after The Hitchhiker's Guide to the Galaxy and would answer almost anything, helping humanity in its quest for understanding and knowledge regardless of background or political beliefs. According to xAI, Grok-1 outperformed Llama-2-70B and GPT-3.5 on benchmarks such as GSM8K, HumanEval and MMLU.

Now, with the release of Grok-1.5, the company is building on this work and delivering significant improvements over the previous model across all key benchmarks, including those related to coding and math tasks.

“In our testing, Grok-1.5 achieved a score of 50.6% on the MATH benchmark and a score of 90% on the GSM8K benchmark, two math benchmarks that cover a wide range of competition problems from elementary to high school level. Additionally, it scored 74.1% on the HumanEval benchmark, which assesses code generation and problem-solving skills,” xAI noted in a blog post.

On the MMLU benchmark, which evaluates the language understanding abilities of AI models across various tasks, the new model scored 81.3%, significantly surpassing Grok-1's 73%.

In addition, xAI confirmed that Grok-1.5 has a context window of up to 128,000 tokens (tokens are whole parts or subsections of words, images, videos, audio or code). This allows the model to take in and process large amounts of data at once, 16 times more than Grok-1, making it better at analyzing, summarizing and extracting information from long documents. It can also handle longer and more complex prompts while still retaining the ability to follow instructions.
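To put the context window figures in perspective, here is a rough sketch of the arithmetic in Python. Only the 128,000-token window and the 16x multiplier come from xAI's announcement; the Grok-1 figure is derived from them rather than quoted, and the helper names and the reply reserve are hypothetical choices for illustration.

```python
# Figures quoted in the article. Everything else here is illustrative.
GROK_1_5_CONTEXT_TOKENS = 128_000
CONTEXT_MULTIPLIER = 16  # "16 times more than Grok-1"


def implied_previous_context(current_tokens: int, multiplier: int) -> int:
    """Context length of the older model implied by the quoted multiplier."""
    return current_tokens // multiplier


def fits_in_context(document_tokens: int, context_tokens: int,
                    reserved_for_reply: int = 0) -> bool:
    """True if a document plus a reserved reply budget fits in the window.

    reserved_for_reply is a hypothetical budget for the model's answer.
    """
    return document_tokens + reserved_for_reply <= context_tokens


if __name__ == "__main__":
    # Derived from the quoted figures, not stated by xAI in the article.
    print(implied_previous_context(GROK_1_5_CONTEXT_TOKENS, CONTEXT_MULTIPLIER))

    # Assumption: if "128,000" is shorthand for 128 * 1,024 tokens,
    # the same ratio gives a slightly larger figure.
    print(implied_previous_context(128 * 1024, CONTEXT_MULTIPLIER))

    # A 100,000-token document with 4,000 tokens reserved for the reply.
    print(fits_in_context(100_000, GROK_1_5_CONTEXT_TOKENS, 4_000))
```

Going by the quoted numbers alone, the previous window works out to roughly 8,000 tokens, which is why long documents that had to be split up for Grok-1 could plausibly be handled in a single prompt by the new model.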

Getting closer to OpenAI and Anthropic

With improved reasoning and problem-solving capabilities, Grok-1.5 not only outperforms its predecessor on benchmarks but also approaches popular open and closed source models, including Gemini 1.5 Pro, GPT-4 and Claude 3.

On MMLU, for instance, Grok-1.5 at 81.3% outperforms the recently launched Mistral Large but lags behind Gemini 1.5 Pro (83.7%), GPT-4 (86.4% as of March 2023) and Claude 3 Opus (86.8%). A similar gap appeared on the GSM8K benchmark, with the xAI model falling just behind offerings from Google, OpenAI and Anthropic.
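The size of the MMLU gap is easy to work out from the reported scores. The short Python sketch below uses only the percentages cited above (with the Claude 3 Opus figure read as 86.8%); the function name is hypothetical, and the results are percentage-point differences, not relative improvements.

```python
# MMLU scores (percent) as reported in the article.
MMLU_SCORES = {
    "Grok-1.5": 81.3,
    "Gemini 1.5 Pro": 83.7,
    "GPT-4": 86.4,
    "Claude 3 Opus": 86.8,
}


def gaps_to(baseline: str, scores: dict[str, float]) -> dict[str, float]:
    """Percentage-point lead of every other model over the baseline."""
    base = scores[baseline]
    return {
        name: round(score - base, 1)
        for name, score in scores.items()
        if name != baseline
    }


if __name__ == "__main__":
    for model, gap in gaps_to("Grok-1.5", MMLU_SCORES).items():
        print(f"{model}: +{gap} points over Grok-1.5")
```

On these numbers, Grok-1.5 sits between about 2 and 6 percentage points behind the three leading models on MMLU.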

Notably, the only benchmark where Grok-1.5 appeared to have an edge was HumanEval, where it outperformed all models except Claude 3 Opus. xAI expects to continue this progress and deliver further performance gains with Grok-2, which Musk says should outperform current AI in all metrics. The model is currently being trained.

Brian Roemmele, a technology consultant, said that based on his work with Grok-1, Grok-2 “will be one of the most powerful LLM AI platforms when released. It will outperform OpenAI in almost every metric.”

Based on my research on open source Grok-1, I’m confident that Grok-2 will be one of the most powerful LLM AI platforms when released. It will outperform OpenAI in almost every metric.

— Brian Roemmele (@BrianRoemmele) March 29, 2024

Availability of Grok-1.5

As for Grok-1.5, xAI plans to begin rolling it out next week. The company says the model will initially be available to early testers and existing users of the Grok chatbot on the X platform (formerly Twitter), with real-time access to all posts on the platform. The rollout will be gradual, with the company improving the model and introducing several new features, likely including a new fun mode, and gradually making it available to a wider range of users.

Grok has a standard mode and a fun mode. Tonight we decided to add a crazy fun mode. It's next level.

— Elon Musk (@elonmusk) March 27, 2024

After making Grok available on X, Musk shared that the chatbot will also be enabled for all Premium subscribers who pay $8 per month. In another update, he also confirmed that accounts with a certain number of verified subscriber followers will receive Premium and Premium+ subscription benefits, including Grok, for free.
