Elon Musk announces Grok-1.5, approaching GPT-4-level performance

Just weeks after the open-source release of Grok-1, Elon Musk's xAI has announced an updated version of its proprietary large language model (LLM): Grok-1.5.

Scheduled for release next week, Grok-1.5 offers improved reasoning and problem-solving capabilities, approaching the performance of well-known open and closed LLMs, including OpenAI's GPT-4 and Anthropic's Claude 3. It can also handle long contexts, but still trails Gemini 1.5 Pro, whose context window extends to as many as 1 million tokens.

Musk noted that Grok-1.5 will power xAI's ChatGPT-challenging chatbot on the X platform, while Grok-2, the successor to the new model, is still in the training phase. He said the next version should be able to “outperform current AI in all metrics,” but didn’t share details about when it might be available.

What does Grok-1.5 bring with it?

xAI announced Grok-1 last November, saying the AI was modeled after The Hitchhiker's Guide to the Galaxy and would answer almost anything to help humanity in its quest for understanding and knowledge, regardless of background or political beliefs. According to xAI, Grok-1 outperformed Llama-2-70B and GPT-3.5 on benchmarks such as GSM8K, HumanEval and MMLU.

Now, with the release of Grok-1.5, the company is building on this work and delivering significant improvements over the previous model across all key benchmarks, including those related to coding and math tasks.

“In our tests, Grok-1.5 achieved a 50.6% score on the MATH benchmark and a 90% score on the GSM8K benchmark, two math benchmarks covering a wide range of grade school to high school competition problems. Additionally, it scored 74.1% on the HumanEval benchmark, which evaluates code generation and problem-solving abilities,” xAI noted in a blog post.

On the MMLU benchmark, which evaluates the language understanding abilities of AI models across a variety of tasks, the new model scored 81.3%, significantly surpassing Grok-1's 73%.

In addition, xAI confirmed that Grok-1.5 has a context window of up to 128,000 tokens (tokens are whole parts or subsections of words, images, videos, audio or code). This allows the model to take in and process large amounts of information at once, 16 times more than Grok-1, making it better suited to analyzing, summarizing and extracting information from long documents. It can also handle longer and more complex prompts while still retaining the ability to follow instructions.
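To put the context-window figures in perspective, the short Python sketch below works through the arithmetic using only the numbers quoted in this article. The helper name `implied_previous_context` is hypothetical, and the results are back-of-the-envelope estimates from rounded figures, not values published by xAI or Google.

```python
# Figures quoted in the article (rounded); not official specifications.
GROK_1_5_CONTEXT_TOKENS = 128_000       # Grok-1.5 context window
RATIO_OVER_GROK_1 = 16                  # "16 times more than Grok-1"
GEMINI_1_5_PRO_CONTEXT_TOKENS = 1_000_000


def implied_previous_context(current_tokens: int, ratio: int) -> int:
    """Return the context size implied for the earlier model (hypothetical helper)."""
    return current_tokens // ratio


# Context window implied for Grok-1 by the "16 times" claim.
grok_1_tokens = implied_previous_context(GROK_1_5_CONTEXT_TOKENS, RATIO_OVER_GROK_1)

# How many times larger Gemini 1.5 Pro's maximum window is than Grok-1.5's.
gemini_multiple = GEMINI_1_5_PRO_CONTEXT_TOKENS / GROK_1_5_CONTEXT_TOKENS

print(f"Implied Grok-1 context: about {grok_1_tokens:,} tokens")
print(f"Gemini 1.5 Pro window is about {gemini_multiple:.1f}x Grok-1.5's")
```

On these rounded figures, Grok-1's window works out to roughly 8,000 tokens, and Gemini 1.5 Pro's maximum window is a little under eight times the size of Grok-1.5's.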

Getting closer to OpenAI and Anthropic

With improved reasoning and problem-solving capabilities, Grok-1.5 not only outperforms its predecessor on benchmarks, but also approaches popular open and closed-source models, including Gemini 1.5 Pro, GPT-4 and Claude 3.

On MMLU, for instance, Grok-1.5 scores 81.3%, ahead of the recently launched Mistral Large but behind Gemini 1.5 Pro (83.7%), GPT-4 (86.4% as of March 2023) and Claude 3 Opus (86.8%). A similar gap appeared on the GSM8K benchmark, with the xAI model falling just behind offerings from Google, OpenAI and Anthropic.

Notably, the only benchmark where Grok-1.5 appeared to have an edge was HumanEval, where it outperformed all models except Claude 3 Opus. xAI expects to continue this progress and achieve further performance gains with Grok-2, which Musk says should outperform current AI in all metrics. The model is currently being trained.

Brian Roemmele, a technology consultant, said that based on his work with Grok-1, Grok-2 “will be one of the most powerful LLM AI platforms when released. It will outperform OpenAI in almost every way.”

Availability of Grok-1.5

As for Grok-1.5, xAI plans to begin deploying it next week. The company says the model will initially be available to early testers and existing users of the Grok chatbot on the X platform (formerly Twitter), with real-time access to all posts on the platform. The rollout will be gradual, with the company improving the model, introducing several new features (likely including a new fun mode) and progressively making it available to a wider range of users.

Musk has also shared on X that the chatbot will be enabled for all Premium subscribers, who pay $8 per month. In another update, he confirmed that accounts with a certain number of verified subscriber followers will receive Premium and Premium+ subscription benefits, including Grok, free of charge.
