
Do new AI reasoning models require new approaches to prompting?

The era of reasoning AI is in full swing.

After OpenAI once again sparked an AI revolution with its o1 reasoning model, launched in September 2024 – which takes longer to answer questions but pays off with higher performance, especially on complex, multi-step problems in math and science – the commercial AI space has been flooded with imitators and competitors.

There's DeepSeek's R1, Google's Gemini 2.0 Flash Thinking, and now LlamaV-o1, all aiming to offer the same kind of built-in "reasoning" as OpenAI's o1 and upcoming o3 model families. These models are trained to use chain-of-thought (CoT) prompting – or "self-prompting" – which forces them to reflect on their analysis mid-process, double back, check their own work, and ultimately arrive at a better answer than simply firing one off as fast as possible, the way other large language models (LLMs) do.
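To make that difference concrete, here is a minimal sketch, assuming the official OpenAI Python SDK and current API model names (the math problem is illustrative): a standard chat model has to be told to reason step by step in the prompt, while a reasoning model like o1 runs its own hidden chain of thought from a plainly stated problem.

```python
# Minimal sketch: manual chain-of-thought prompting on a standard chat model
# vs. a built-in reasoning model. Assumes the OpenAI Python SDK
# (pip install openai) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()
problem = "A train leaves at 2:15 pm averaging 48 mph. How far has it gone by 5:00 pm?"

# Standard model: chain-of-thought must be requested explicitly.
chat = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": problem + " Think step by step before answering."}],
)

# Reasoning model: o1 performs its own internal chain of thought,
# so the prompt simply states the problem.
reasoned = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": problem}],
)

print(chat.choices[0].message.content)
print(reasoned.choices[0].message.content)
```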

However, the high price of o1 and o1-mini ($15.00 per 1M input tokens for o1 vs. $1.25 per 1M input tokens for GPT-4o on OpenAI's API) has caused some to shy away from the claimed performance improvements. Is it really worth paying 12 times more than a standard, state-of-the-art LLM?

As it turns out, there is a growing number of converts – but the key to unlocking the real value of reasoning models may lie in users prompting them differently.

Shawn Wang (founder of the AI newsletter Latent Space) published a guest post on his Substack over the weekend from Ben Hylak, a former Apple interface designer for visionOS (the software that powers the Vision Pro spatial computing headset). The post went viral because it convincingly explains how Hylak gets OpenAI's o1 model to produce incredibly valuable results (for him).

In short, instead of writing prompts for the o1 model, human users should think of writing "briefs": more detailed explanations that provide lots of context upfront about what the user wants the model to output, who the user is, and in what format the model should output the information.
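Hylak's post doesn't include code, but the brief-style structure translates naturally to an API call. Below is a minimal sketch assuming the official OpenAI Python SDK; the persona, hike criteria, and output-format wording are illustrative, not taken from his screenshot.

```python
# A "brief"-style prompt for o1: goal, who the user is, context, and the
# desired output format, all packed into one message. Assumes the OpenAI
# Python SDK (pip install openai) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

brief = """GOAL: Recommend 3 day hikes near San Francisco for this Saturday.

WHO I AM: An intermediate hiker, comfortable with up to 10 miles and
2,500 ft of elevation gain; traveling by car; hiking with one friend.

CONTEXT: We want ocean views, moderate crowds, and a trailhead within
a 90-minute drive. We have already done Mount Tamalpais and Lands End.

OUTPUT FORMAT: A table with columns for trail name, distance, elevation
gain, drive time, and one sentence on why it fits the criteria above.
"""

# The whole brief goes in as a single user message rather than a
# back-and-forth chat exchange.
response = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": brief}],
)

print(response.choices[0].message.content)
```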

As Hylak writes on Substack:

“

Hylak also includes a great annotated screenshot of a sample prompt for o1 that produced useful results for him – a list of hikes.

The blog post was so helpful that OpenAI president and co-founder Greg Brockman re-shared it on his X account, writing: "o1 is a different kind of model. Great performance requires using it in a new way relative to standard chat models."

I tried it myself in my ongoing quest to become fluent in Spanish, and here was the result, for the curious. Maybe not as impressive as Hylak's well-constructed prompt and response, but it definitely shows great potential.

Even with non-reasoning LLMs like Claude 3.5 Sonnet, there may be room for regular users to improve their prompts and get better, less constrained results.

As Louis Arge, former Teton.ai engineer and current developer of the neuromodulation device openFUS, wrote on X: "One trick I've discovered is that LLMs trust their own prompts more than my prompts." He gave an example of how he convinced Claude to be "less of a coward" by first triggering "a fight" with it over its outputs.
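Arge didn't publish code, but his trick resembles a known pattern: seeding the conversation with a fabricated assistant turn so the model treats an instruction as its own prior statement. Here is a minimal sketch assuming the Anthropic Python SDK; the model name and the seeded wording are illustrative, not Arge's actual transcript.

```python
# Sketch of the "LLMs trust their own prompts" pattern: insert an assistant
# turn the model never actually produced, so it stays consistent with it.
# Assumes the Anthropic Python SDK (pip install anthropic) and an
# ANTHROPIC_API_KEY in the environment.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=500,
    messages=[
        {"role": "user",
         "content": "Critique this business plan honestly: ..."},
        # Fabricated assistant turn: the model reads this as something it
        # already committed to, and tends to follow through on it.
        {"role": "assistant",
         "content": "I will be blunt and specific, not hedge, and point "
                     "out real weaknesses."},
        {"role": "user", "content": "Go ahead."},
    ],
)

print(response.content[0].text)
```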

All of this shows that prompt engineering remains a valuable skill as the AI era continues.
