OpenAI's GPT-5 rollout is not going smoothly

August 9, 2025


The launch of OpenAI's long-awaited newest model, GPT-5, is off to a rocky start, to say the least.

Even setting aside the errors in charts and voice demos during yesterday's livestreamed presentation of the new model (actually four separate models and a "thinking" mode that can be engaged for three of them), a variety of user reports have emerged since GPT-5's release showing it failing on relatively easy problems that earlier OpenAI models, and rivals from competing AI labs, answer correctly.

For example, data scientist Colin Fraser posted screenshots showing GPT-5 getting a math proof wrong (whether 8.888 repeating is equal to 9; it is, of course, not).

It also failed on a simple algebra problem that elementary school students could probably nail: 5.9 = x + 5.11.
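For readers who want to check both answers themselves, here is a minimal Python sketch using exact arithmetic. The variable names are illustrative only, and the snippet says nothing about how GPT-5 arrived at its responses.

```python
from decimal import Decimal
from fractions import Fraction

# Claim 1: is 8.888... (8 repeating) equal to 9?
# If x = 8.888..., then 10x - x = 88.888... - 8.888... = 80, so x = 80/9.
repeating = Fraction(80, 9)
gap_to_nine = Fraction(9) - repeating
print(repeating == 9)   # False
print(gap_to_nine)      # 1/9

# Claim 2: solve 5.9 = x + 5.11 for x.
# Decimal avoids binary floating-point rounding in the subtraction.
x = Decimal("5.9") - Decimal("5.11")
print(x)                                       # 0.79
print(x + Decimal("5.11") == Decimal("5.9"))   # True
```

So 8.888 repeating falls short of 9 by exactly 1/9, and the algebra problem has the solution x = 0.79.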

Using GPT-5 to evaluate OpenAI's own faulty presentation charts did not go well either.

It also failed on a trickier math word problem (which, to be fair, initially stumped this human too, although Elon Musk's Grok 4 AI answered it correctly. As a hint, the flagstones in this case cannot be divided into smaller parts. They must stay intact as 80 separate units, so no halves or quarters).

The older 4o model performed better for me on at least one of these math problems. Unfortunately, OpenAI is slowly retiring these older models, including the previous default GPT-4o and the powerful reasoning model o3, although they will remain available in the application programming interface (API) for developers for the foreseeable future.

Not as good at coding as benchmarks indicate

Although OpenAI's internal benchmarks and some external third-party ones have shown GPT-5 surpassing all other models at coding, in real-world use Anthropic's recently updated Claude Opus 4.1 seems to do a better job of "one-shotting" certain tasks, that is, completing the user's desired application or software build to their specifications. See an example below from developer Justin Sun, posted on X:

Opus 4.1 one-shot test: "Create a 3D capybara petting zoo."

To be honest, this was pretty crazy: not only do the capybaras have pathfinding and movement, there are also individual pet affinity levels, a day/night switch, feeding and even a screenshot function pic.twitter.com/fikto3fkk4

– Justin (@Justinsunyt) August 7, 2025

In addition, a report from the security company SPLX found that OpenAI's internal safety layer left significant gaps in areas such as business alignment and susceptibility to prompt injection and obfuscated logic attacks.

A temperature check on how the model is faring with early AI adopters seems to indicate a cool reception.

AI influencer and former Googler Bilawal Sidhu posted a poll on X asking for a "vibe check" from his followers and the broader user base. So far, with 172 votes in, the overwhelming answer is "kinda mid."

Okay, GPT-5 Vibe Check

– Bilawal Sidhu (@Bilawalsidhu) August 7, 2025

And as the pseudonymous AI Leaks and News account wrote, "The overwhelming consensus on GPT-5 from both X and the Reddit AMA is overwhelmingly negative."

The overwhelming consensus on GPT-5 from both X and the Reddit AMA is overwhelmingly negative

Most users are annoyed by the broken model picker, and non-Pro users don't have any access to legacy models

What are your first thoughts about GPT-5?

– AI Leaks and News (@Aileaksandnews) August 8, 2025

Tibor Blaho, senior engineer at AIPRM and a popular AI leaks and news poster on X, summed up the various problems with the GPT-5 rollout in an excellent post. He highlighted that one of the new marquee features, an automatic "router" in ChatGPT that assigns a thinking or non-thinking mode to the underlying GPT-5 model depending on the difficulty of the query, has become one of the main complaints, given that the model seemed to default to non-thinking mode for many users.

A bit sad how the GPT-5 launch is going so far, especially after a long wait and high expectations

– The automatic switching between models (the router) appears to be partially broken/unreliable

– It is unclear which model you are really interacting with (standard or mini, …

– Tibor Blaho (@Btibor91) August 8, 2025

Competition waiting in the wings

So the sentiment toward GPT-5 is anything but universally positive, and that highlights a big problem for OpenAI as it faces increasing competition from large U.S. rivals such as Google and Anthropic, and from a growing list of free, open source and high-performance Chinese LLMs offering features that U.S. models lack.

Take Alibaba's Qwen Team of AI researchers, who just today updated their high-performing Qwen 3 model to have a 1 million token context, giving users the ability to exchange almost 4x as much information with the model in a single back-and-forth interaction as GPT-5 offers.

Given that OpenAI's other big release this week, a set of new open source GPT models, also received a mixed reception from early users, things are not looking up right now for the number one dedicated AI company (700 million weekly active ChatGPT users as of this month).

In fact, this is further illustrated by users of the betting market Polymarket, who overwhelmingly decided after the release of GPT-5 that Google would probably have the best AI model by the end of this month, August 2025.

Other power users, such as OthersideAI co-founder and CEO Matt Shumer, who received early access to GPT-5 and blogged about it positively in a review, said that views would shift once more people have figured out the best ways to use the new model and have adapted their integration approaches:

Many people who are having a bad experience are using GPT-5 in agent harnesses that aren't yet optimized.

For every new model release, there is a time lag between release + when companies that integrate the model really make it work well.

Agent companies rush to …

– Matt Shumer (@mattshumer_) August 8, 2025

While it is still early for GPT-5, and the sentiment could change dramatically as more users get their hands on it and try it for various tasks, early indications do not suggest a release as successful as earlier ones such as GPT-4 or even the newer 4o and o3. And that is a concerning indicator for a company that has just raised another round of financing but remains unprofitable because of its high research and development costs.
