
OpenAI introduces GPT-4o, the "omni" model now powering ChatGPT

OpenAI announced a new flagship generative AI model on Monday that it's calling GPT-4o; the "o" stands for "omni," referring to the model's ability to handle text, speech and video. GPT-4o is set to roll out "iteratively" across the company's developer- and consumer-facing products over the next few weeks.

Mira Murati, CTO of OpenAI, said that GPT-4o provides "GPT-4-level" intelligence but improves on GPT-4's capabilities across multiple modalities and media.

"GPT-4o reasons across voice, text and vision," Murati said Monday during a streamed presentation at OpenAI's offices in San Francisco. "And that's incredibly important because we're looking at the future of interaction between ourselves and machines."

GPT-4 Turbo, OpenAI's previous "most advanced" model, was trained on a combination of images and text and could analyze images and text to perform tasks such as extracting text from images or describing the content of those images. But GPT-4o adds speech to the mix.

What does this make possible? Quite a few things.

Photo credit: OpenAI

GPT-4o significantly improves the experience in OpenAI's AI-powered chatbot, ChatGPT. The platform has long offered a voice mode that transcribes the chatbot's responses using a text-to-speech model, but GPT-4o supercharges that mode, allowing users to interact with ChatGPT more like an assistant.

For example, users can ask the GPT-4o-powered ChatGPT a question and interrupt it while it's answering. According to OpenAI, the model delivers "real-time" responsiveness and can even pick up on nuances in a user's voice, generating voices in "various emotional styles" (including singing) in response.

GPT-4o also upgrades ChatGPT's vision capabilities. Given a photo – or a desktop screen – ChatGPT can now quickly answer related questions, ranging from "What's going on in this software code?" to "What brand of blouse is this person wearing?"

The ChatGPT desktop app being used for a coding task.
Photo credit: OpenAI
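
GPT-4o's image understanding isn't limited to the ChatGPT apps; it is also reachable through OpenAI's API. As a rough sketch of the kind of question described above, the snippet below sends an image URL and a text prompt to GPT-4o using OpenAI's official Python client. The image URL and the question are placeholders, not details from the article.

```python
# Hypothetical sketch (not from the article): asking GPT-4o about an image
# through OpenAI's chat completions API. URL and question are placeholders.
from openai import OpenAI

client = OpenAI()  # expects the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's going on in this software code?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/screenshot.png"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

A local screenshot can be passed the same way as a base64-encoded data URL instead of a hosted image.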

These capabilities will evolve further in the future, Murati says. While today GPT-4o can look at a picture of a menu in another language and translate it, down the road the model could allow ChatGPT to, for instance, "watch" a live sports game and explain the rules to you.

"We know that these models are getting increasingly complex, but we want the experience of interaction to actually become more natural and easy, so that you don't have to focus on the interface at all, but can just focus on collaborating with ChatGPT," Murati said. "Over the past few years we've focused a lot on improving the intelligence of these models… But this is the first time we're really taking a big step forward when it comes to ease of use."

According to OpenAI, GPT-4o is also more multilingual, offering improved performance in around 50 languages. And in OpenAI's API and Microsoft's Azure OpenAI Service, GPT-4o is twice as fast as, half the price of, and has higher rate limits than GPT-4 Turbo, the company said.
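
For developers already using GPT-4 Turbo, trying GPT-4o in the API is largely a matter of changing the model name. Here's a minimal text-only sketch with OpenAI's Python client; the prompt is a placeholder.

```python
# Minimal sketch: a plain text call to GPT-4o with OpenAI's Python client.
# Swapping the model name is the main change for code that previously
# targeted GPT-4 Turbo. The prompt is a placeholder.
from openai import OpenAI

client = OpenAI()  # expects the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4o",  # previously e.g. "gpt-4-turbo"
    messages=[
        {"role": "user", "content": "Summarize in one sentence what an 'omni' model is."}
    ],
)
print(response.choices[0].message.content)
```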

Voice, however, isn't part of the GPT-4o API for all customers yet. Citing the risk of misuse, OpenAI plans to first roll out support for GPT-4o's new audio capabilities to "a small group of trusted partners" in the coming weeks.

GPT-4o is available today in ChatGPT's free tier, as well as to subscribers of OpenAI's premium ChatGPT Plus and Team plans, who get "5x higher" message limits. (OpenAI notes that ChatGPT will automatically switch to GPT-3.5, an older and less capable model, when users hit the rate limit.) The improved ChatGPT voice experience based on GPT-4o will arrive in alpha for Plus users within the next month or so, alongside business-oriented options.

In related news, OpenAI announced it's releasing a refreshed ChatGPT interface on the web with a new, more "conversational" home screen and message layout, as well as a desktop version of ChatGPT for macOS that lets users ask questions via a keyboard shortcut or take and discuss screenshots. ChatGPT Plus users will get access to the app first, starting today, with a Windows version arriving later in the year.

Elsewhere, the GPT Store, OpenAI's library of, and creation tools for, third-party chatbots powered by its AI models, is now available to users of ChatGPT's free tier. And free users can take advantage of ChatGPT features that were previously paywalled, such as a memory capability that allows ChatGPT to "remember" preferences for future interactions, uploading files and photos, and searching the web for answers to timely questions.

Read more about OpenAI's spring event on TechCrunch
