OpenAI on Thursday unveiled GPT-4o mini, its latest small AI model. The company says GPT-4o mini, which is cheaper and faster than OpenAI's current state-of-the-art AI models, is being released to developers and to consumers through the ChatGPT web and mobile app starting today, with enterprise users gaining access next week.
The company says GPT-4o mini outperforms industry-leading small AI models on text and image reasoning tasks. As small AI models improve, they have become increasingly popular with developers due to their speed and cost-effectiveness compared with larger models like GPT-4 Omni or Claude 3.5 Sonnet. They are a useful option for high-volume, simple tasks that developers may repeatedly call on an AI model to perform.
GPT-4o mini will replace GPT-3.5 Turbo as the smallest model OpenAI offers. The company claims its latest AI model scores 82% on MMLU, a benchmark measuring reasoning ability, compared with 79% for Gemini 1.5 Flash and 75% for Claude 3 Haiku, according to data from Artificial Analysis. On MGSM, which measures mathematical reasoning, GPT-4o mini scored 87%, compared with 78% for Flash and 72% for Haiku.
In addition, OpenAI says GPT-4o mini is significantly cheaper to run than its predecessors, and more than 60% cheaper than GPT-3.5 Turbo. Today, GPT-4o mini supports text and vision in the API, and OpenAI says the model will gain video and audio capabilities in the future.
“For every corner of the world to be empowered by AI, we need to make the models much more affordable,” said Olivier Godement, head of product API at OpenAI, in an interview with TechCrunch. “I think GPT-4o mini is a really big step forward in that direction.”
For developers building on OpenAI's API, GPT-4o mini costs 15 cents per million input tokens and 60 cents per million output tokens. The model has a context window of 128,000 tokens, roughly the length of a book, and a knowledge cutoff of October 2023.
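To see what those per-million-token rates mean in practice, here is a minimal sketch that estimates the cost of a single API call at the prices quoted above. The token counts in the example are hypothetical, and the helper function is purely illustrative, not part of OpenAI's SDK.

```python
# Sketch: estimating a GPT-4o mini request cost from the quoted rates
# ($0.15 per 1M input tokens, $0.60 per 1M output tokens).
INPUT_RATE_PER_M = 0.15   # USD per million input tokens
OUTPUT_RATE_PER_M = 0.60  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one API call."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_RATE_PER_M

# A hypothetical request with 2,000 input tokens and 500 output tokens
# costs 0.0003 + 0.0003 = $0.0006 — fractions of a cent per call.
print(f"${estimate_cost(2_000, 500):.6f}")
```

At these rates, even a million such calls would cost on the order of a few hundred dollars, which is the scale argument behind using small models for high-volume tasks.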
OpenAI declined to disclose exactly how big GPT-4o mini is, but said it's roughly on par with other small AI models such as Llama 3 8B, Claude Haiku and Gemini 1.5 Flash. However, the company claims GPT-4o mini is faster, more cost-efficient and smarter than industry-leading small models, based on pre-launch testing in the chatbot arena of LMSYS.org. Early independent testing appears to confirm this.
“Compared to comparable models, GPT-4o mini is very fast, with a median output speed of 202 tokens per second,” said George Cameron, co-founder of Artificial Analysis, in an email to TechCrunch. “This is more than twice as fast as GPT-4o and GPT-3.5 Turbo and represents a compelling proposition for speed-sensitive use cases, including many consumer applications and agentic approaches to using LLMs.”
Separately, OpenAI announced new tools for enterprise customers on Thursday. In a blog post, OpenAI announced the Enterprise Compliance API to help firms in highly regulated industries such as finance, healthcare, legal, and government comply with logging and auditing requirements.
The company says these tools will allow administrators to audit their ChatGPT Enterprise data and take appropriate action. The API will provide records of time-stamped interactions, including conversations, uploaded files, workspace users, and more.
OpenAI is also giving admins more granular control over workspace GPTs, custom versions of ChatGPT built for specific business use cases. Previously, admins could only fully allow or block GPT actions created in their workspace, but now workspace owners can create an approved list of domains that GPTs can interact with.