OpenAI introduces Realtime API and other features for developers

OpenAI didn't release any new models at its Dev Day event, but the new API features will delight developers who want to use its models to build powerful apps.

OpenAI has had a difficult few weeks, with CTO Mira Murati and other senior researchers joining the ever-growing list of former employees. The company is also under increasing pressure from competing flagship models, including open-source ones, that offer developers cheaper and more powerful options.

The new features OpenAI introduced include the Realtime API (in beta), vision fine-tuning, and cost-saving tools such as Prompt Caching and Model Distillation.

Realtime API

The Realtime API is probably the most exciting new feature, even if it is still in beta. It lets developers build low-latency speech-to-speech experiences into their apps without chaining together separate speech recognition and text-to-speech models.

With this API, developers can build apps that hold real-time conversations with AI, such as voice assistants or language learning tools, all through a single API call. It's not quite the seamless experience of ChatGPT's Advanced Voice Mode, but it comes close.
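The Realtime API is driven by JSON events sent over a persistent WebSocket connection. As a minimal sketch, here is how a client-side `session.update` event might be constructed; the event name follows the beta announcement, but the exact field set here is illustrative, not a definitive schema:

```python
import json

def session_update(voice="alloy", instructions="You are a helpful assistant."):
    """Build a session.update client event for the Realtime API's
    WebSocket protocol (field names are illustrative)."""
    return json.dumps({
        "type": "session.update",
        "session": {
            "voice": voice,
            "instructions": instructions,
            # Request both audio and text so transcripts accompany speech.
            "modalities": ["audio", "text"],
        },
    })

event = session_update(instructions="Answer in one short sentence.")
```

In practice the serialized event would be sent over the open WebSocket, and the model's audio would stream back as server events on the same connection.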

However, at around $0.06 per minute of audio input and $0.24 per minute of audio output, it isn't cheap.
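Those per-minute rates add up quickly in conversational apps. A quick back-of-the-envelope estimator, using the prices quoted above:

```python
def realtime_cost(input_minutes, output_minutes,
                  input_rate=0.06, output_rate=0.24):
    """Estimate audio cost in USD at the announced beta rates:
    $0.06/min of audio in, $0.24/min of audio out."""
    return input_minutes * input_rate + output_minutes * output_rate

# A 10-minute conversation with roughly equal talk time:
cost = realtime_cost(input_minutes=5, output_minutes=5)
print(f"${cost:.2f}")  # $1.50
```

So even a short support call costs on the order of a dollar or two, which matters at scale.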

Fine-tuning vision

With vision fine-tuning in the API, developers can improve their models' ability to understand and interact with images. By fine-tuning GPT-4o on image data, developers can build applications that excel at tasks like visual search or object detection.
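Vision fine-tuning data uses the same chat-style JSONL format as text fine-tuning, with images supplied as `image_url` content parts (a URL or base64 data URI). A sketch of a helper that builds one training example; the prompt text and field layout are illustrative:

```python
import json

def vision_example(image_url, label):
    """Build one chat-format training example pairing an image
    with the desired assistant answer."""
    return {
        "messages": [
            {"role": "user", "content": [
                {"type": "text", "text": "What traffic sign is shown?"},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]},
            {"role": "assistant", "content": label},
        ]
    }

# One JSONL line of a hypothetical traffic-sign dataset:
line = json.dumps(vision_example("https://example.com/sign.jpg", "stop sign"))
```

A training file is just one such JSON object per line, uploaded to the fine-tuning endpoint like any other dataset.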

The feature is already being used by companies such as Grab, which improved the accuracy of its mapping service by fine-tuning the model to recognize traffic signs in street-level images.

OpenAI also gave an example of GPT-4o generating additional content for a website after being fine-tuned to stylistically match the site's existing content.

Prompt Caching

To improve cost efficiency, OpenAI introduced Prompt Caching, which reduces the cost and latency of frequently repeated API calls. By reusing recently processed input, developers can cut costs by up to 50% and improve response times. The feature is especially useful for applications that involve long conversations or repeated context, such as chatbots and customer support tools.

Cached input can save as much as 50% of the input token cost.

Pricing comparison of cached and uncached input tokens for the OpenAI API. Source: OpenAI
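To see what the discount means in practice, consider a chatbot that resends a long static system prompt with every request. A rough estimator, assuming cached input tokens are billed at 50% of the normal rate (the per-million-token rate here is illustrative):

```python
def input_cost(cached_prefix_tokens, fresh_tokens,
               rate_per_m=2.50, cached_discount=0.5):
    """Estimated input cost in USD when a shared prompt prefix
    hits the cache; rate_per_m is an illustrative price per 1M tokens."""
    cached = cached_prefix_tokens * rate_per_m * cached_discount / 1_000_000
    fresh = fresh_tokens * rate_per_m / 1_000_000
    return cached + fresh

# 100k-token static prefix served from cache + 10k tokens of new input:
with_cache = input_cost(100_000, 10_000)     # 0.125 + 0.025 = $0.15
without_cache = input_cost(0, 110_000)       # $0.275
```

Because only the unchanged prefix is cached, prompts should be structured with their static content (system prompt, few-shot examples) first and the variable user input last.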

Model distillation

Model distillation lets developers fine-tune smaller, cheaper models using the outputs of larger, more powerful ones. This matters because distillation previously required multiple independent steps and tools, making it a time-consuming and error-prone process.

Before OpenAI's built-in Model Distillation feature, developers had to orchestrate each part of the process manually: generating data from larger models, preparing fine-tuning datasets, and measuring performance with assorted tools.

Developers can now automatically store input-output pairs from larger models like GPT-4o and use those pairs to fine-tune smaller models like GPT-4o-mini. The whole process of dataset creation, fine-tuning, and evaluation becomes more structured, automated, and efficient.
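The flow above can be sketched as two request payloads: one that asks the large "teacher" model to persist its completions, and one that fine-tunes the smaller "student" on the resulting dataset. The `store` and `metadata` parameters follow the announcement, but treat the exact shapes as illustrative rather than a complete API reference:

```python
def teacher_request(prompt):
    """Chat completion request to the large teacher model, flagged
    so the input/output pair is stored for later distillation."""
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
        "store": True,                           # persist this completion
        "metadata": {"purpose": "distillation"}, # tag for easy filtering
    }

def student_finetune_job(training_file_id):
    """Fine-tuning job that trains the smaller student model on a
    file built from the stored teacher completions."""
    return {
        "model": "gpt-4o-mini",
        "training_file": training_file_id,
    }
```

In practice, the stored completions are filtered and exported as a training file in the dashboard, then evaluated against the teacher to confirm the student is good enough for the task.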

The streamlined developer workflow, lower latency, and reduced costs make OpenAI's GPT-4o model an attractive option for developers looking to ship high-performance apps quickly. It will be interesting to see what applications the multimodal capabilities enable.
