Voice AI technology is evolving rapidly and promises to reshape business operations, from customer support to internal communications.
In recent weeks, OpenAI has launched new tools that make it easier to build AI voice assistants and has expanded its Advanced Voice Mode to more paying customers. Microsoft has updated its Copilot AI with enhanced voice capabilities and reasoning features, while Meta has introduced voice AI to its messaging apps.
According to IBM Distinguished Engineer Chris Hay, these advances could “change the way firms communicate with customers.”
Voice AI for customer support
Hay sees a dramatic shift in the way firms of all sizes interact with their customers and manage their operations. He says the democratization of AI-powered communications tools could create unprecedented opportunities for small businesses to compete with larger firms.
“We are entering the age of AI contact centers,” says Hay. “Any corner store can provide the same level of customer support as a large company. This is incredible.”
According to Hay, the key is the development of real-time APIs that enable ultra-low-latency communication between humans and AI. This allows for the kind of back-and-forth exchanges that people expect in everyday conversations.
“To have a natural language conversation, the latency of the models must be around 200 milliseconds,” notes Hay. “I don’t want to wait three seconds... I want to get an answer quickly.”
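The latency figure Hay cites can be turned into a simple check. The sketch below is illustrative only: it times one request-response round trip and compares it with a 200-millisecond budget. Treating "around 200 milliseconds" as a hard ceiling is an assumption, and `fake_assistant` is a hypothetical stand-in for a real voice API call, not any vendor's actual interface.

```python
import time
from typing import Callable

# Assumption: Hay's "around 200 milliseconds" is treated here as a hard ceiling.
LATENCY_BUDGET_MS = 200.0


def measure_latency_ms(respond: Callable[[str], str], prompt: str) -> float:
    """Return the wall-clock time, in milliseconds, of one call to respond()."""
    start = time.perf_counter()
    respond(prompt)
    return (time.perf_counter() - start) * 1000.0


def within_budget(latency_ms: float, budget_ms: float = LATENCY_BUDGET_MS) -> bool:
    """True if the measured latency fits inside the conversational budget."""
    return latency_ms <= budget_ms


def fake_assistant(prompt: str) -> str:
    """Hypothetical stand-in for a real-time voice API; sleeps to simulate work."""
    time.sleep(0.05)
    return "How can I help?"


if __name__ == "__main__":
    latency = measure_latency_ms(fake_assistant, "Hello")
    print(f"round trip: {latency:.0f} ms, within budget: {within_budget(latency)}")
```

By this measure, the three-second wait Hay describes would miss the budget by a factor of 15.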
New voice AI technology is becoming accessible to developers through APIs offered by firms like OpenAI. “There is a production-grade developer API where anyone can just call the API and build that functionality themselves, with very limited model knowledge and development knowledge,” says Hay.
The impact could be far-reaching. Hay predicts that a “massive wave of virtual audio assistants” will emerge in the coming months and years as firms of all sizes adopt the technology. This may lead to more personalized customer support, the emergence of new AI communications industries, and a shift in jobs toward AI management.
For consumers, the experience could soon become indistinguishable from chatting with a human agent. Hay points to recent demonstrations of AI-generated podcasts from Google's NotebookLM as evidence of how far the technology has advanced.
“If nobody had told me this was AI, I honestly wouldn’t have believed it,” he says of one such demonstration. “The voices are emotional. Now you’re talking to the AI in real time, and it’s going to get even better.”
AI voices become personal in the truest sense of the word
Big tech firms are scrambling to enhance the personalities and capabilities of their AI assistants. Meta's approach is to introduce celebrity voices for its AI assistant across its messaging platforms. Users can select AI-generated voices based on stars like Awkwafina and Judi Dench.
However, with the promise come potential risks. Hay acknowledges that the technology could be a boon for scammers and fraudsters if it falls into the wrong hands.
“In the next six months there might be a new generation of fraudsters who have authentic-sounding voices, who sound just like the podcast hosts you’ve heard, with tone and emotion in their voice,” he warns. “Models that essentially exist to get money out of people.” This could eliminate traditional red flags like unusual accents or robotic-sounding voices. “It’s hidden,” says Hay.
He compares the situation to a plot point in the Harry Potter novels, where characters have to ask personal questions to confirm a person's identity. In the real world, people may have to use similar tactics.
“How do I know I’m talking to my bank?” Hay muses. “How do I know I’m talking to my daughter asking for money? People will have to get used to asking such questions.”
Despite these concerns, Hay remains optimistic about the technology's potential. He points out that voice AI could significantly improve accessibility, allowing people to interact with businesses and government services in their native language.
“Think about things like welfare applications, right? And you get all these confusing documents. Think about the ability to call (your care provider) in your native language and then translate things, really complex documents, into simpler language that you’re more likely to understand.”
AI voice technology is still evolving, and Hay believes we’re only scratching the surface of possible applications. He envisions a future where AI assistants are seamlessly integrated into wearable devices like the Orion augmented reality glasses that Meta recently introduced.
“If this real-time API is in my glasses, I can talk to it in real time on the go,” says Hay. “Combined with AR, this could be a game-changer.” Although he acknowledges the ethical challenges, including a recent incident in which smart glasses were used to instantly identify people, Hay remains optimistic about the prospects of this technology.
“The ethics must be worked out, and ethics are crucial,” he admits. “But I’m optimistic.”