
Amazon is attempting to transplant Alexa’s “brain” with generative AI

Amazon is preparing to relaunch its voice-controlled digital assistant Alexa as an artificial intelligence “agent” capable of handling practical tasks, as the tech giant struggles to resolve the challenges that have complicated the system's AI overhaul.

The $2.4 trillion company has spent the past two years attempting to redesign Alexa, its conversational system built into 500 million consumer devices worldwide, replacing the software's “brains” with generative AI.

Rohit Prasad, who leads Amazon's artificial general intelligence (AGI) team, told the Financial Times that the voice assistant would still need to overcome some technical hurdles before launch.

These include solving the issue of “hallucinations” or fabricated answers, response speed or “latency,” and reliability. “Hallucinations have to be near zero,” Prasad said. “It continues to be an open issue within the industry, but we’re working hard on it.”

Amazon executives' vision is to rework Alexa, currently used for a limited number of simple tasks such as playing music and setting alarms, into an “agentic” product that acts as a personalised concierge. That could cover everything from restaurant suggestions to configuring bedroom lighting based on a person's sleep cycles.

The redesign of Alexa has been in the works since the launch of Microsoft-backed OpenAI's ChatGPT in late 2022. While Microsoft, Google, Meta and others have quickly embedded generative AI into their computing platforms and improved their software services, critics have questioned whether Amazon can resolve its technical and organizational difficulties in time to compete with its rivals.

According to several employees who have worked on Amazon's voice assistant teams in recent years, the effort has been fraught with complications and follows years of AI research and development.

Several former employees said the long wait for a rollout was largely due to the unexpected difficulty of replacing, and combining, the simpler, predefined algorithms that underpin Alexa with more powerful but unpredictable large language models.

In response, Amazon said it was “working hard to enable much more proactive and capable assistance” from its voice assistant. It added that a technical implementation of this scale, into a live service and a wide range of devices used by customers around the world, was unprecedented, and not as simple as overlaying an LLM onto the Alexa service.

Prasad, Alexa's former chief architect, said the company's release of its in-house Amazon Nova models last month – led by its AGI team – was motivated in part by the specific needs of AI-powered applications such as Alexa for optimal speed, cost and reliability, to “reach the last mile, which is really hard.”

To function as an agent, Alexa's “brain” must be able to call hundreds of third-party software applications and services, Prasad said.

“Sometimes we underestimate how many services are integrated into Alexa, and it's a massive number. These applications get billions of requests a week, so when you're trying to make reliable and fast actions happen . . . you have to be able to do it in a very cost-effective way,” he added.

The complexity arises because Alexa users expect quick answers as well as an extremely high level of accuracy. Those qualities are at odds with the inherently probabilistic nature of today's generative AI, statistical software that predicts words based on speech and language patterns.

Some former employees also point to the difficulty of preserving the assistant's original characteristics, including its consistency and functionality, while imbuing it with new generative qualities such as creativity and free-flowing dialogue.

Because of the more personalized and conversational nature of LLMs, the company also plans to hire experts to shape the AI's personality, voice and diction so that it remains familiar to Alexa users, according to a person familiar with the matter.

A former senior member of the Alexa team said that while LLMs were very sophisticated, they carried risks, such as producing answers that were “sometimes completely made up.”

“At the scale that Amazon operates, that could happen many times a day,” they said, damaging Amazon's brand and reputation.

In June, Mihail Eric, a former machine learning scientist at Alexa and founding member of its conversational modeling team, said publicly that Amazon had “dropped the ball” on becoming the “clear leader in conversational AI” with Alexa.

Eric said that despite its strong scientific talent and “huge” financial resources, the company was “riddled with technical and bureaucratic problems,” suggesting that “the data was poorly annotated” and “the documentation was either non-existent or outdated.”

According to two former employees who worked on Alexa-related AI, the legacy technology underlying the voice assistant was inflexible and difficult to change quickly, burdened by a clunky and disorganized code base and an engineering team “spread too thin.”

The original Alexa software, built on technology acquired from British start-up Evi in 2012, was a question-and-answer machine that searched within a defined universe of facts to find the right response, such as the day's weather or a specific song in your music library.

The new Alexa uses a variety of AI models to recognize and transcribe voice requests and generate responses, as well as to detect policy violations, such as flagging inappropriate responses and hallucinations. Building software to translate between the legacy systems and the new AI models has been a major obstacle to the Alexa-LLM integration.

The models include Amazon's in-house software, including the latest Nova models, as well as Claude, the AI model from start-up Anthropic, in which Amazon has invested $8 billion over the last 18 months.

“The most important challenge with AI agents is making sure they are safe, reliable and predictable,” Anthropic chief executive Dario Amodei told the FT last year.

Agentic AI software needs to get to the point “where . . . people can actually have trust in the system,” he added. “Once we get to that point, then we'll release these systems.”

A current employee said further steps were needed, such as adding child-safety filters and testing custom integrations with Alexa such as smart lights and the Ring doorbell.

“The problem is reliability – it's supposed to work almost 100 per cent of the time,” the employee added. “That's why you see us . . . or Apple or Google shipping slowly and incrementally.”

Numerous third parties developing “skills” or features for Alexa said they were unsure when the new generative AI-enabled device would come to market and how they might create new features for it.

“We are waiting for details and understanding,” said Thomas Lindgren, co-founder of Swedish content developer Wanderword. “When we started working with them they were much more open . . . then over time they changed.”

Another partner said that after an initial period of “pressure” exerted by Amazon on developers to prepare for the next generation of Alexa, things had gone quiet.

An ongoing challenge for Amazon's Alexa team – which was hit by major layoffs in 2023 – is how to make money. Figuring out how to make the assistant “cheap enough to operate at scale” will be a big task, said Jared Roesch, co-founder of generative AI group OctoAI.

Options being discussed include creating a new Alexa subscription service or taking a cut of sales of goods and services, a former Alexa employee said.

Prasad said Amazon's goal was to develop a variety of AI models that could serve as “building blocks” for a wide range of applications beyond Alexa.

“We are always focused on customers and practical AI; we don't do science for science's sake,” Prasad said. “We are doing this . . . to deliver customer value and impact, which in this age of generative AI is more important than ever, because customers want to see a return on investment.”
