Amazon Web Services (AWS), Amazon's cloud computing division, is launching a new tool to combat hallucinations – cases in which an AI model generates inaccurate or unreliable output.
The Automated Reasoning Checks service, announced at the AWS re:Invent 2024 conference in Las Vegas, validates a model's responses by cross-referencing them against customer-provided information. In a press release, AWS claims that Automated Reasoning checks are the “first” and “only” safeguard against hallucinations.
But that's, well… putting it generously.
Automated Reasoning checks are nearly identical to the proofreading feature Microsoft introduced this summer, which also flags AI-generated text that may be factually incorrect. And Google offers a tool in Vertex AI, its AI development platform, that lets customers “ground” models using third-party data, their own datasets, or Google Search.
In any case, Automated Reasoning checks, available through AWS's Bedrock model hosting service (specifically the Guardrails tool), attempt to figure out how a model arrived at an answer – and to discern whether that answer is correct. Customers upload information to establish a ground truth of sorts, and Automated Reasoning checks create rules that can then be refined and applied to a model.
As a model generates answers, they're checked against the Automated Reasoning checks and, in the event of a probable hallucination, the correct answer is derived from the ground truth. That answer is presented alongside the likely falsehood so that customers can see how far off base the model may have been.
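The flow AWS describes – ground truth in, rules out, answers checked against them – can be illustrated with a minimal Python sketch. Everything here (the policy keys, the `claim=value` convention, the function names) is hypothetical for illustration, not the actual Bedrock API:

```python
# Toy sketch of a ground-truth check in the spirit of Automated Reasoning
# checks. All names and formats are illustrative, not AWS's implementation.

GROUND_TRUTH = {
    "refund_window_days": 30,
    "free_shipping_minimum": 50,
}

def extract_claims(answer: str) -> dict:
    """Toy claim extractor: pull 'key=value' facts embedded in a model answer."""
    claims = {}
    for token in answer.split():
        if "=" in token:
            key, _, value = token.partition("=")
            claims[key] = int(value)
    return claims

def check_answer(answer: str) -> list[str]:
    """Compare each extracted claim to the ground truth; report discrepancies."""
    findings = []
    for key, value in extract_claims(answer).items():
        expected = GROUND_TRUTH.get(key)
        if expected is not None and expected != value:
            findings.append(f"{key}: model said {value}, ground truth is {expected}")
    return findings

# A probable hallucination: the model claims a 60-day refund window.
issues = check_answer("Our policy: refund_window_days=60 free_shipping_minimum=50")
print(issues)
```

The key idea mirrors the article: the check doesn't make the model smarter, it just surfaces the discrepancy alongside the ground-truth value so a human can judge.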
According to AWS, PwC is already using Automated Reasoning checks to build AI assistants for its clients. And Swami Sivasubramanian, vice president of AI and data at AWS, suggested that tools like these are exactly what's drawing customers to Bedrock.
“With the launch of these new capabilities,” he said in a statement, “we're innovating on behalf of our customers to solve some of the biggest challenges the entire industry faces in moving generative AI applications into production.” Bedrock's customer base has grown 4.7x in the last year to tens of thousands of customers, Sivasubramanian added.
But as one expert told me this summer, trying to eliminate hallucinations from generative AI is like trying to remove hydrogen from water.
AI models hallucinate because they don't actually “know” anything. They're statistical systems that identify patterns in a set of data and predict which data comes next based on previously seen examples. It follows that a model's answers aren't answers at all, but predictions of how a question should be answered – within a margin of error.
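That point – prediction rather than knowledge – can be made concrete with a deliberately tiny "language model": a bigram counter. The corpus and setup below are invented for illustration; real models are vastly larger, but the failure mode is the same in kind:

```python
# Toy illustration of why statistical prediction isn't knowledge: a bigram
# "model" that predicts the next word purely from counts in its training data.
from collections import Counter, defaultdict

# One correct sentence and one wrong one, as found in messy training data.
corpus = "the capital of france is paris . the capital of france is lyon .".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(word: str) -> str:
    """Return the most frequent continuation – a prediction, not a fact."""
    return bigrams[word].most_common(1)[0][0]

# "is" was followed by "paris" and "lyon" equally often; the model just
# picks one. It has no concept of one continuation being false.
print(predict("is"))
```

Scale the corpus up by a few trillion tokens and the predictions get much better, but they never stop being predictions – which is the expert's hydrogen-from-water point.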
AWS claims that Automated Reasoning checks use “logically correct” and “testable arguments” to arrive at their conclusions. But the company hasn't provided any data showing that the tool itself is reliable.
In other Bedrock news this morning, AWS announced Model Distillation, a tool for transferring the capabilities of a large model (e.g. Llama 405B) to a small model (e.g. Llama 8B) that's cheaper and faster to run. An answer to Microsoft's distillation offering in Azure AI Foundry, Model Distillation provides a way, AWS says, to experiment with different models without breaking the bank.
“After the customer provides sample prompts, Amazon Bedrock does all the work of generating responses and refining the smaller model,” AWS explained in a blog post, “and can even create additional sample data if needed to complete the distillation process.”
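The loop AWS is describing is the classic teacher–student distillation recipe: a large model labels the prompts (augmenting them if needed), and those prompt–response pairs become the small model's fine-tuning set. A minimal sketch, with stand-in functions in place of any real models or Bedrock calls:

```python
# Minimal sketch of the distillation data pipeline AWS describes. The
# "models" here are stand-in functions, not Bedrock or any real LLM.

def teacher_model(prompt: str) -> str:
    # Stand-in for a large teacher model such as Llama 405B.
    return f"answer({prompt})"

def augment(prompts: list[str]) -> list[str]:
    # Stand-in for generating additional sample data when the
    # customer-provided prompts aren't enough.
    return prompts + [p + " (rephrased)" for p in prompts]

def build_distillation_set(sample_prompts: list[str]) -> list[tuple[str, str]]:
    """Teacher responses paired with prompts form the student's training data."""
    prompts = augment(sample_prompts)
    return [(p, teacher_model(p)) for p in prompts]

training_set = build_distillation_set(["summarize Q3 report"])
print(len(training_set))
```

The small model is then fine-tuned on `training_set` – that step is omitted here, since it's exactly the part Bedrock handles for you.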
But there are a couple of caveats.
Model Distillation currently only works with Anthropic and Meta models hosted on Bedrock. Customers must select a large and a small model from the same model family – the models can't come from different providers. And distilled models will lose some accuracy – “less than 2%,” AWS claims.
If none of that puts you off, Model Distillation is now available in preview, along with Automated Reasoning checks.
Also available in preview is Multi-Agent Collaboration, a new Bedrock feature that lets customers assign AI to subtasks within a larger project. Part of Bedrock Agents, AWS's contribution to the AI agent craze, Multi-Agent Collaboration provides tools for building and orchestrating AI for things like reviewing financial records and assessing global trends.
Customers can also designate a “supervisor agent” that automatically breaks down and routes tasks to the AIs. The supervisor can “give specific agents access to the information they need to complete their work,” AWS says, and “determine which actions can be processed in parallel and which require details from other tasks before [an] agent can move forward.”
“Once all of the specialized [AIs] have completed their inputs, the supervisor agent can pull the information together [and] synthesize the results,” AWS wrote in the post.
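Stripped of the branding, the supervisor pattern is a router plus a merge step. A hypothetical sketch – the agent names, tasks, and plumbing below are invented for illustration and have nothing to do with the actual Bedrock Agents API:

```python
# Hypothetical sketch of the supervisor-agent pattern AWS describes: route
# each subtask to a specialist "agent", then synthesize their outputs.
# Agents here are plain functions; real ones would be LLM calls.

AGENTS = {
    "finance": lambda task: f"finance-review of {task}",
    "trends": lambda task: f"trend-assessment of {task}",
}

def supervisor(tasks: dict[str, str]) -> str:
    """Dispatch each subtask to its specialist agent and merge the results."""
    results = [AGENTS[name](task) for name, task in tasks.items()]
    return " | ".join(results)

report = supervisor({
    "finance": "2024 records",
    "trends": "global markets",
})
print(report)
```

In this toy version the subtasks are independent, so they could run in parallel – deciding which ones can't, per the quote above, is the supervisor's harder job.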
Sounds fancy. But as with all of these features, we'll have to see how well it works when used in the real world.