Kolena introduces platform for testing AI models and fine-tuned variants

March 14, 2024


For companies that need to use AI models in their operations, whether for employees or customers, perhaps the most critical question is not which model to use or what to use it for, but whether the chosen model can be used safely.

How much backend testing is required? What types of tests should be done? After all, most companies probably want to avoid the kind of embarrassing (yet humorous) mishaps we've seen at some car dealerships that used ChatGPT for customer support, only to find that users were tricking the chatbot into agreeing to sell a car for $1.

Knowing how to test models, and particularly fine-tuned versions of AI models, could mean the difference between a successful deployment and one that fails out of the gate and costs the company its reputation and money. Kolena, a three-year-old San Francisco-based startup co-founded by a former Amazon senior engineering manager, today announced the wide release of its AI Quality Platform, a web application designed to “enable rapid, accurate testing and validation of AI systems.”

The platform covers “data quality, model testing and A/B testing, in addition to monitoring for data drift and model degradation over time.” It also offers debugging.

Screenshot of the Kolena debugging view. Photo credit: Kolena

“We decided to solve this problem to drive AI adoption in enterprises,” said Mohamed Elgendy, co-founder and CEO of Kolena, in an exclusive video chat interview with VentureBeat.

Elgendy gained first-hand insight into the problems companies face when testing and deploying AI, having previously served as VP of engineering for the AI platform at Japanese e-commerce giant Rakuten, as head of engineering at Synapse, a maker of machine learning-driven X-ray threat detection, and as a senior engineering manager at Amazon.

How Kolena's AI Quality Platform works

Kolena's solution is designed to help software developers and IT staff build secure, reliable and fair AI systems for real-world use cases.

By enabling the rapid development of detailed test cases from datasets, it makes it easier to rigorously test AI/ML models in the scenarios they will face in the real world, going beyond aggregate statistical metrics that can obscure a model's performance on critical tasks.
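To illustrate why aggregate metrics can hide failures on critical tasks, here is a minimal Python sketch that compares overall accuracy with accuracy per scenario. It is not Kolena's API: the function names, the `scenario` field and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def overall_accuracy(records):
    """Fraction of records where the prediction matches the label."""
    return sum(r["prediction"] == r["label"] for r in records) / len(records)


def accuracy_by_slice(records, slice_key):
    """Accuracy computed separately for each value of `slice_key`."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in records:
        group = r[slice_key]
        totals[group] += 1
        correct[group] += int(r["prediction"] == r["label"])
    return {group: correct[group] / totals[group] for group in totals}


# Hypothetical evaluation data: 90 daylight samples, 10 night samples.
records = (
    [{"scenario": "daylight", "label": 1, "prediction": 1}] * 90
    + [{"scenario": "night", "label": 1, "prediction": 1}] * 2
    + [{"scenario": "night", "label": 1, "prediction": 0}] * 8
)

aggregate = overall_accuracy(records)
per_scenario = accuracy_by_slice(records, "scenario")
print(f"aggregate: {aggregate:.2f}")
for scenario, acc in sorted(per_scenario.items()):
    print(f"{scenario}: {acc:.2f}")
```

In this made-up data the aggregate score looks healthy, while the night scenario, which makes up only a tenth of the samples, fails most of the time. A test case per scenario surfaces that gap.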

Each Kolena customer connects the model they want to use to Kolena's API and provides their own dataset for the AI, along with a set of “functional requirements” for how the model should behave when deployed, whether it handles text, images, code, audio or other content.

Screenshot of Kolena's quality standards view. Photo credit: Kolena

Additionally, each customer can choose to measure characteristics such as bias and diversity across age, race and ethnicity, along with dozens of other metrics. Kolena will run tests on the model, simulating hundreds or thousands of interactions to see whether the model produces undesirable results and, if so, how often and under what circumstances or conditions.

Models are also retested after they have been updated, trained, retrained, fine-tuned or otherwise modified by the vendor or the customer, as well as while they are in use and deployed.

“It runs tests and tells you exactly where your model has degraded,” Elgendy said. “Kolena takes the guesswork out of the equation and turns it into a true engineering discipline like software.”
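The idea of pinpointing where a model has degraded after an update can be sketched as a comparison of per-test-case scores between two model versions. This is an illustrative Python example only, not how Kolena implements it: `find_regressions`, the tolerance value and the scores are assumptions made up for the sketch.

```python
def find_regressions(baseline, candidate, tolerance=0.01):
    """Return test cases where the candidate scores worse than the baseline.

    `baseline` and `candidate` map test-case names to a score where higher
    is better. A drop larger than `tolerance` counts as a regression.
    Test cases missing from `candidate` are skipped.
    """
    regressions = {}
    for name, old_score in baseline.items():
        new_score = candidate.get(name)
        if new_score is None:
            continue
        if old_score - new_score > tolerance:
            regressions[name] = (old_score, new_score)
    return regressions


# Hypothetical per-test-case scores before and after retraining.
baseline_scores = {"daylight": 0.98, "night": 0.81, "rain": 0.77}
candidate_scores = {"daylight": 0.985, "night": 0.70, "rain": 0.765}

regressed = find_regressions(baseline_scores, candidate_scores)
for name, (old, new) in regressed.items():
    print(f"{name}: {old:.3f} -> {new:.3f}")
```

With these invented numbers only the night test case is flagged: the small dip on rain stays within the tolerance, and daylight improved.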

The ability to test AI systems is useful not only for enterprises, but also for the companies that provide AI models. Elgendy noted that Google's Gemini, which recently faced controversy for producing racially confused and inaccurate images, might have benefited from testing on his company's AI Quality Platform before launch.

Two years of closed beta testing with Fortune 500 companies and startups

True to its ambitions, Kolena isn't releasing its AI Quality Platform without extensive testing of its own on how well it performs when testing other AI models.

The company has been offering the platform to its customers in a closed beta for the last 24 months and evolving it based on their use cases, needs and feedback.

“We intentionally worked with a select group of customers who helped us define the list of unknowns and unknown-unknowns,” Elgendy said.

These customers include startups, Fortune 500 companies, government agencies and AI standards institutes, Elgendy explained.

Collectively, this group of closed beta customers has already run “tens of thousands” of tests on AI models using Kolena's platform.

Going forward, Elgendy said Kolena is looking for customers in three categories: 1. “builders” of AI foundation models, 2. tech buyers and 3. non-tech buyers. Elgendy said one company Kolena worked with provided a large language model (LLM) solution that could connect to fast food drive-throughs and take orders. Another target market: makers of autonomous vehicles.

Screenshot of autonomous vehicle sensor data in Kolena's AI Quality Platform. Photo credit: Kolena.

Kolena's AI Quality Platform pricing is based on a software-as-a-service (SaaS) model with three tiers of escalating pricing designed to track the growth of a business's AI use, from data quality testing through model training to deployment.
