Generative AI may seem like magic, but behind the development of these systems are armies of employees at companies like Google, OpenAI, and others: so-called “prompt engineers” and analysts who rate the accuracy of chatbot outputs to improve their AI.
But a new internal guideline passed down from Google to contractors working on Gemini, seen by TechCrunch, has led to concerns that Gemini could be more prone to giving everyday people inaccurate information on highly sensitive topics like healthcare.
To improve Gemini, contractors working with GlobalLogic, an outsourcing firm owned by Hitachi, are routinely asked to rate AI-generated responses according to factors such as “truthfulness.”
These contractors were, until recently, able to “skip” certain prompts, and thus opt out of evaluating the AI-written responses to those prompts, if the prompt was well outside their expertise. For example, a contractor could skip a prompt that asked a niche question about cardiology because the contractor had no scientific background.
But last week, GlobalLogic announced a change from Google that no longer allows contractors to skip such prompts, regardless of their own expertise.
Internal correspondence seen by TechCrunch shows that the guidelines previously read: “If you do not have critical expertise (e.g., coding, math) to rate this prompt, please skip this task.”
But now the guidelines read: “You should not skip prompts that require specialized domain knowledge.” Instead, contractors are told to “rate the parts of the prompt you understand” and add a note that they do not have domain knowledge.
This has raised direct concerns about Gemini's accuracy on certain topics, as contractors are sometimes tasked with evaluating highly technical AI responses on subjects such as rare diseases, in which they have no background.
“I thought the point of skipping was to increase accuracy by giving it to someone better?” one contractor noted in internal correspondence seen by TechCrunch.
Contractors can now skip prompts in only two cases, according to the new guidelines: if they are “completely missing information,” such as the full prompt or response, or if the prompts contain harmful content that requires special consent forms to evaluate.
Google did not respond to TechCrunch's requests for comment by press time. After this story was published, Google, which did not dispute our reporting, told TechCrunch that the company is “constantly working to enhance the factual accuracy of Gemini.”
“Raters perform a wide range of tasks across many different Google products and platforms,” said Google spokeswoman Shira McNamara. “They not only review answers for content, but also provide valuable feedback on style, format, and other factors. The ratings they provide do not directly impact our algorithms, but taken as a whole they’re a useful data point to help us measure how well our systems are performing.”