Image by Solen Feyissa, from Unsplash

Contractors Warn New Google Guidelines Could Affect Gemini’s Accuracy On Sensitive Topics

Last Updated: Dec 20, 2024

Written by Kiara Fabbri Multimedia Journalist
Fact-Checked by Justyn Newman Lead Cybersecurity Editor

A recent shift in internal guidelines at Google has raised concerns over the accuracy of its Gemini AI, particularly when it comes to handling sensitive or highly specialized topics.

In a Rush? Here are the Quick Facts!

Google contractors can no longer skip prompts outside their expertise for Gemini evaluation.
Contractors now rate AI responses they don’t fully understand, noting lack of expertise.
Contractors previously skipped prompts on complex topics like cardiology or rare diseases.

Contractors working on the Gemini project, who are tasked with evaluating the accuracy of AI-generated responses, can no longer skip prompts outside their domain expertise. This change, first reported by TechCrunch, could affect the reliability of information provided by the AI on topics such as healthcare, where precise knowledge is crucial.

TechCrunch notes that contractors at GlobalLogic, an outsourcing firm owned by Hitachi, were previously tasked with evaluating AI responses based on factors like “truthfulness” and were allowed to bypass prompts outside their expertise.

For example, if asked to evaluate a technical question about cardiology, a contractor with no scientific background could skip it.

However, under the new guidelines, contractors are now instructed to evaluate responses to all prompts, including those requiring specialized knowledge, and note any areas where they lack expertise, as reported by TechCrunch.

The new rule has led to concerns about the quality of ratings provided for complex topics. Contractors, often without the necessary background, are now tasked with judging AI responses on issues such as rare diseases or advanced mathematics.

In internal correspondence reported by TechCrunch, one contractor expressed frustration and questioned the logic behind eliminating the skip option: “I thought the point of skipping was to increase accuracy by giving it to someone better?”

TechCrunch reports that the updated guidelines allow contractors to skip prompts in only two cases: if the prompt or response is incomplete, or if it contains harmful content that requires special consent for evaluation.

This restriction has raised alarms among those working on Gemini, who worry that the AI could produce inaccurate or misleading information in highly sensitive areas.

TechCrunch reports that Google has not provided a detailed response to the concerns raised by contractors.

However, a spokesperson emphasized to TechCrunch that the company is “constantly working to improve factual accuracy in Gemini.” They further clarified that while raters provide valuable feedback across multiple factors, their ratings do not directly impact the algorithms but are used to gauge overall system performance.

Mashable noted that the report questions the rigor and standards Google claims to apply when testing Gemini for accuracy.

In the “Building responsibly” section of the Gemini 2.0 announcement, Google stated that it is “working with trusted testers and external experts and performing extensive risk assessments and safety and assurance evaluations.”

While there is a reasonable emphasis on evaluating responses for sensitive and harmful content, less attention seems to be given to responses that, while not harmful, are simply inaccurate, as noted by Mashable.
