
Google Explains What It Got Wrong With Gemini Image Generation Tool

Amidst mounting criticism of the Gemini conversational app's image generation feature, Google responded publicly late last week. On 23 February 2024, Prabhakar Raghavan, Google’s Senior Vice President for Knowledge & Information, took to the company's official blog to address concerns and explain the decision to temporarily suspend the feature. 

Raghavan candidly acknowledged the feature’s shortcomings, expressing regret. “It’s clear that this feature missed the mark,” he wrote.

Raghavan elaborated on the issues faced, highlighting two key shortcomings in the Gemini app’s image generation functionality. “These two things led the model to overcompensate in some cases, and be over-conservative in others, leading to images that were embarrassing and wrong,” he noted.

The Gemini conversational app, formerly known as Bard, introduced the image generation feature three weeks prior. However, the feature quickly garnered criticism for generating inaccurate and offensive images of people. Raghavan acknowledged user feedback, expressing gratitude while apologising for the feature's shortcomings.

“So what went wrong? In short, two things. First, our tuning to ensure that Gemini showed a range of people failed to account for cases that should clearly not show a range. And second, over time, the model became way more cautious than we intended and refused to answer certain prompts entirely — wrongly interpreting some very anodyne prompts as sensitive.” - Prabhakar Raghavan, Google’s Senior Vice President for Knowledge & Information

Operating independently of Google's broader product suite, the Gemini app utilised an AI model named Imagen 2 for its image generation capabilities. Raghavan explained, “The Gemini conversational app is a specific product that is separate from Search, our underlying AI models, and our other products.”

Despite efforts to ensure inclusivity and prevent the creation of harmful content, the Gemini app faced unforeseen challenges. Raghavan cited the unintentional overcautiousness of the AI model as a contributing factor to the issue. Consequently, Google made the decision to temporarily pause the image generation of people within the Gemini app.

What Happened: Shortly after the launch of Gemini's image generation tool, posts began to surface online showing images that raised concerns about historical accuracy. Examples included a Black woman depicted among US senators in the 1800s and a Black man wearing a German World War II-era military uniform, as reported by The Verge.

In the blog post, Raghavan stressed Google’s commitment to rectifying the issue, pledging to significantly improve the feature through extensive testing before reintroducing it. He also cautioned users about the reliability of AI-generated content, particularly in sensitive or rapidly evolving contexts.

“I can’t promise that Gemini won’t occasionally generate embarrassing, inaccurate or offensive results — but I can promise that we will continue to take action whenever we identify an issue. AI is an emerging technology which is helpful in so many ways, with huge potential, and we’re doing our best to roll it out safely and responsibly,” wrote Raghavan.

Meanwhile, Tesla CEO and X owner Elon Musk took to social media on Sunday to weigh in on the issue. Musk disclosed that he had spoken with a Google executive, who indicated that the Gemini issue might take “a few months to fix.”
