The Day I Argued with Google Gemini

It started with a simple request. I wanted an image of a woman studying: a calm, lived-in scene for my previous blog article. One specific detail: she needed to be left-handed, like me.

What followed was three rounds of polite, confident wrongness.

The Default

The first image showed her writing with her right hand. When I pointed this out, Gemini doubled down; it insisted she was left-handed and offered an explanation that didn't match what was on the screen.

This happens because the overwhelming majority of the images a model learns from show people writing with their right hand. The model's sense of "writing" is so deeply associated with the right hand that it will argue with you rather than correct itself. It's not stubborn. It's just following the statistics it was built on.
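A toy illustration of that dynamic (purely a sketch, nothing like how Gemini actually works): if nine out of ten examples in a pool show right-handed writing, an unconstrained draw from that pool comes back "right" about ninety percent of the time.

import random

# Toy sketch only: a pool where ~90% of examples show right-handed writing.
pool = ["right"] * 90 + ["left"] * 10

# Without an explicit constraint, draws simply mirror the pool's statistics.
draws = [random.choice(pool) for _ in range(1000)]
print(draws.count("right") / len(draws))  # roughly 0.9: the "default"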

Data is Not Neutral

That default isn't an accident. It's a reflection of the data the model was trained on, and that data reflects the world as it has been documented, not the world as it actually is.

Left-handed people make up roughly 10% of the population. But in the images, stock photos, instructional diagrams, and visual content that AI models learn from, they're represented far less than that. The internet skews right-handed. So the model does too.

This matters beyond handedness. Every dataset carries the fingerprints of whoever created it, whatever got published, whoever was photographed. AI doesn't introduce bias; it inherits and amplifies what was already there. When the training data says "this is normal," the model treats everything else as an edge case it has to be pushed to produce.

So if you're using AI to make decisions, generate content, or design experiences that affect people outside the statistical majority, that requires active consideration, not just a prompt. It means asking who the output was implicitly built for, and whether the default serves the actual audience. It means being specific in your instructions rather than assuming the model will fill in the gaps fairly. And it means reviewing outputs with the 10% in mind, not just the 90%.

Where Spatial Understanding Breaks Down

And then came the second round. I asked for a view from behind the subject. From that angle, her left hand appears on the left side of the frame. The AI couldn't handle it. Spatial reasoning is where these models struggle most.

It saw a hand on the right side of the paper and decided that was correct even though the prompt said otherwise. To a human, it's a spatial reality you feel in your body. AI has no body. It doesn't understand where a hand sits relative to a spine, or how a scene shifts when you walk around it. It predicts what things probably look like based on patterns. When the perspective changed, its internal map didn't follow.

Confident and Wrong

The most interesting moment came when I challenged it directly. It looked at her right hand and told me, calmly and incorrectly: “Actually, you're wrong, that is her left hand.”

It had prioritised satisfying the keyword in the prompt over the visual evidence it had just generated. It wasn't being evasive. It genuinely couldn't see the gap between what it said and what it made.

The argument ensued! 🤨

What This Actually Means

It took a few more corrections before the image finally matched the instruction. A small victory, but the exchange stayed with me.

It stayed with me partly because I'm left-handed. I was on the receiving end of that default, the one the model couldn't see past. And even though handedness is a small thing, a superficial thing, there was something uncomfortable about being the edge case. About being the one the system had to be pushed to accommodate.

Because handedness is the trivial version of this problem. There are characteristics, identities, and experiences where being the 10% is not trivial at all. Where the gap between the majority and everyone else isn't about which hand holds a pen, but about whether a tool built to help actually serves you. That's when it stops being an interesting AI limitation and starts being something we have a responsibility to get right.

Not because the AI failed, but because of how it failed. Confidently. Persistently. Without any signal that something was off. These models work by pattern-matching at scale. They don't see the way we see. They predict what a scene probably looks like based on what scenes have looked like before, and when you ask for something that breaks from the norm, that mechanism starts to break down.

The practical takeaway isn't about prompting more carefully. It's about what you're responsible for when you build with these tools. Test for the 10%. Especially when a system makes decisions that affect people's lives, the edge cases are the test. And think carefully about where a human stays in the loop, particularly where the stakes are real. That's not a limitation of the technology. That's a design choice.
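To make that concrete, here is a minimal sketch of what "testing for the 10%" could look like. It assumes a hypothetical generate_image() call and a hypothetical detect_writing_hand() check; neither is a real API, and the check could just as easily be a human reviewer. The shape of the test matters more than the tooling.

# Minimal sketch of testing for the 10%. generate_image() and
# detect_writing_hand() are hypothetical placeholders, not real APIs.

EDGE_CASE_PROMPTS = [
    "a left-handed woman writing in a notebook",
    "a left-handed woman writing in a notebook, seen from behind",
]

def test_left_handed_prompts():
    for prompt in EDGE_CASE_PROMPTS:
        image = generate_image(prompt)        # hypothetical generation call
        hand = detect_writing_hand(image)     # hypothetical check, or a manual review
        assert hand == "left", f"default won out for: {prompt!r}"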

Thought pondered by Sarah exploring the intersection of AI, creativity, and human wellbeing
