A failed universal language explains why you keep picking the wrong AI output
The case for working with the differences between AI models instead of hunting for the single best output, told through two old stories about language.
I read an article in Harper's this month about the history of Esperanto, a language built with the hope of ending all language problems.
Ludwik Zamenhof built it in a Warsaw apartment in the 1880s and was convinced that collaboration and peace and progress would inevitably follow if everyone would speak the same tongue.
The end of miscommunication, essentially.

Here we are - it obviously didn't work. He, like many of us, was so certain that the different languages were the obstacle, that the friction between them was what kept humanity from building together.
The Tower of Babel ran on the same assumption: one language, one tower, one shared ambition, then the scattering and languages fragmenting. The standard reading treats diversity of tongues as punishment, the cost of reaching too high.
Before Babel, everyone thought in the same structures, the same assumptions about what really mattered. After Babel, languages diverged and, with them, everything else.
Different grammars encoded different logics, different cultures built entirely different ways of seeing, and the scattering ended up being the engine of human civilization.
Somewhere between that Harper's piece and re-reading the story of Babel, I was brought back to how we work with AI and how we treat different models as different languages.
When we stop picking the best output and start working with the differences between models instead, something shifts. Doing that has changed how I create and build almost everything.
Hi, I'm Mia. Welcome to ROBOTS ATE MY HOMEWORK. On this side of the world, we use AI with a brain and zero circus tricks.
Zamenhof thought one language would end all problems. Turns out the scattering was the point.
The Esperanto instinct and why it costs you your best ideas
We all have a go-to AI model, but sometimes, we try a second one to compare outputs. We read both outputs and pick whichever sounds more appealing, more like what we wanted, and then close the tab.
This is the Esperanto instinct, the assumption that the job is finding which model gets closest to the one correct version of our original ideas.
Languages don't really work that way, though. They don't just give you different words for the same thought; they give you different paths of least resistance for organizing thought. Some connections form easily, and others you have to push against the grammar to make.
Two people describing the same event in different languages will emphasize and speak about different details, foregrounding different elements of what happened. They CAN say the same thing, but the paths of least resistance sometimes pull them toward other versions of the story.
AI models work the same way.
I refuse to be the person who tells you Claude is "the careful one" and GPT is "the confident one" and Grok is whatever label the internet is using this week. I hate labeling models the same way I hate labeling people (it's reductive and it stops you from paying attention).
You've probably worked with enough models yourself to know what I mean. You know how one reaches for structure and another for narrative, or how one frames a brief around risk and another around opportunity.
Those are different "grammars" for thinking about the same problem: your problem. Every single time you pick the output that sounds most like what you already had in mind, you're choosing one grammar and discarding the rest.
The AI output versions you're discarding are creative material.
Each model that interpreted your input differently was showing you a version of your idea you hadn't considered, from an angle you wouldn't have found by iterating inside the model you already trust.
Those varied interpretations are new thinking, and working with them deliberately (instead of evaluating them competitively) changes what you can build today.
Thinking in more than one language at once
Case in point: I was working on a content piece on how people talk about AI in cultural commentary, analyzing different magazines and discourses. I ran my notes (2300 words of them) through three models: Claude Opus 4.8, GPT 5.5 and Qwen 3.7 Plus.
Claude assumed I was doing heavy media criticism and gave me a summary of talking points, like who was saying what, which publications were pushing which narrative. It had the shape and form of a PR report and was the most verbose and fluffy.
GPT assumed I was doing cultural analysis, so it made connections to broader patterns about how we talk about new, emerging technology and what anxieties show up when the dynamic changes.
Qwen assumed I was building a unique position and kept asking me what I was really arguing for (full disclosure, I use Qwen mainly through my Hermes agent, so it has the most context about my work, hence the endless questions about my real goals).
I wasn't sure myself either, to be honest. I was in "exploration" mode.
I've since run this test on quite a number of projects, and I realized that the models don't disagree on the facts, but on what the facts are about.
When you run the same idea through different AI models or custom setups, you'll sometimes get different interpretations of what your idea is about. That's because each LLM reads your input through its own patterns. The differences between those readings uncover certain angles (some more surprising than others) of your problem.
Today, "use multiple models" is the standard advice. It has been circulating for years, and it usually means "compare outputs and pick the best one".
Could we benefit from reframing this from picking the best one to working with all of them instead?
Yes, we can. Everything interesting about multiplicity happens when you start reading deeply into the outputs rather than just comparing them.
If you do this, you'll start noticing something in the differences between them.
Each LLM makes different assumptions about what your idea is about. One model might assume you're making a strategic argument, another that you're telling a really nice story, and a third that what you want is to build a workflow.
Your input said NOTHING about any of that. Each model filled in the blanks with its own "grammar", and the differences between them showed you the different shapes of your idea.
Here's how to run this yourself.
Quick exercise.
Pick a project you're working on right now, something you'd normally just run through your go-to model and iterate until it feels polished enough to ship and use.
Write a prompt that describes what you want and use the same input with every model, word for word. Don't adjust for each model's strengths, or what you think those strengths are. The point is that identical input leads to different interpretations.
Run it through three different models and read the outputs together.
Here's what to look for:
What assumption did each model make about what your idea is really about? Were any of those assumptions closer to what you meant than your own original framing?
Who was each model talking to? Did the audience differences surprise you?
What did one model foreground that the others treated as invisible? Was that emphasis useful?
Did any version show you a way into your own idea that you preferred over what you started with?
Just for once, don't pick the best output. Read what the differences between them tell you about the idea you're building.
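If you'd rather script the exercise than juggle browser tabs, here's a minimal Python sketch of it. Everything in it is an assumption for illustration: `ModelFn`, `run_same_prompt`, `reading_sheet` and the three stub models are hypothetical names, and the stubs are placeholders you would swap for whatever client each of your models actually uses. The one thing it enforces is the rule above: every model gets the identical prompt, and the outputs land on one page next to the reading questions.

```python
from typing import Callable, Dict, List

# Hypothetical type: a "model" is any function that takes a prompt and returns text.
# In practice you would wrap each provider's client in a function like this.
ModelFn = Callable[[str], str]

# The reading questions from this exercise, lightly condensed.
QUESTIONS: List[str] = [
    "What did each model assume the idea is really about?",
    "Who was each model talking to?",
    "What did one model foreground that the others treated as invisible?",
    "Did any version show a better way into the idea?",
]


def run_same_prompt(prompt: str, models: Dict[str, ModelFn]) -> Dict[str, str]:
    """Send the identical prompt, word for word, to every model."""
    return {name: ask(prompt) for name, ask in models.items()}


def reading_sheet(outputs: Dict[str, str], questions: List[str] = QUESTIONS) -> str:
    """Lay all outputs out together, then list the questions to read them against."""
    parts = [f"=== {name} ===\n{text.strip()}" for name, text in outputs.items()]
    parts.append("=== Read for ===\n" + "\n".join(f"- {q}" for q in questions))
    return "\n\n".join(parts)


# Placeholder models so the sketch runs offline; replace with real calls.
stub_models: Dict[str, ModelFn] = {
    "model_a": lambda p: f"[strategy reading] {p}",
    "model_b": lambda p: f"[narrative reading] {p}",
    "model_c": lambda p: f"[workflow reading] {p}",
}

outputs = run_same_prompt("Describe my project idea here.", stub_models)
sheet = reading_sheet(outputs)
print(sheet)
```

The sketch deliberately has no scoring or ranking step. It only puts the readings side by side, so the comparing is still yours to do.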
What to do with the differences:
Use them as decision points. If one model assumed you're writing for experts and another for beginners, that shows you haven't really decided about your audience. That's important information.
Use them as a diagnostic. If one model showed risk and another opportunity, then you were pretty ambiguous in your input. You weren't clear enough. That's when you know your thinking is still unresolved and you still have to decide what direction to go in.
Use them as a map of the problem space. Each model created a different map - strategy, narrative, framework, critique, whatever. The differences between these territories show you the full shape of what your idea is. You might've been working inside one territory this whole time, and now you can see the others and decide whether to stay or move.
Use them to find load-bearing questions. Sometimes the models will each grab different versions of what you're trying to do, and you'll realize you haven't really decided what you want to do yet. That's your discovery phase in plain sight, and it's time you figure out what you're asking.
Use them to write a better prompt. Once you see what each of the models assumed, you can go back and get sharper about what you really meant. Not "pick the best output" but "now I know what I was trying to say".
The FOUR WINDS agent in RobotsOS does exactly this: it pressure-tests your work from four different angles and shows you where they collide.
Full disclaimer: I'm not claiming, and will never agree with, the discourse that AI gives you the answer. It can just sometimes be pretty effective at showing you what the question was from the start.
The Tower of Babel was abandoned and languages split and from that scattering, humanity got more ways to think, more grammars for organizing reality, more ways to describe the same thing and mean something entirely different by it.
Our AI models are those languages, and they think differently from each other. Those differences are worth working with, rather than completely resolving.
Tomorrow, take something you're working on right now and run it through the above test.
Notice how each version assumes a different audience, a different priority, a different version of what your idea is. Read what the differences between them tell you about the thing youâre creating.
Learn to think in more than one language at a time.
What's one idea you're working on right now that would benefit from running it through different AI languages?
🤖 You just read ROBOTS ATE MY HOMEWORK.
If this made you want to try something new: subscribe for free and get the weekly essays, frameworks, and systems for using AI with taste.
If you're already here and want the full toolkit: PREMIUM ROBOTS gives you the working skills, agents, and playbooks behind pieces like this one.
If you just arrived: welcome. Start here, look around, and if something here makes you want to use AI more deliberately, not less, you're in the right place.
To the scattered tongues, still building,
Mia Kiraki,
Chief 🤖 at ROBOTS ATE MY HOMEWORK








I do this and it blows my mind. It's what gives me my best stuff, my deepest thinking.
I also do the same with PowerPoints… so many varieties gives me a plethora of options to get my point across.
I use three AIs daily… back and forth I go. Again, your posts are always on point. Happy to be a paid subscriber!
This was a great read Mia! I've been using different models and asking one to critique others answer... AI isn't a single thing. It's like talking to different people to get their take on the same question! each brings a different perspective, and sometimes the most useful insights come from comparing them.