Two AI Heads Are Far Better Than One

Arnie Benn
May 23
Updated: Jun 11

There's a simple way of working with AI that has consistently given me better results, and I want to share it, along with a short description and demonstration.

Pick two AI models. I use Claude and ChatGPT, though the order doesn't matter. Then put them in conversation with each other, mediated by you.

Here's how it works. (I include a written-out example at the bottom of the page.)

1. Send your first prompt to AI #1. Give it the appropriate detail and background, and attach any useful files.

2. Copy the entire exchange into AI #2: both your prompt and AI #1's response. Include the same background and the same attachments. Then say: "Claude said this. What do you think?"

3. Take AI #2's response back to AI #1: "ChatGPT said: [paste]."

4. Take that reply back to AI #2: "Claude said: [paste]."

5. Keep bouncing back and forth. Interject your own directions, corrections, or new questions whenever the conversation needs steering.

(Steering matters more in some contexts than in others, for example in scientific research or law. This is why having AI agents do this among themselves without your input and supervision, which is possible now, might lead them off in another direction or into error.)
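
For readers who like to see a procedure written out, here is a minimal Python sketch of the five steps above. It is only an illustration, not a tool I use: the `relay` function, the `Model` and `Steer` types, and the idea of passing each AI in as a plain function are all assumptions made for the example, and no real vendor API is called.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical types: a "model" is any function that takes a prompt string and
# returns a reply string. In practice this is you, copying and pasting.
Model = Callable[[str], str]
# The human in the middle: sees who spoke and what they said, and may add a note.
Steer = Callable[[str, str], Optional[str]]


def relay(first_prompt: str, ai_1: Tuple[str, Model], ai_2: Tuple[str, Model],
          rounds: int, steer: Steer) -> List[Tuple[str, str]]:
    """Bounce replies between two models, pausing for a human note each turn."""
    (name_1, ask_1), (name_2, ask_2) = ai_1, ai_2

    # Step 1: send the first prompt to AI #1.
    transcript = [(name_1, ask_1(first_prompt))]

    # Step 2: AI #2 receives the whole exchange, prompt included.
    message = (f"My prompt: {first_prompt}\n\n"
               f"{name_1} said this: {transcript[0][1]}\n\n"
               "What do you think?")

    # Steps 3 to 5: alternate, starting with AI #2.
    speakers = [(name_2, ask_2), (name_1, ask_1)]
    for turn in range(rounds):
        name, ask = speakers[turn % 2]
        reply = ask(message)
        transcript.append((name, reply))
        note = steer(name, reply)  # your directions, corrections, or questions
        message = f"{name} said: {reply}"
        if note:
            message += f"\n\nMy note: {note}"
    return transcript
```

The `steer` callback is the important part of the sketch: every hand-off passes through it, so the exchange never runs without the human reading what was said.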

One important refinement for taking this method to the next level: don't just ask the second model to react or improve. Ask it to find weaknesses. What did the first model miss? Where did it overclaim? What assumptions did it make? What would a skeptical reader object to? (I did not do this in developing this blog post, but I do it when I research.) Without this prompt, the second model may default to polite synthesis, and you may lose most of the value.
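
If you want to reuse the same critique request each time, it can be kept as a small template. The sketch below is one possible way to do that in Python. The function name `critique_prompt` and the exact wording of the framing sentence are my own assumptions for illustration; the four questions are the ones listed above.

```python
# The four weakness-finding questions from the paragraph above.
CRITIQUE_QUESTIONS = [
    "What did the first model miss?",
    "Where did it overclaim?",
    "What assumptions did it make?",
    "What would a skeptical reader object to?",
]


def critique_prompt(other_name: str, exchange: str) -> str:
    """Wrap a pasted exchange in a request to find weaknesses, not to polish."""
    questions = "\n".join(f"- {q}" for q in CRITIQUE_QUESTIONS)
    return (f"{other_name} said this:\n\n{exchange}\n\n"
            "Don't just react or improve. Find the weaknesses:\n"
            f"{questions}")


# Example: build the message to paste into the second model.
print(critique_prompt("Claude", "[paste the full exchange here]"))
```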

This matters because when two models agree, that isn't proof. They may share the same blind spots, drawing from overlapping training data and similar conventional patterns. The value of the method isn't that two models magically produce truth — it's that they create productive friction, while you remain the judge of what survives.

Different models really do have different strengths. In the kinds of work I do, ChatGPT often brings more rigor to science and math, while Claude often helps with prose, organization, and certain forms of reasoning. Each catches things the other may miss.

A couple of practical notes:

AI gives mediocre results when you give it mediocre prompts. The same models that produce shallow answers to vague questions will produce genuinely sophisticated work when you give them clear, specific, substantive direction. Be detailed. Share your own thinking. Tell the model where you're trying to go and what you already know.

Also: be thoughtful about what you paste between platforms. The method involves copying full exchanges and attachments, so don't move confidential or sensitive material across systems whose data policies you haven't checked.

The deeper point is this. AI is rarely at its best when asked to generate creativity and novelty in a vacuum. What it's extraordinarily good at is taking your creativity and amplifying it, because it can survey, connect, and recombine a vast surrounding landscape faster than any single person could. Your job is to bring the original thinking. The models' job is to extend it, challenge it, and connect it to everything they know.

This post itself was written using the method I'm describing. One model drafted it, the other critiqued it, the first responded, and I decided what belonged (and edited the final product to more closely resemble my own writing style).

That's the point: the models are useful, but the human judgment in the middle is what turns the exchange into better work.

Going To The Next Level:

An alternative approach is to get two responses in a row from one AI before bouncing back to the other. For example, if ChatGPT gives a detailed suggestion about edits, I might immediately follow up by saying: "Please incorporate your suggestions and do a version that you would approve, and then I will hand it back to the other AI." (This is also a good opportunity for you to add some "steering" feedback of your own.)

I then copy both responses and paste them back to Claude with a "ChatGPT said this."
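
As a rough sketch of this doubling-up step, the Python below asks the same AI twice and bundles both replies for the hand-off. Everything named here (`double_up`, `FOLLOW_UP`, the `ask` function) is hypothetical. It also assumes `ask` talks to a single ongoing chat that remembers the earlier turn, since the follow-up prompt only makes sense in that context.

```python
from typing import Callable

# The follow-up prompt quoted above.
FOLLOW_UP = ("Please incorporate your suggestions and do a version that you "
             "would approve, and then I will hand it back to the other AI.")


def double_up(name: str, ask: Callable[[str], str], incoming: str,
              my_steering: str = "") -> str:
    """Get two replies in a row from one AI, then bundle them for the other."""
    # First reply: the AI's suggestions on what the other AI said.
    suggestions = ask(incoming)

    # Second reply: its own revised version, plus any steering from you.
    follow_up = FOLLOW_UP if not my_steering else f"{FOLLOW_UP}\n\n{my_steering}"
    revised = ask(follow_up)

    # Both responses go back to the other AI together.
    return f"{name} said this:\n\n{suggestions}\n\n{revised}"
```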

A practical reason to do this might be that one AI burns through usage tokens faster than the other, or perhaps you pay only $20/month for one subscription but $100/month for the other. You therefore get more bang for your buck by doubling up on the AI that is cheaper (or that gives more usage).

Another pragmatic reason might be that one AI handles the material better than the other, in which case that AI should be the one to double up and develop that detail or aspect.

To see how I used this method to produce this blog post, from short but specific notes about it, see this attached PDF of the exchange:
