The AI OS · Letter #28
March 29, 2025

I Compared ChatGPT, Gemini, and Grok—Here’s What Each One Actually Does Best

Image generation isn’t art anymore—it’s workflow. I ran side-by-side tests across real use cases. Here’s where each model delivers (and where it breaks).

Once upon a time, we spent 6 hours in Photoshop just to make a cat look surprised.

That was 2024.

Now?

I can just tell an AI, “Give this cat the face of someone who just saw their code deleted,” and get three memes in 15 seconds.

That’s INCREDIBLE.

It’s a shift in how we create, think, and build.

I've spent a big chunk of this week playing with these AI generation tools with my kids.

I've turned many of our warm family memories into images with that soft Ghibli glow.

I'm thinking of turning this into a short Ghibli-style film for my kids. They loved this!

Anyway...

Today, I’ll walk you through a breakdown of Google’s Gemini, xAI’s Grok, and OpenAI’s ChatGPT-4o — three of the most powerful tools in image generation and editing today.

But I’m not just comparing features.

I’ll show you what matters:

How these tools fit into your workflow.

Which one is right for what kind of creator.

And where the opportunity is hiding.

AI Image Editing Showdown: The 3 Heavyweights

→ Google Gemini (AI Studio)

→ xAI’s Grok (Aurora model)

→ ChatGPT-4o (OpenAI’s new all-in-one model)

These aren’t just prompt-to-image tools.

They’re collaborative visual assistants, and that’s the big difference compared with Midjourney, for example.

Midjourney is a very powerful image generation (prompt-to-image) tool. However, control over generation is very limited, and edits are very complex (check out my old deep dive on Midjourney here).

Are they all the same?

No.

For me, each image editing tool plays to a different kind of user.

Let’s break it down.

Which One’s for You?

Tool | Best For | Strengths | Limitations
Gemini | Developers + Builders | API-first, multimodal, super fast | Experimental UI, dev environment
Grok | Social-first Creators + Memers | Dead simple UX, photoreal edits | One-shot prompts, limited control
ChatGPT-4o | All-purpose Creators & Teams | Natural convo flow, precision editing | API not open (yet), usage limits

What Changed and Why You Should Care?

The major shift is related to two main breakthroughs that set this apart from the "usual" image generation AI tools:

That's huge.

You don't need layers, masks, or design tools.

You need language.

You need imagination.

And a bit of promptcraft.

Let’s Meet the Contenders

Google Gemini (1.5 Flash & 2.0 Flash)

A tool built for builders.

Runs in Google AI Studio — a playground for devs. You chat with Gemini, give it images, audio, text — it understands all of it (check out my previous letter).

Gemini 2.0 Flash now creates images natively.

And it does so with multi-turn memory.

Example:

“Generate a photo of a horse.”

“Now make it black and white, in a field of yellow flowers.”

Gemini remembers, and edits.

But: it’s currently locked in a developer console. Not quite a plug-and-play app. And the "edit memory" is very limited so far.

If you’re building a product or need AI that works across media types, Gemini is your stack.

xAI Grok + Aurora

This one’s for the chaos agents.

Grok lives inside X (Twitter), but it’s also accessible outside of X. You hit “Edit Image,” upload something, and type what you want. Done.

Simple. Fast. Surprisingly photoreal.

“Generate a sunset image”

“Add a horse and make this sunset feel like a happy ending.”

Result? Warm tones, glowing light.

It feels like Instagram filters on steroids — no technical knowledge needed.

Drawback: no step-by-step edits. No (or poor) memory so far, though things are moving fast. No way to select parts. If it messes up, you try again.

But if you’re creating viral memes or visual riffs?

Grok is a social weapon.

OpenAI ChatGPT-4o

This is Photoshop via chat — not 100% true, but seriously close.

Upload an image.

Generate one from a prompt.

Click to edit. Draw a box. Describe your change.

It remembers everything and keeps refining. Honestly the outputs are incredibly accurate.

There are many examples being shared on the internet right now. It has made creating ads feel like play. I’ll start from this Apple ad.

“Replace the tagline with ‘AI for Everyone’”

“Now replace the logo with this one (attached). Use an ‘Inter’ font for the text. Remove the website ‘apple.com’ and replace the image with a very productive person using AI. Keep the same vibe and style.”

Hey, I just created a professional ad in under a minute!

You can go on...

“Make the logo bigger.”

“Add a blue outline.”

“Now place a cat mascot next to it.”

Done, done, done.

You can also just talk to it — no clicks needed.

Want to change a vibe or mood? Ask.

Want infographic-style text overlays? It nails that too.

Right now, it’s the most accessible, powerful and controllable option for creators who don’t code.

UX & Workflow: How It Feels To Use

Here’s how each one fits into daily creative work.

Google Gemini

Feels like talking to a smart assistant inside your app or project. Or you can use it in Google AI Studio, a testing platform for devs.

Works best when paired with:

It’s a builder’s Lego set. But not friendly for casuals yet.

Grok

You’re scrolling X. See a funny photo.

Tap “Edit Image.”

Type: “Put clown makeup on the person.”

Boom. It works (usually).

No sign-up, no software, no learning curve.

It’s great for:

But again — one-shot edits only. No memory, no selections.

ChatGPT-4o

Feels like brainstorming with a designer… who executes instantly.

(I used to pay $20-60/hour for my designers)

Upload a draft.

Say: “Change the background to dark blue.”

Then: “Add a spotlight effect.”

Then: “Make the text pop more.”

You keep iterating without restarting.

Great for:

Of the three, ChatGPT-4o is the only one with a real conversational memory for images and designs, and its capabilities are, to be very honest, astonishing.

It’s the most polished experience for creators who think in words and ideas, not pixels.

However, it's very slow so far...

Testing the Tools

I ran all three through a set of real-world tasks.

Test 1: Photorealism & Detail

Prompt A: “Busy urban market at sunset. Street vendors. Neon lights.”

Follow-up: “Add a neon sign that says ‘Open 24/7.’”

Prompt B: “Modern smartphone on a desk with reflections.”

Follow-up: “Emphasize metallic finish, add shadow.”

🧪 Results:

Gemini

Weird characters.

Maintains the highest fidelity between the original image and the follow-up edit in this complex setting, unlike the other tools, which introduce unintended changes during editing.

Weird reflection on the table.

Grok

Weird faces.

Not very realistic.

Not bad. The reflections look convincingly realistic (though the phone's positioning defies physics 😄)

The metallic reflections on the phone are convincingly realistic, but the request seems to have been interpreted too literally (“naïvely”).

ChatGPT-4o

The faces look strange, but the overall composition appears more polished.

Not bad overall. The general look is good, but the limits on fine detail show (Midjourney handles these better, for example).

Highly accurate.

💡 Overall Qualitative Comparison:

After 20 tests like this… this is how I would synthesize my observations:

Test 2: Creative Control

Prompt A: “Cartoon robot with a heart in a colorful world.”

Follow-up: “Change robot to blue/green. Add speech bubble: ‘Hello, Future!’”

Prompt B: “Minimalist mug on white background.”

Follow-up: “Replace design with black ‘M’. Add subtle shadow.”

🧪 Results:

Gemini

Grok

High realism!

ChatGPT-4o

💡 Overall Qualitative Comparison:

Recall the ad example above? When I tried using Grok, here's what I got: a complete mess.

Overall synthesis of my observations after several tests:

Test 3: Speed

Overall synthesis of my observations after several tests: ChatGPT is noticeably slower than the other tools, while Gemini Flash is remarkably fast.

Test 4: Marketing Content

Prompt A: “Header image for tech landing page. Modern, trustworthy. Tech graphics.”

Follow-up: “Add semi-transparent overlay for text.”

Prompt B: “Trendy coffee shop for local ad.”

Follow-up: “Add handwritten ‘Grand Opening’ banner.”

🧪 Results:

Gemini

I usually got awkward outputs for this one...

Grok

Grok just kept giving me images like this. Despite trying multiple prompts, I kept getting the same type of image and couldn't get the results I wanted. Try it and let me know.

Same for Prompt B…

ChatGPT-4o

Brilliant.

Wow. This used to cost a lot of money and time. The results you can achieve with a simple prompt are remarkable.

Overall synthesis of my observations after several tests:

What Should You Use?

Here’s a quick framework:

Need | Tool
API access & custom workflows | Google Gemini
Fast edits in social content | xAI Grok
Edgy, unrestricted content | Grok
Controlled multi-step visuals | ChatGPT-4o
Everything else (if you’re patient) | ChatGPT-4o
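
If you want to wire this framework into a script or an internal tool, here is a minimal sketch in Python. It is only a lookup over the needs listed above. The names `RECOMMENDATIONS`, `FALLBACK`, and `pick_tool` are hypothetical, made up for illustration, and it calls no vendor API.

```python
# Hypothetical lookup built from the "Need / Tool" framework above.
# Nothing here calls a real API; it only encodes the recommendations.
RECOMMENDATIONS = {
    "api access & custom workflows": "Google Gemini",
    "fast edits in social content": "xAI Grok",
    "edgy, unrestricted content": "Grok",
    "controlled multi-step visuals": "ChatGPT-4o",
}

# "Everything else (if you're patient)" falls back to ChatGPT-4o.
FALLBACK = "ChatGPT-4o"


def pick_tool(need: str) -> str:
    """Return the recommended tool for a need, ignoring case and outer spaces."""
    return RECOMMENDATIONS.get(need.strip().lower(), FALLBACK)


if __name__ == "__main__":
    print(pick_tool("API access & custom workflows"))
    print(pick_tool("Something not in the list"))
```

Anything that isn't in the list lands on the fallback, which mirrors the last row of the framework.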

Where This Is Going

This shift is bigger than tools.

It’s about how we create.

Here’s what I see happening:

1. Creation and editing merge.

No more generate → export → edit. You just keep evolving the image.

2. Design gets democratized.

Non-designers now make publish-worthy visuals. Power shifts to those with vision, not tools.

3. AI agents take over workflows.

You’ll say “Make a product campaign” and get images, copy, layout, maybe even video.

4. New tools, new businesses.

Smart builders will wrap these models into custom apps. Think niche image editors, auto-meme bots, visual storytellers.

5. Ethics and quality matter more.

Expect questions around deepfakes, bias, and copyright. You’ll need good judgment—and maybe a watermark.

Final Thought

We're living in an era where a single person, armed with AI tools like these, can build and run what used to require an entire company. The "one-person billion-dollar company" isn't science fiction anymore—it's becoming a real possibility.

Think about it: You can now generate professional visuals, write compelling copy, design interfaces, and automate workflows, all from your laptop. What once required a team of designers, copywriters, and developers can now be accomplished by one creative mind with the right AI tools.

Remember:

You’re not here to compete with AI.

You’re here to create with it.

Stay sharp,

Until the next one,

— Charafeddine

New letters now publish on charafeddine.co
