The AI OS · Letter #25
March 6, 2025

I Tested Deep Research Tools—They’re not all made equal

I put ChatGPT, Grok, and Perplexity head-to-head. One outperformed the rest—here’s what I found and how you can use the Deep Research feature.

I used to get paid for deep research.

Now? AI does it in minutes.

No joke: if I'd had these tools while working on my PhD, I'd have saved half the time, drunk less coffee, and maybe even kept my muscle mass.

AI can now:

The game has changed.

If you haven't explored the latest Deep Research features, this breakdown is for you.

What is Deep Research (or Deep Search)?

Deep Research was launched by Google in December 2024, followed by OpenAI and Perplexity in February 2025. It's a new feature that generates output at the level of a research report.

Deep research is an agent that can do research work for you independently—you give it a prompt, and it will find, analyze, and synthesize hundreds of online sources to create a comprehensive report at the level of a research analyst.

Every output is fully documented, with clear citations and a summary of its thinking, making it easy to reference and verify the information. It is particularly effective at finding niche, non-intuitive information that would require browsing numerous websites.

This is very different from the “simple web search” functionality already available in AI tools, which just searched for an answer and stopped once “a few good enough” results were found.

Deep Research works by continuously refining its own output through multiple iterations and gathering as much information as possible to thoroughly explore a given topic.
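To make that loop concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not how any of these products is actually implemented: `fake_search` and `follow_ups` are hypothetical stand-ins for a real web search and for the model proposing new sub-questions, and the sources and snippets are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """Running state of one research session (hypothetical structure)."""
    question: str
    findings: dict[str, str] = field(default_factory=dict)  # source -> note
    open_questions: list[str] = field(default_factory=list)


def fake_search(query: str) -> dict[str, str]:
    """Stand-in for a real web search; returns {source: snippet}."""
    corpus = {
        "topic": {"site-a.example": "placeholder overview of the topic"},
        "topic cost": {"site-b.example": "placeholder note about cost"},
    }
    return corpus.get(query, {})


def follow_ups(query: str) -> list[str]:
    """Stand-in for the model proposing sub-questions after reading results."""
    plan = {"topic": ["topic cost"]}
    return plan.get(query, [])


def deep_research(question: str, max_rounds: int = 5) -> Report:
    """Search, collect findings, queue follow-up questions, and repeat."""
    report = Report(question=question, open_questions=[question])
    seen: set[str] = set()
    for _ in range(max_rounds):
        if not report.open_questions:
            break  # nothing left to explore
        query = report.open_questions.pop(0)
        if query in seen:
            continue  # skip questions that were already researched
        seen.add(query)
        report.findings.update(fake_search(query))
        report.open_questions.extend(
            q for q in follow_ups(query) if q not in seen
        )
    return report
```

Calling `deep_research("topic")` runs two rounds here: the first search queues a follow-up question, the second answers it, and the loop stops once no open questions remain. A simple web search would correspond to stopping after the first round.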

According to Perplexity's press release, it "attains high benchmarks on Humanity's Last Exam," which is the toughest benchmark for LLMs.

When AI outperforms world-class experts across all possible fields, we'll reach a "watershed moment" for humanity.

[Chart: Humanity's Last Exam benchmark scores]
Source: Perplexity AI

On the SimpleQA benchmark, which tests for factuality of generated output, Deep Research already outperforms all the other contenders:

[Chart: SimpleQA benchmark scores]
Source: Perplexity AI.

Deep Research works much like how humans brainstorm and draft before completing a final paper. (My prediction for the future? AIs that collaborate and peer-review each other's work, though for now they're limited to self-reflection.)

This letter isn't about delving into all the technical details of how Deep Research features work. You can test them for free on Perplexity or Grok.

I'd like to focus more on giving you insights on which Deep Research tool to use for what, their real worth, and how to use them.

So I ran an experiment. As in a previous newsletter, this isn't meant to be a rigorous empirical test but a qualitative evaluation that gives a broad idea of the deep research capabilities of three providers: ChatGPT, Grok, and Perplexity.

We're not covering Google's Deep Research, which is also a very powerful contender.

Quick heads up: Nobody paid me to write this. I'm just sharing what worked (and what didn't).

Let’s dive in!

How Do These AI Tools Stack Up?

Not all Deep Research tools are built the same. Some are super detailed; others are real-time fact checkers. Here’s what makes each unique:

1. OpenAI’s ChatGPT

→ Best for: Interactive deep dives, structured analysis, and multi-format research.

2. Grok-3 (by xAI)

→ Best for: Keeping up with fast-moving topics, real-time trends, and uncensored insights.

3. Perplexity AI

→ Best for: Quick, citation-backed research with real-time data.

Putting Them to the Test: Real-World Comparison

I ran a head-to-head experiment with all three tools.

The Challenge: Generate a Deep Research Report on DeepSeek

Prompt:

"Create a short article about 'DeepSeek Demystified: How This Open-Source Chatbot Outpaced Industry Giants.' The article should be deep and actionable. Add code snippets and tutorials if necessary."

I put each AI through its paces and got some seriously detailed reports back. I ran multiple searches (five per tool, all with the same prompt), using the “starter” versions of Perplexity, Grok-3, and OpenAI.

Instead of dumping a wall of text on you, let me break down what I found using four simple criteria:

1. Depth & Detail

The "how deep does it go?" test. Does it give you the full story with all the important context, tech details, and pros/cons? Or just skim the surface?

2. Actionability & Practical Guidance

The "can I actually use this?" test. Does it give you real steps to follow? Better yet, any code snippets or tutorials you can try right away, leading to USEFUL results?

3. Organization & Clarity

This is all about how easy it is to follow along. Are things well-organized with clear headings? Can you find what you need without getting lost?

4. Accuracy & Verifiability

The truth test. Is the information correct (like when it talks about tech specs or costs)? And importantly, does it back up its claims with solid references?
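If you want to repeat this kind of comparison yourself, one simple approach is to score every run on the four criteria and average across runs. The sketch below assumes a 1 to 5 scale and equal weights, which are assumptions for illustration rather than the method used in this letter, and the numbers in the usage example are placeholders, not results from this test.

```python
from statistics import mean

# The four criteria described above, as short keys.
CRITERIA = ("depth", "actionability", "clarity", "accuracy")


def average_scores(runs: list[dict[str, int]]) -> dict[str, float]:
    """Average each criterion over several runs of the same prompt."""
    for run in runs:
        missing = set(CRITERIA) - set(run)
        if missing:
            raise ValueError(f"run is missing criteria: {sorted(missing)}")
    return {c: mean([run[c] for run in runs]) for c in CRITERIA}


def overall(averages: dict[str, float]) -> float:
    """Unweighted mean of the four criterion averages."""
    return mean(list(averages.values()))


# Placeholder scores for one hypothetical tool (NOT results from this test).
example_runs = [
    {"depth": 4, "actionability": 3, "clarity": 5, "accuracy": 4},
    {"depth": 2, "actionability": 3, "clarity": 3, "accuracy": 4},
]
example_averages = average_scores(example_runs)
```

Averaging over several runs matters because the same prompt can produce noticeably different reports from one run to the next.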

Now, let me share my honest testing experience. While this isn't a rigorous scientific study, I've done my best to give you a practical, real-world comparison of these tools.

Let's Break Down How Each Tool Performed

I evaluated them based on four key aspects that matter most to users like us:

1. Depth & Detail

2. Practical Usefulness

3. Easy to Follow?

4. Can We Trust It?

Overall Comparison

Alright, done with comparisons ^^

Now, let's talk about what you can actually use deep research for. I've collected a list of use cases that might change your workflow.

A Collection of 13 Use Cases (with Prompts):

1. Learning Anything Faster

Deep Research compiles 100 strategies, groups sources, provides expert insights, and even suggests books.

→ Example: "Find 100 ways to improve my negotiation skills."

2. Academic & Thesis Research

It compiles literature reviews, cross-references peer-reviewed sources, analyzes key debates, and more.

→ Example: "Create a literature review on 'renewable energy innovations' for my thesis."

3. Making Money Online

No get-rich-quick BS. Practical ways to earn based on skills, budgets, and platforms. Tried it, and it's not bad, actually.

→ Example: "Find 100 ways to make money online with a $0 budget in 30 days."

4. Personalized Learning Plans

It builds step-by-step guides (with the necessary references) to master any skill. But the references are not necessarily the best...

→ Example: "Create a 7-week plan to learn Generative AI for business."

5. Mastering Hobbies (Guitar, Gaming...)

Best sources, "meta-strategies," hidden mechanics, and training regimens.

→ Example: "Go from Silver to Diamond in League of Legends."

6. Legal Research

It scans and gathers court rulings, case studies, and expert analysis. This saves thousands for my company each month.

→ Example: "Summarize major intellectual property disputes in tech."

7. Buying the Right Products

It could help you find the best products, not just the most advertised ones (I'm not sure yet).

→ Example: "Find the safest car seats based on Swiss crash tests."

8. Content & Marketing Strategy

Get trend analysis, case studies, and campaign insights.

→ Example: "Summarize the top social media trends for 2025."

9. Travel Planning

It compares flights, hotels, and itineraries for you. This is how I'm planning all my trips from now on.

→ Example: "Plan a budget surfing trip to Bali in August."

10. Business & Market Research

It breaks down industries, regulations, and trends. I'm using this for my business now.

→ Example: "Analyze federal procurement challenges and strategies for small businesses."

11. Investment & Financial Analysis

Analyze stocks, crypto, and emerging markets with data-backed insights.

→ Example: "Find the top trends in sustainable investing."

(I don't think you should rely on this 100%, but it's a strong head start.)

12. Healthcare & Medical Research

Find treatments, clinical trials, and expert insights in minutes.

→ Example: "Summarize treatment options for [medical condition]."

(Still see a doctor. You'll just arrive better informed and have a better discussion.)

13. AI-Powered News Digest

Get daily summaries of the news that matters to you.

→ Example: "Summarize the top AI, business, and finance news today."

In Summary

There you have it, fellow curious minds! We've put the top AI research tools through their paces, and while OpenAI seems to be leading the pack (and no, they're not paying me to say this), each tool has its own special sauce. Grok brings a solid game to the table, and Perplexity isn't far behind either.

We also covered how these tools can supercharge your daily workflow. From planning your next vacation to deep-diving into academic research, I've laid out 13 practical ways you can put these AI assistants to work. Trust me, once you start using them, you'll wonder how you ever managed without them!

Remember: these are tools, not magic wands. Use them wisely, combine them with your own critical thinking, and you'll be amazed at what you can accomplish.

Until next time, keep exploring and stay curious!

– Charafeddine

New letters now publish on charafeddine.co
