I Tested Deep Research Tools: They’re Not All Created Equal
I used to get paid for deep research.
Now? AI does it in minutes.
No joke: if I’d had these tools while working on my PhD, I’d have saved half the time, drunk less coffee, and maybe even kept my muscle mass.
AI can now:
- Scan 40+ sources.
- Process them with cutting-edge models.
- Deliver a 10,000-word research report in minutes.
The game has changed.
If you haven't explored the latest Deep Research features, this breakdown is for you.
What is Deep Research (or Deep Search)?
Google launched Deep Research in December 2024, and OpenAI and Perplexity followed with their own versions in February 2025. It’s a new feature that generates output at the level of a research report.
Deep Research is an agent that does research work for you independently: you give it a prompt, and it finds, analyzes, and synthesizes hundreds of online sources to create a comprehensive report at the level of a research analyst.
Every output is fully documented, with clear citations and a summary of its thinking, making it easy to reference and verify the information. It is particularly effective at finding niche, non-intuitive information that would require browsing numerous websites.
This is very different from the “simple web search” functionality that was already available in AI tools, which just searched for an answer and stopped once “a few good enough” results were found.
Deep Research works by continuously refining its own output through multiple iterations and gathering as much information as possible to thoroughly explore a given topic.
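To make that loop concrete, here is a minimal Python sketch of the iterative idea. It is a toy built on assumptions, not how any vendor actually implements Deep Research: `fake_search`, `ResearchState`, `TOY_CORPUS`, and the `follow-up:` convention are all hypothetical stand-ins for real web search and model calls.

```python
from dataclasses import dataclass, field

FOLLOW_UP = "follow-up: "

# Hypothetical in-memory "web": each query maps to the snippets a search returns.
# Snippets starting with FOLLOW_UP are new questions raised by what was read.
TOY_CORPUS = {
    "what is deepseek": [
        "note: overview of the model",
        FOLLOW_UP + "how was deepseek trained",
    ],
    "how was deepseek trained": ["note: summary of the training method"],
}


@dataclass
class ResearchState:
    """Everything gathered so far for one research question."""
    question: str
    notes: list = field(default_factory=list)
    open_questions: list = field(default_factory=list)


def fake_search(query):
    """Stand-in for a real web search (assumption: exact-match lookup)."""
    return TOY_CORPUS.get(query.lower(), [])


def deep_research(question, max_rounds=5):
    """Keep searching while the material read so far raises new questions."""
    state = ResearchState(question=question, open_questions=[question])
    seen = set()
    for _ in range(max_rounds):
        if not state.open_questions:
            break  # nothing left to explore
        query = state.open_questions.pop(0)
        if query.lower() in seen:
            continue  # skip questions that were already searched
        seen.add(query.lower())
        for snippet in fake_search(query):
            if snippet.startswith(FOLLOW_UP):
                state.open_questions.append(snippet[len(FOLLOW_UP):])
            else:
                state.notes.append(snippet)
    return state


report = deep_research("What is DeepSeek")
print(report.notes)
```

With `max_rounds=1` the function behaves like the simple web search described above: it keeps the first batch of results and leaves the follow-up question unexplored.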
According to Perplexity's press release, it "attains high benchmarks on Humanity's Last Exam," which is one of the toughest benchmarks for LLMs.
When AI outperforms world-class experts across all possible fields, we'll reach a "watershed moment" for humanity.

On the SimpleQA benchmark, which tests for factuality of generated output, Deep Research already outperforms all the other contenders.

Deep Research works much like how humans brainstorm and draft before completing a final paper. (My prediction for the future? AIs that collaborate and peer-review each other's work, though for now they're limited to self-reflection.)
This letter isn't about delving into all the technical details of how Deep Research features work. You can test them for free on Perplexity or Grok.
I'd like to focus more on giving you insights on which Deep Research tool to use for what, their real worth, and how to use them.
So I ran an experiment. Like in this previous newsletter, this isn't meant to be empirical testing but rather a qualitative evaluation to give a broad idea of deep search capabilities from different providers:
- Perplexity at $20/mo (start for free)
- Grok-3 at $30/mo (start for free)
- OpenAI at $200/mo (start at $20/mo)
We're not covering Google's Deep Research, which is also a very powerful contender.
Quick heads up: Nobody paid me to write this. I'm just sharing what worked (and what didn't).
Let’s dive in!
How Do These AI Tools Stack Up?
Not all Deep Research tools are built the same. Some are super detailed; others are real-time fact checkers. Here’s what makes each unique:
1. OpenAI’s ChatGPT
- Conversational Q&A: Acts like an expert you can interview, refining your query before diving in.
- Multi-Modal Research: Handles PDFs, Excel sheets, and online articles.
- Data Analysis & Visualizations: Can crunch numbers and generate charts.
→ Best for: Interactive deep dives, structured analysis, and multi-format research.
2. Grok-3 (by xAI)
- X (Twitter) Integration: Pulls real-time data, expert opinions, and breaking news.
- Candid & Direct: More opinionated than other AIs, offering balanced yet straightforward takes.
→ Best for: Keeping up with fast-moving topics, real-time trends, and uncensored insights.
3. Perplexity AI
- Live Web Search + AI Analysis: It reads the internet, processes the info, and gives you an answer.
- No Login Required: Just go to Perplexity.ai and start researching.
- Model Choice: Lets you pick from OpenAI, Grok-3, and open-source LLMs.
→ Best for: Quick, citation-backed research with real-time data.
Putting Them to the Test: Real-World Comparison
I ran a head-to-head experiment with all three tools.
The Challenge: Generate a Deep Research Report on DeepSeek
Prompt:
"Create a short article about 'DeepSeek Demystified: How This Open-Source Chatbot Outpaced Industry Giants.' The article should be deep and actionable. Add code snippets and tutorials if necessary."
I put each AI through its paces and got some seriously detailed reports back. I ran multiple searches (five for each tool, with the same prompt). I used the “starter” versions of Perplexity, Grok-3, and OpenAI.
Instead of dumping a wall of text on you, let me break down what I found using four simple criteria:
1. Depth & Detail
The "how deep does it go?" test. Does it give you the full story with all the important context, tech details, and pros/cons? Or just skim the surface?
2. Actionability & Practical Guidance
The "can I actually use this?" test. Does it give you real steps to follow? Better yet, any code snippets or tutorials you can try right away, leading to USEFUL results?
3. Organization & Clarity
This is all about how easy it is to follow along. Are things well-organized with clear headings? Can you find what you need without getting lost?
4. Accuracy & Verifiability
The truth test. Is the information correct (like when it talks about tech specs or costs)? And importantly, does it back up its claims with solid references?
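If you want to repeat this kind of comparison with numbers instead of labels, the four criteria can be recorded as a small rubric and averaged over the runs. The sketch below is only an illustration: the criterion keys, the 1 to 5 scale, and every score in `example_runs` are hypothetical placeholders, not results from my test.

```python
# The four criteria described above, as hypothetical short keys.
CRITERIA = ("depth", "actionability", "organization", "accuracy")


def average_scores(runs):
    """Average each criterion over several runs of the same prompt."""
    return {c: sum(run[c] for run in runs) / len(runs) for c in CRITERIA}


# Made-up 1-5 scores for five runs of one tool (placeholders only).
example_runs = [
    {"depth": 4, "actionability": 5, "organization": 4, "accuracy": 4},
    {"depth": 5, "actionability": 4, "organization": 4, "accuracy": 4},
    {"depth": 4, "actionability": 4, "organization": 5, "accuracy": 3},
    {"depth": 5, "actionability": 5, "organization": 4, "accuracy": 4},
    {"depth": 4, "actionability": 4, "organization": 4, "accuracy": 5},
]

for criterion, score in average_scores(example_runs).items():
    print(f"{criterion}: {score:.1f}")
```

Averaging over several runs matters because the same prompt can produce noticeably different reports from one run to the next.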
Now, let me share my honest testing experience. While this isn't a rigorous scientific study, I've done my best to give you a practical, real-world comparison of these tools.
Let's Break Down How Each Tool Performed
I evaluated them based on four key aspects that matter most to users like us:
1. Depth & Detail
- OpenAI's Output
  - How it did: Excellent.
  - Why: Provides extensive historical context, in-depth architectural specifics (such as Mixture-of-Experts, 671B parameters with 37B active), detailed training methodology (including reinforcement learning phases and cost estimates), and a comprehensive comparison with GPT-4/Gemini.
- Grok's Output
  - How it did: Decent but not amazing.
  - Why: Highlights key aspects of DeepSeek’s RL training, popularity timeline, and cost-performance balance. It touches on architecture and provides an overview, but doesn’t delve as deeply as OpenAI’s output.
- Perplexity's Output
  - How it did: Good (but slightly shorter than OpenAI’s).
  - Why: Covers a broad range of topics (such as how MoE improves efficiency, cost comparisons, and brief mentions of code usage), providing valuable detail without the extensive depth found in OpenAI’s output. Like a well-balanced meal!

2. Practical Usefulness
- OpenAI
  - Score: Top-notch
  - Why: Offers detailed installation instructions, code snippets, environment setup, and fine-tuning examples, making it highly actionable for users who wish to implement or experiment with the model.
- Grok
  - Score: Not bad
  - Why: Includes a dedicated tutorial section (e.g., “Tutorial: Running DeepSeek-R1 Locally with Ollama”) with code snippets that guide users through local deployment, though it is not as exhaustively detailed as OpenAI’s guide. Gets the job done.
- Perplexity
  - Score: Solid
  - Why: Presents practical guidance with API usage examples, local setup commands, and a short Python script for fine-tuning. It offers clear instructions for quick integration despite being less elaborate than OpenAI’s output.

3. Easy to Follow?
- OpenAI
  - Score: Strong
  - Why: Well-organized with clearly defined sections (Introduction, How DeepSeek Works, Key Differentiators, Implementation Guide, etc.), enhanced by headings, bullet points, and bold text for easy navigation.
- Grok
  - Score: Good
  - Why: Uses a coherent structure with clear headings (Introduction, Technology, Popularity, Implications, Tutorial, Conclusion). The overall flow is easy to follow, even if the content is less extensive.
- Perplexity
  - Score: Well done
  - Why: Organized with segmented headings (such as Architectural Breakthrough, Mixture-of-Experts, and Open Source as a Disruptor) and bullet points. While some sections transition quickly into new topics, the layout remains clear overall.

4. Can We Trust It?
- OpenAI
  - Score: Appears Very Good
  - Why: References detailed cost estimates, model architecture, and training tokens with internal consistency. Important papers and sources are all cited (in addition to many interesting new sources I discovered).
- Grok
  - Score: Pretty trustworthy
  - Why: Mentions training costs, downloads, and performance comparisons with citations from sources like BBC, Wikipedia, and TechCrunch. The data is plausible, even if it lacks the granularity of OpenAI’s output.
- Perplexity
  - Score: Generally Good
  - Why: Provides multiple citations, external links, and credible cost comparisons (e.g., $42 million vs. GPT-4’s rumored $120 million). Although less extensive in referencing than OpenAI’s output, the overall data appears consistent and verifiable.


Overall Comparison
- OpenAI’s Output stands out for its unparalleled depth and practical detail, making it the most comprehensive and actionable Deep Research tool so far, in my opinion. (This is not sponsored by OpenAI.)
- Grok’s Output offers a balanced overview with clear structure and a practical tutorial, though it is less detailed than OpenAI’s output.
- Perplexity’s Output delivers a concise, decent, well-organized summary with practical code examples and credible data, positioning it as a solid, if slightly less exhaustive, resource.

Alright, done with comparisons ^^
Now, let's talk about what you can actually use deep research for. I've collected a list of use cases that might change your workflow.
A Collection of 13 Use Cases (with Prompts):
1. Learning Anything Faster
Deep Research compiles 100 strategies, groups sources, provides expert insights, and even suggests books.
→ Example: "Find 100 ways to improve my negotiation skills."
2. Academic & Thesis Research
It compiles literature reviews, cross-references peer-reviewed sources, analyzes key debates, and more.
→ Example: "Create a literature review on 'renewable energy innovations' for my thesis."
3. Making Money Online
No get-rich-quick BS. Practical ways to earn based on skills, budgets, and platforms. Tried it, and it's not bad, actually.
→ Example: "Find 100 ways to make money online with a $0 budget in 30 days."
4. Personalized Learning Plans
It builds step-by-step guides (with the necessary references) to master any skill. But the references are not necessarily the best...
→ Example: "Create a 7-week plan to learn Generative AI for business."
5. Mastering Hobbies (Guitar, Gaming...)
Best sources, "meta-strategies," hidden mechanics, and training regimens.
→ Example: "Go from Silver to Diamond in League of Legends."
6. Legal Research
It scans and gathers court rulings, case studies, and expert analysis. This saves thousands for my company each month.
→ Example: "Summarize major intellectual property disputes in tech."
7. Buying the Right Products
It may help you find the best products, not just the most advertised ones (I'm not sure yet).
→ Example: "Find the safest car seats based on Swiss crash tests."
8. Content & Marketing Strategy
Get trend analysis, case studies, and campaign insights.
→ Example: "Summarize the top social media trends for 2025."
9. Travel Planning
It compares flights, hotels, and itineraries for you. This is how I'm planning all my trips from now on.
→ Example: "Plan a budget surfing trip to Bali in August."
10. Business & Market Research
It breaks down industries, regulations, and trends. I'm using this for my business now.
→ Example: "Analyze federal procurement challenges and strategies for small businesses."
11. Investment & Financial Analysis
Analyze stocks, crypto, and emerging markets with data-backed insights.
→ Example: "Find the top trends in sustainable investing."
(I don't think you should rely on this 100%, but it's a strong head start.)
12. Healthcare & Medical Research
Find treatments, clinical trials, and expert insights in minutes.
→ Example: "Summarize treatment options for [medical condition]."
(Then see a doctor, equipped with that knowledge, to have a better discussion.)
13. AI-Powered News Digest
Get daily summaries of the news that matters to you.
→ Example: "Summarize the top AI, business, and finance news today."
In Summary
There you have it, fellow curious minds! We've put the top AI research tools through their paces, and while OpenAI seems to be leading the pack (and no, they're not paying me to say this), each tool has its own special sauce. Grok brings a solid game to the table, and Perplexity isn't far behind either.
We also covered how these tools can supercharge your daily workflow. From planning your next vacation to deep-diving into academic research, I've laid out 13 practical ways you can put these AI assistants to work. Trust me, once you start using them, you'll wonder how you ever managed without them!
Remember: these are tools, not magic wands. Use them wisely, combine them with your own critical thinking, and you'll be amazed at what you can accomplish.
Until next time, keep exploring and stay curious!
– Charafeddine