
AI SEO EXPERIMENTS

Claude Web Fetch vs Web Search: How Anthropic's AI Actually Finds and Cites Content

2026-03-30


Claude has both web search and web fetch. It is not a fetch-only platform. But it uses these tools so conservatively that it behaves like one. Our server-side verified data shows exactly how Claude's two-tool pipeline works, when it fires, and why Claude cites fewer sources than any other major AI platform.

A widespread misconception about Claude is that it has no search capability, only a page fetcher. That is wrong. Claude has two distinct web access tools: web_search (which queries an external search index) and web_fetch (which performs live HTTP GETs on specific URLs). We verified both through server-side Vercel middleware logs in our claude_livefetch/ experiment series.

The practical distinction matters enormously. Claude is the most conservative citer of all major AI platforms. It cites in only 39% of queries (versus 56% for ChatGPT, 97% for Perplexity, and 98% for Google AI Mode). Understanding how its two-tool pipeline works, and when each tool fires, is the key to Claude visibility.

For the cross-platform comparison covering all major AI platforms, see AI Citation Behavior Compared. For a practical assessment, try our AI Visibility Quick Check.

🔍 CLAUDE'S TWO-TOOL RETRIEVAL PIPELINE

Claude operates with two separate web access tools, each with distinct behavior verified through server-side logging:

| Tool | What it does | Server-side evidence |
| --- | --- | --- |
| web_search | Queries an external search index to discover URLs | Zero server-side hits during search; URLs come from the index without touching your server |
| web_fetch | Live HTTP GET with the Claude-User/1.0 user agent | Confirmed via Vercel middleware logs; 7 of 7 test pages produced verifiable hits |

This is a two-stage pipeline: web_search discovers candidate URLs from an index, then web_fetch selectively retrieves specific pages from that candidate list. The search tool never hits your server directly. The fetch tool does.

The Brave Connection

Claude's web_search backend is likely Brave Search. Our testing showed 86.7% overlap between Claude's search results and Brave's index. This is significant because Brave maintains a smaller, independently crawled index compared to Google or Bing. The implications show up in the position data:

| Platform | Position 1 URL match rate | Distribution pattern |
| --- | --- | --- |
| Perplexity | 10.42% | Strong position gradient |
| Google AI Mode | 4.17% | Moderate position gradient |
| ChatGPT | 2.08% | Weak position gradient |
| Claude | Essentially flat | No position gradient (positions 1-20 are random) |

Claude shows a flat distribution across Google search positions. There is no position gradient at all. This is the strongest evidence that Claude's search backend is neither Google nor Bing. A platform using Google or Bing would show some correlation with Google rankings. Claude shows none, which is consistent with using Brave (a different, smaller index) or relying heavily on training data over search results.

The Bottom Line: Claude can discover new URLs through its search tool. But it searches a different index than ChatGPT (Bing) or Perplexity (index with 49.6% Google overlap). Your Google ranking has zero predictive value for Claude citations.

🤖 CLAUDE-USER BOT: THE SESSION-CACHED FETCHER

When Claude's web_fetch fires, the HTTP request comes from a bot identified as Claude-User. This bot has a unique behavioral pattern: session-level robots.txt caching (Lee, 2026).

How Claude-User Handles robots.txt

Claude-User checks your robots.txt file once at the beginning of a conversation session and caches the result for the entire session. This creates measurable timing behavior in server logs:

| Metric | Claude-User | ChatGPT-User | PerplexityBot |
| --- | --- | --- | --- |
| robots.txt check | Once per session | Does not check (ignores) | Per-crawl compliance |
| First fetch timing | 3.0s (includes robots.txt check) | ~1.0s | Background crawl |
| Subsequent fetch timing | 1.1s (cached, no re-check) | ~1.0s | Background crawl |
| Compliance model | Session-cached | Non-compliant since Dec 2025 | Fully compliant |

The 3.0s vs 1.1s gap on first versus subsequent fetches is a direct artifact of the robots.txt check. The first fetch includes a round-trip to your /robots.txt endpoint. Every fetch after that skips it.
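The observed pattern can be sketched as a simple model. This is illustrative only, not Anthropic's implementation; the class name and method signature are ours, and a real robots.txt would be fetched over HTTP rather than passed in as a string:

```python
from urllib import robotparser

class SessionFetcher:
    """Illustrative model of Claude-User's observed compliance pattern:
    robots.txt is parsed once per host per session, then cached."""

    USER_AGENT = "Claude-User"

    def __init__(self):
        self._robots = {}  # host -> RobotFileParser, filled once per session

    def allowed(self, host, path, robots_txt):
        # First fetch for a host: parse and cache robots.txt. In live
        # traffic this is the extra round-trip behind the ~3.0s first-fetch
        # timing; every later fetch reuses the cache (~1.1s).
        if host not in self._robots:
            rp = robotparser.RobotFileParser()
            rp.parse(robots_txt.splitlines())
            self._robots[host] = rp
        return self._robots[host].can_fetch(self.USER_AGENT,
                                            f"https://{host}{path}")
```

The design consequence matches the logs: a Disallow rule seen at the start of the session governs every fetch until the conversation ends.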

The Bottom Line: Claude-User is the only major AI conversational bot that both respects robots.txt and uses session-level caching. If you block Claude-User in robots.txt, you are invisible for the entire conversation. For the complete robots.txt reference, see robots.txt for AI Bots. For the OpenAI three-bot comparison, see OpenAI's Bots Have a Split Personality.

📊 CLAUDE IS THE MOST CONSERVATIVE CITER

Even though Claude can search, it behaves like a platform that barely wants to. The citation data makes this clear:

| Metric | Claude | ChatGPT | Perplexity | Google AI Mode |
| --- | --- | --- | --- | --- |
| Citation rate (% of queries) | 39% | 56% | 97% | 98% |
| Avg citations per query | 2.1 | 3.5 | 9.8 | 17.4 |
| Reddit citations | 0% | 15.6% | 12.1% | 8.9% |
| YouTube citations | 0% | 0% | 47% | 53% |

Claude cites in fewer than 4 out of 10 queries. When it does cite, it averages just 2.1 sources per response, compared to Perplexity's 9.8 or Google AI Mode's 17.4. It never cites Reddit. It never cites YouTube. This is a platform that reaches for external sources reluctantly and cites a narrow, curated set when it does.

Why Claude Is So Conservative

The demand-driven architecture explains this. Claude evaluates whether its training data is sufficient before activating web tools. For well-covered topics, it answers from memory with no external citations at all. ChatGPT triggers web search at measurable rates by query type (73% for discovery queries, 10% for informational). Claude's threshold is higher. It needs to identify a genuine knowledge gap before it reaches out.

The Bottom Line: Claude's conservatism is not a bug. It is an architectural choice. The practical effect: if your content addresses topics Claude's training data covers well, Claude may never search for or cite you, regardless of how well-optimized your page is.

🎯 CITATION RATES BY QUERY INTENT: THE SURPRISE

The aggregate 39% citation rate hides dramatic variation by query intent. This is where the data gets surprising:

| Intent | Citation rate | Total citations | Avg per query |
| --- | --- | --- | --- |
| Discovery ("best X") | 95% | 102 | 5.1 |
| Review-seeking | 60% | 72 | 3.6 |
| Validation ("is X worth it") | 35% | 35 | 1.8 |
| Comparison ("X vs Y") | 5% | 6 | 0.3 |
| Informational ("what is X") | 0% | 0 | 0 |

Claude actually cites at a 95% rate for discovery queries. That is higher than ChatGPT's rate for the same intent type. When someone asks Claude "what is the best running shoe," Claude almost always searches, fetches, and cites external sources.

But look at the other end: 0% for informational queries and 5% for comparisons. Claude answers "what is generative engine optimization" entirely from training data. And it essentially refuses to cite sources for head-to-head comparison queries, which is where purchase decisions happen.

The Bottom Line: The optimization question for Claude is not "will Claude cite me?" but "for which query types will Claude cite me?" Discovery queries are viable. Informational and comparison queries are near-zero.

🛒 THE ECOMMERCE PROBLEM: CLAUDE CITES THE REVIEWER, NOT THE PRODUCT

For ecommerce brands, the discovery intent data reveals a specific challenge. Claude does cite at 95% for "best X" queries, but it cites established review sites, not the products themselves.

Top Claude ecommerce citation sources:

  • RTINGS
  • SoundGuys
  • TechRadar
  • Tom's Guide
  • ConsumerReports

Your product page is not what Claude fetches and cites. The review of your product on a Tier 1 review site is what Claude fetches and cites. The path to Claude visibility for an ecommerce brand is:

  1. Get reviewed by the sites Claude trusts (RTINGS, TechRadar, ConsumerReports, Tom's Guide, SoundGuys)
  2. Those review sites are what Claude fetches and cites
  3. Your product gets mentioned as a recommendation within those citations

This is more like traditional PR and earned media than SEO or even GEO. The alarming number is the 5% comparison intent citation rate: Claude all but refuses to cite sources for "X vs Y" queries, which takes head-to-head comparison content off the table as a Claude channel.

For the full review site tier system and outreach playbook, see Review Sites That AI Platforms Cite Most.

The Bottom Line: For ecommerce, Claude optimization is an earned media strategy. Get reviewed by Tier 1 review sites. Claude will cite them, and your product gets visibility through those citations.

🧠 HOW CLAUDE EVALUATES FETCHED CONTENT

When Claude does fetch a page, it applies measurable content scoring biases that differ from other platforms:

| Signal | Claude's treatment | Effect |
| --- | --- | --- |
| Pure marketing copy | Penalized (0.8x weight) | Promotional language reduces citation probability |
| Limitations / caveats sections | Boosted (1.7x weight) | Honest disclosure of constraints increases citation probability |
| Content-to-HTML ratio | Positive predictor | Higher ratio among cited pages (0.086 cited vs 0.065 not cited) |
| Product schema | Strong positive (OR = 3.09) | Structured product data increases citation odds |
| Review schema | Strong positive (OR = 2.24) | Review content with schema gets cited more |
| Article schema | Negative (OR = 0.76) | Decreases citation probability |
| Reddit content | Categorically excluded | 0% citation rate across both API and web UI |

The 0.8x marketing penalty and 1.7x limitations boost represent Claude's clearest editorial preference. Claude rewards content that acknowledges trade-offs and constraints. This is measurably different from ChatGPT, which shows less sensitivity to promotional tone.
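The Product and Review schema signals are ordinary schema.org JSON-LD embedded in the page. A minimal sketch of a Product with a nested Review; every name and value here is a placeholder, not a template from the study:

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Running Shoe",
  "description": "Lightweight trainer. Placeholder values for illustration.",
  "review": {
    "@type": "Review",
    "reviewRating": { "@type": "Rating", "ratingValue": "4", "bestRating": "5" },
    "author": { "@type": "Person", "name": "Jane Doe" }
  }
}
```

This block goes in a `<script type="application/ld+json">` tag in the page head or body.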

For the full analysis of Reddit's invisible influence across all platforms, see Reddit's Influence on AI Search.

⚠️ THE JAVASCRIPT RENDERING LIMITATION

Claude's web_fetch retrieves raw HTML without executing JavaScript. Client-side rendered pages (React SPAs, Vue, Angular) return an empty shell:

<html>
<head><title>My App</title></head>
<body><div id="root"></div></body>
</html>

No content. No headings. No text. This problem affects Claude, ChatGPT, and Perplexity equally. None execute JavaScript during live fetches. Only Google AI Mode can see client-side rendered content through Googlebot's headless Chrome pipeline.

Test what Claude sees: curl -s https://yoursite.com/page | head -100. If the content is not in the raw HTML, Claude sees nothing. For SSR solutions, see Server Side Rendering for AI Platforms.
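The curl check can also be scripted. A small heuristic, assuming you supply phrases that should appear on the rendered page; the function name and phrase list are ours, for illustration:

```python
def looks_client_side_rendered(raw_html, expected_phrases):
    """True if none of the phrases you expect on the rendered page
    appear in the raw HTML -- meaning a fetcher that skips JavaScript
    (Claude, ChatGPT, Perplexity) would see an empty shell."""
    lowered = raw_html.lower()
    return not any(phrase.lower() in lowered for phrase in expected_phrases)
```

Feed it the raw HTML from curl or urllib.request; if it returns True for your key pages, server-side rendering is the fix.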

The Bottom Line: Server-side rendering is mandatory for Claude visibility. No exceptions.

🛠️ PRACTICAL OPTIMIZATION CHECKLIST FOR CLAUDE

Based on the corrected understanding of Claude's two-tool pipeline:

1. Ensure server-side rendering. Claude's web_fetch cannot execute JavaScript. Test with curl.

2. Allow Claude-User in robots.txt. Claude respects robots.txt with session-level caching. Block ClaudeBot (training) if desired, but allow Claude-User (conversational).

3. Write for balance, not promotion. Claude penalizes marketing copy (0.8x) and boosts limitations sections (1.7x). Include honest trade-offs.

4. Target discovery intent queries. Claude cites at 95% for "best X" queries but 0% for informational. Focus on content types Claude actually cites for.

5. Get reviewed on Tier 1 review sites. For ecommerce, Claude cites the reviewer, not the product. RTINGS, TechRadar, ConsumerReports are the path.

6. Use Product or Review schema. Product schema (OR = 3.09) and Review schema (OR = 2.24) increase citation odds. Avoid relying on Article schema (OR = 0.76).

7. Maximize content-to-HTML ratio. Target 0.08 or higher. Strip unnecessary markup.

8. Don't optimize based on Google rank. Claude's search backend (likely Brave) shows zero correlation with Google positions. Brave indexation and content quality matter more.
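The content-to-HTML ratio from item 7 can be approximated with the standard library. This is a rough sketch under our own assumptions (the study's exact extraction method is not published here): visible text outside script and style tags, divided by total document length.

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""
    def __init__(self):
        super().__init__()
        self._skip = 0      # depth inside script/style tags
        self.chunks = []
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def content_to_html_ratio(html):
    """Visible text length divided by total document length."""
    parser = _TextExtractor()
    parser.feed(html)
    text = "".join(parser.chunks).strip()
    return len(text) / max(len(html), 1)
```

A page scoring well under the 0.08 target is usually carrying heavy script payloads or markup bloat.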

❓ FREQUENTLY ASKED QUESTIONS

Does Claude have web search or only web fetch?

Claude has both. The web_search tool queries an external search index (likely Brave, based on 86.7% result overlap) to discover candidate URLs. The web_fetch tool then performs live HTTP GETs on specific URLs with the Claude-User/1.0 user agent. We verified this through server-side logs: web_search produces zero server hits (it queries an index), while web_fetch produces verifiable hits on your server.

Does Claude respect robots.txt?

Yes. Claude-User checks your robots.txt once per session and caches the result. If blocked, every fetch in that session fails. This makes Claude the most compliant major conversational AI bot, unlike ChatGPT-User, which has ignored robots.txt since December 2025. See robots.txt for AI Bots.
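A minimal robots.txt implementing the split described in the checklist above: keep the conversational fetcher allowed, optionally block the training crawler. Adjust to your own policy:

```txt
# Conversational fetcher: keep allowed, or Claude cannot
# cite you for the entire session.
User-agent: Claude-User
Allow: /

# Training crawler: block only if you want to opt out of training use.
User-agent: ClaudeBot
Disallow: /
```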

Why does Claude cite so much less than other platforms?

Claude cites in 39% of queries compared to 97-98% for Perplexity and Google AI Mode. It averages 2.1 citations per response versus 9.8 for Perplexity. The demand-driven architecture means Claude only activates web tools when training data is insufficient, and it has a higher confidence threshold than ChatGPT. For informational queries ("what is X"), Claude's citation rate drops to 0%.

Can new pages be discovered by Claude?

Yes. Unlike the common misconception that Claude only fetches known URLs, the web_search tool allows Claude to discover new pages through its search index. However, because this index is likely Brave (not Google or Bing), pages that rank well on Google may not appear in Claude's search results at all. The flat distribution across Google positions (no position gradient) confirms Claude is not using Google or Bing for discovery.

How should ecommerce brands optimize for Claude?

Claude cites at 95% for "best X" discovery queries but almost exclusively cites established review sites (RTINGS, TechRadar, ConsumerReports, Tom's Guide, SoundGuys), not product pages. The path to Claude visibility for ecommerce is earned media: get your product reviewed on Tier 1 review sites that Claude trusts. Your product gets mentioned within those cited reviews. The 5% comparison intent citation rate means Claude is not a viable channel for head-to-head product comparison content.

📚 REFERENCES

  • Lee, A. (2026). "Query Intent, Not Google Rank: What Best Predicts AI Citation Behavior." Preprint v5. DOI
  • Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K., & Deshpande, A. (2024). "GEO: Generative Engine Optimization." KDD 2024. DOI
  • Sellm (2025). "ChatGPT Citation Analysis." Industry report (400K+ pages analyzed).