
How Often Does ChatGPT Trigger a Web Search? 391 Queries Exposed the Pattern

2026-01-20


ChatGPT triggers a web search for only 42% of brand and product queries. The other 58% are answered entirely from training data. Intent type — not topic, not keywords, not phrasing — is the primary switch that determines whether ChatGPT searches the web. Discovery queries ("best CRM for small business") trigger search 73% of the time. Informational queries ("What is a CRM?") trigger search just 10% of the time.

These numbers come from testing 400 queries through ChatGPT's actual web UI, not the API. When ChatGPT does not search, your content cannot be cited: for that query, your SEO, your landing pages, and your optimized content are irrelevant.

This post breaks down exactly when ChatGPT decides to search, what determines that decision, and how to position your content for the queries where it actually looks.

Why This Matters for AI Visibility

If you are investing in generative engine optimization (GEO), you need to understand the gatekeeper: ChatGPT's decision to search or not search.

When ChatGPT searches the web, it fires sub-queries, retrieves pages, and cites sources in its response. When it does not search, it pulls from training data that could be months old. Your recently published content, your updated pricing page, your latest case study — none of it exists if ChatGPT decides it already knows the answer.

The question is not "how do I rank in ChatGPT?" The question is: "will ChatGPT even look?"

How We Tested ChatGPT's Search Behavior

We ran two experiments. The first was small and directional. The second was large enough to draw real conclusions.

The First Experiment: 24 API Queries (January 2026)

In January 2026, we tested 24 prompts through the OpenAI Responses API across four dimensions: recency, brand familiarity, specificity, and comparison. Each dimension had paired prompts — one designed to trigger search, one designed not to.

Key findings from that test: brand familiarity acted as a clean binary signal (unknown brands always triggered search), comparison queries triggered search 100% of the time, and recency words ("right now," "2025") did not matter because ChatGPT injected its own time context automatically.

Some of these held up. One did not.

The Second Experiment: 400 UI Queries (March 2026)

To validate and extend those findings, we ran 400 queries through ChatGPT's actual web UI — not the API. This matters because the API and the UI behave differently, as we will show.

| Parameter | First Experiment | Second Experiment |
| --- | --- | --- |
| Queries tested | 24 | 400 (391 valid) |
| Method | OpenAI Responses API | ChatGPT UI via Playwright |
| Detection method | API tool_use events | SSE stream fan-out queries |
| Intent types | 4 paired dimensions | 5 classified intents |
| Verticals | SaaS only | 8 verticals |
| Query sources | Manually written | Google "People Also Asked" + Autocomplete |

We scraped ChatGPT's web UI using Playwright via the Chrome DevTools Protocol, capturing the Server-Sent Events (SSE) stream in real time. This let us detect the fan-out search queries ChatGPT fires internally — the actual sub-queries it sends to its search index — rather than relying on whether citations appeared in the final response.
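As a sketch of the detection step: the fragment below parses a captured SSE stream and pulls out any fan-out search queries. The event name ("search_queries") and payload shape are hypothetical placeholders, since ChatGPT's actual stream uses undocumented internal event types that you would identify by inspecting a real capture.

```python
import json

def extract_fanout_queries(sse_lines):
    """Pull search sub-queries out of a captured SSE stream.

    The "search_queries" event type and its payload shape are
    illustrative assumptions, not ChatGPT's real (undocumented) schema.
    """
    queries = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip event: lines, comments, keep-alives
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # partial or non-JSON chunk
        if event.get("type") == "search_queries":
            queries.extend(event.get("queries", []))
    return queries
```

A query counts as "search triggered" when this list is non-empty, regardless of whether citations appear in the final answer.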

Over 100 of the 400 queries were sourced directly from real Google "People Also Asked" suggestions. The rest came from Google Autocomplete. Queries were classified across 5 intent types (discovery, review-seeking, validation, comparison, informational) and 8 verticals (SaaS products, SaaS brands, supplements, law, social media, agency, automation, marketing).

391 queries returned valid responses. 9 errored out (2.2% failure rate).

ChatGPT Searches Only 42% of the Time

Out of 391 valid brand and product queries, 166 triggered a web search. The remaining 225 — 58% — were answered entirely from training data.

But the 42% average masks the real story. The overall rate is not useful for strategy. What matters is which queries trigger search and which do not. The answer is intent.

Intent Type Is the Real Search Trigger

This is the core finding. The type of intent behind a query predicts whether ChatGPT will search far more reliably than the topic, the vertical, or the specific words used.

| Intent Type | Search Rate | Example Query |
| --- | --- | --- |
| Discovery | 73% | "best CRM for small business" |
| Review-seeking | 58% | "best service desk software reddit" |
| Validation | 44% | "Is Mailchimp good for beginners?" |
| Comparison | 29% | "Salesforce vs HubSpot" |
| Informational | 10% | "What is a CRM?" |

Discovery queries — where users are actively choosing a product or service — triggered search 7.3 times as often as informational queries. The pattern is consistent: when users are choosing, ChatGPT searches; when users are learning, it does not.

This has direct implications for content strategy. If your content targets "what is X" questions, ChatGPT almost never searches for them. You are invisible for those queries regardless of how well your page is optimized. The citation opportunity is overwhelmingly on discovery and review-seeking queries.
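The per-intent breakdown above is a straightforward aggregation over labeled results. A minimal sketch of the computation (the counts in the example are illustrative, not the study's raw data):

```python
from collections import defaultdict

def search_rates_by_intent(results):
    """results: iterable of (intent, triggered_search) pairs.

    Returns {intent: fraction of queries that triggered a web search}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for intent, searched in results:
        totals[intent] += 1
        if searched:
            hits[intent] += 1
    return {intent: hits[intent] / totals[intent] for intent in totals}
```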

The Comparison Surprise: API vs. UI Tell Different Stories

In our first 24-query API experiment, comparison queries ("X vs Y") triggered search 100% of the time — every single one. Based on that data, we recommended building comparison pages as a primary AI citation strategy.

The 400-query UI experiment told a different story: comparison queries triggered search only 29% of the time.

Why the discrepancy? Two factors:

1. API vs. UI behavior differs. The OpenAI Responses API and the ChatGPT web UI use different search pipelines. The API was more aggressive about triggering search for comparisons. The UI, which is what actual users interact with, is not.

2. Brand familiarity matters within comparisons. Most comparison queries in the larger dataset involved brands ChatGPT already knows well. "Salesforce vs HubSpot" gets answered from training data. "Slack vs Teams" gets answered from training data. "Creatine vs whey protein" gets answered from training data.

The comparisons that did trigger search consistently involved at least one lesser-known brand: "n8n vs Make," "Klaviyo vs Shopify Email," "Pumble vs Slack."

The correction: Comparison content still matters, but it is not the guaranteed search trigger we initially reported. If you are comparing two household names, ChatGPT may not even look at your page. The opportunity is in comparisons where your brand is the lesser-known entity — which, for most businesses reading this, is exactly the situation you are in.

Search Trigger Rates by Industry

The vertical your content targets also influences search trigger rates, though less dramatically than intent.

| Vertical | Search Rate |
| --- | --- |
| SaaS products (category queries) | 59% |
| Law | 55% |
| Social media | 46% |
| Supplements | 41% |
| Agency | 41% |
| SaaS brands (named products) | 39% |
| Automation | 35% |
| Marketing | 20% |

Two patterns emerge. First, the more commoditized and crowded the category, the more ChatGPT searches. SaaS products (59%) have thousands of competitors; ChatGPT cannot confidently recommend from training data alone. Marketing (20%) has fewer clear-cut product choices, so ChatGPT feels more confident answering from memory.

Second, the split between SaaS product queries and SaaS brand queries is telling. Category queries ("best email marketing platform") triggered search 59% of the time. Brand queries ("What is Mailchimp?") triggered search only 39% of the time. Category queries trigger 50% more searches than brand-specific queries. Your biggest citation opportunity is not your branded queries — it is the generic "best [category]" queries where ChatGPT is actively searching.

What Happens When ChatGPT Does Search

When ChatGPT triggers a search, its behavior is remarkably consistent.

| Metric | With Search | Without Search |
| --- | --- | --- |
| Sub-queries fired | 2 (92% of cases) | 0 |
| Median citations | 6 | 0 |
| Average response length | 2,894 characters | 1,992 characters |
| Response length difference | +45% longer | n/a |

In 92% of search-triggered responses, ChatGPT fired exactly 2 sub-queries. Not 1, not 3 — two. It then cited a median of 6 sources and produced responses that were 45% longer than non-search responses.

This consistency means the search pipeline is predictable. Two queries, six citation slots, longer answer. If your content appears in those search results, you have a real shot at one of those six citation slots. If ChatGPT does not search, you have zero shots.
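The +45% figure follows directly from the two average lengths reported above; a quick arithmetic check:

```python
avg_with_search = 2894     # characters, search-triggered responses
avg_without_search = 1992  # characters, training-data-only responses

pct_longer = (avg_with_search - avg_without_search) / avg_without_search * 100
# rounds to 45 (i.e., search responses are ~45% longer)
```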

What the 24-Query Experiment Got Right

Not everything from the smaller experiment was wrong. Three findings held up at scale:

Brand familiarity is a binary signal. Unknown brands force search. Known brands get answered from training data. This was true at 24 queries and remained true at 400. If ChatGPT does not recognize your brand, it will search for you. We confirmed this pattern in a separate study of 182 brand queries.

Phrasing changes behavior. "What is HubSpot?" triggers no search. "Tell me about HubSpot" triggers a deep search. The distinction between a definition request and an open-ended research request still holds: "what is" invites a confident definition from training data, while "tell me about" prompts the model to go and look.

Recency words are irrelevant. "Best CRM tools right now" and "Best CRM tools" trigger search at the same rate. ChatGPT automatically injects time-specific context into its search queries. You do not need to stuff "2026" or "right now" into your content — the AI assumes you want current information and searches accordingly.

The Playbook: How to Use This Data

Target Discovery Queries, Not Informational Ones

Discovery queries trigger search 73% of the time. Informational queries trigger search 10% of the time. The gap is 7x.

If you are investing in content for AI visibility, prioritize "best X for Y" and "top X in 2026" queries over "what is X" queries. Discovery content is where AI citations actually happen. Informational content may still serve your traditional SEO strategy, but it will rarely surface in ChatGPT responses.
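If you want to audit an existing keyword list against these intent buckets, a rough first pass can be automated. The rules below are our own illustrative heuristics, not the study's classification method; real classification should be human-reviewed.

```python
import re

def classify_intent(query):
    """Rough keyword heuristic mirroring the study's five intent buckets.

    Illustrative rules only -- a first-pass filter for triaging a
    keyword list, not a substitute for careful labeling.
    """
    q = query.lower()
    if re.search(r"\bvs\.?\b|\bversus\b", q):
        return "comparison"       # "Salesforce vs HubSpot"
    if "reddit" in q or "review" in q:
        return "review-seeking"   # "best service desk software reddit"
    if re.search(r"\bis\b.+\b(good|worth|legit|safe)\b", q):
        return "validation"       # "Is Mailchimp good for beginners?"
    if q.startswith(("what is", "what are", "how does", "why ")):
        return "informational"    # "What is a CRM?"
    if re.search(r"\b(best|top|alternatives?)\b", q):
        return "discovery"        # "best CRM for small business"
    return "informational"        # default: assume low search likelihood
```

Anything landing in the discovery or review-seeking buckets is where, per the data above, your AI-citation budget should go first.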

Build Category Pages, Not Just Brand Pages

"Best email marketing platform" triggers search 59% of the time. "What is Mailchimp" triggers search 5% of the time. Category pages are the gateway to AI citations because they match the query type that forces ChatGPT to search.

Build content for your product category, not just your product name. Different AI platforms handle category queries differently, but the pattern of category > brand holds across all of them.

If You Are a Smaller Brand, This Is Your Advantage

ChatGPT does not know you. That means it will search for you. Every query about your brand forces a fresh web lookup where you control what it finds.

Established brands have the opposite problem — ChatGPT thinks it knows them and often does not bother searching, which means it may be citing outdated training data. Your obscurity forces the AI to look. Make sure what it finds is authoritative, complete, and well-structured.

Create Comparison Content — But Be Strategic

Comparison pages still work, but only when at least one brand in the comparison is unfamiliar to ChatGPT. "Your Brand vs [Well-Known Competitor]" is the sweet spot — ChatGPT does not know your brand, so it searches, and your comparison page is exactly the kind of content it is looking for.

Avoid building comparison pages between two household names unless you have genuinely unique data or analysis. ChatGPT will answer "Salesforce vs HubSpot" from memory and never see your page.

Do Not Over-Invest in "What Is X" Content for AI Visibility

Only 10% of informational queries trigger search. The ROI for AI citation on this content type is poor. You may still want "what is X" pages for traditional SEO and top-of-funnel traffic, but do not count on them for AI visibility.

Keep your informational content fresh and well-dated so it serves your broader strategy, but allocate your AI optimization efforts toward discovery and review-seeking queries.

Limitations and Caveats

These results should be interpreted with the following constraints:

  • Query scope: This is brand and product queries specifically. Pure informational or educational queries ("how does photosynthesis work") may behave differently.
  • Vertical coverage: The 8 verticals tested are SaaS-heavy. Consumer electronics, health, travel, and finance may show different patterns.
  • API vs. UI divergence: We documented meaningful differences between API and UI behavior. Any research using the API alone (including our own earlier work) may not reflect actual user experience.
  • Single run per query: Each query was run once. We did not measure run-to-run consistency. Some queries may be near the decision boundary and could trigger search on repeated attempts.
  • Account type: All tests used standard paid accounts. Free accounts have search rate limits that could produce different results.
  • Six excluded citations: Six responses contained "DOM fallback" citations — links embedded from training data with no search trigger. These were excluded from citation analysis.

Methodology

391 valid queries from 400 submitted. Queries were sampled from Google Autocomplete suggestions and real Google "People Also Asked" data, then classified by intent (5 types) and vertical (8 categories). Scraping was performed using Playwright via the Chrome DevTools Protocol, capturing ChatGPT's SSE stream events in real time. Search trigger was determined by the presence of fan-out search queries in the stream — not by whether citations appeared in the response. This distinction matters because ChatGPT occasionally embeds links from training data without actually searching.

For the full academic treatment of query intent and AI citation behavior, see our research on query intent and AI citation.

The Bottom Line

ChatGPT's decision to search is not random. It follows a clear, intent-driven pattern:

  • 42% overall search rate for brand and product queries
  • Discovery queries: 73% — this is where citations happen
  • Review-seeking: 58% — strong citation potential
  • Validation: 44% — moderate
  • Comparison: 29% — only when lesser-known brands are involved
  • Informational: 10% — the AI already knows the answer

When ChatGPT does search, it typically fires 2 sub-queries and cites a median of 6 sources. When it does not search, your content does not exist for that answer.

The AI's decision to search is the gatekeeper. Optimize for the queries where it does.