How AI Platforms Search
Fan-Out Query Behavior Across Intent Types, Verticals, and Platforms
Anthony Lee — AI+Automation
Preprint v1.2 — April 2026 | Not yet peer-reviewed
Key Findings
Two-Layer Retrieval Model
AI search is two decisions, not one. Layer 1 (whether to search) is 91.7% deterministic. Layer 2 (what to search) produces different query strings on nearly every run (98% of replicate pairs share zero strings), but the structural type of fan-out is 65% stable. Optimize for types, not strings.
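The three stability numbers here (91.7% trigger agreement, 98% zero string overlap, 65% top-type agreement) are pairwise-agreement statistics over repeated runs of the same parent query. A minimal sketch of how such metrics can be computed, using hypothetical toy replicates rather than the study's data:

```python
from itertools import combinations

# Three replicate runs of one parent query (hypothetical toy values,
# not the study's dataset). Each run records whether the model searched,
# the literal fan-out strings it issued, and each fan-out's structural type.
replicates = [
    {"searched": True, "strings": {"best password manager 2026"},
     "types": ["entity_injection", "entity_injection"]},
    {"searched": True, "strings": {"top rated password managers review"},
     "types": ["entity_injection"]},
    {"searched": True, "strings": {"password manager features keywords"},
     "types": ["keyword_compression"]},
]

pairs = list(combinations(range(len(replicates)), 2))

# Layer 1: does the search-or-not decision agree across replicate pairs?
trigger_agreement = sum(
    replicates[i]["searched"] == replicates[j]["searched"] for i, j in pairs
) / len(pairs)

# Layer 2a: exact-string overlap (Jaccard) between replicate fan-out sets.
def jaccard(a, b):
    return len(a & b) / len(a | b)

string_overlap = sum(
    jaccard(replicates[i]["strings"], replicates[j]["strings"]) for i, j in pairs
) / len(pairs)

# Layer 2b: does the dominant structural type of fan-out stay stable?
def top_type(run):
    return max(set(run["types"]), key=run["types"].count)

tops = [top_type(r) for r in replicates]
top_type_agreement = sum(tops[i] == tops[j] for i, j in pairs) / len(pairs)

print(trigger_agreement, string_overlap, round(top_type_agreement, 2))
```

In this toy example every run searches (trigger agreement 1.0), no two runs share a string (overlap 0.0), yet two of three runs keep the same dominant type, which is the pattern the study reports at scale: deterministic trigger, stochastic strings, moderately stable types.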
Intent Drives Fan-Out Composition
Discovery queries trigger 3.3x the entity injection rate of informational queries (χ²=299.6, p<0.001, V=0.24). When users shop, AI injects brand names from training data. When users learn, AI compresses to keywords. Intent, not vertical, determines retrieval strategy.
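The reported effect size is Cramér's V, derived from the chi-square statistic as V = sqrt(χ² / (n · (k − 1))), where k is the smaller dimension of the contingency table. A self-contained sketch on a hypothetical 2×2 table whose counts are chosen only to mirror the reported 3.3x injection-rate gap (these are not the study's actual counts, and the study's tables span five intents and multiple fan-out types):

```python
import math

# Hypothetical 2x2 contingency table (NOT the study's data): rows are
# intent (discovery, informational), columns are fan-out type
# (entity injection, other). Counts mirror the reported 3.3x gap:
# 160/500 = 32% injection for discovery vs 48/500 = 9.6% for informational.
table = [
    [160, 340],  # discovery
    [48, 452],   # informational
]

def chi_square(table):
    """Pearson chi-square statistic and total count for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )
    return chi2, n

def cramers_v(table):
    """Effect size V = sqrt(chi2 / (n * (k - 1))), k = min(rows, cols)."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

chi2, n = chi_square(table)
print(round(chi2, 1), round(cramers_v(table), 2))
```

The same formula scales to the study's larger intent-by-type tables; in practice `scipy.stats.chi2_contingency` gives the χ² and p-value directly.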
Platform Retrieval Personalities
ChatGPT injects entities (32% of fan-outs). Gemini explores broadly (27% expansion + 21% tangential). Perplexity seeks evidence (21%) and compresses to keywords (19%). Each platform needs a different optimization approach (χ²=386.9, p<0.001, V=0.38).
Model-Tier Search Behavior
ChatGPT's flagship model (gpt-5.4) searches the web on only 29% of queries. Smaller models (gpt-5.4-mini, gpt-5.4-nano) search 100%. Bigger models answer from memory. Content not in training data may be invisible to the most capable model.
Format Sensitivity
Situation-first query phrasing ("I just got a data breach notification...") produces significantly different fan-out distributions than standard phrasing ("best password manager 2026") with V=0.35, p<0.001. Current monitoring tools that only track keyword-style queries miss the retrieval paths conversational prompts trigger.
Abstract
When users submit queries to AI search platforms, the platforms do not pass the user's text to web search verbatim. They decompose each prompt into multiple internal "fan-out queries": the actual strings sent to retrieval engines. These fan-out queries determine which pages get fetched, which enter the AI's context window, and which get cited in the response. This study classifies 1,323 fan-out queries generated by 540 parent queries across three AI platforms (ChatGPT, Gemini, Perplexity), ten commercial verticals, and five intent types. Seven findings emerge. First, user intent is a significant predictor of fan-out composition (χ²=299.6, p<0.001, V=0.24): discovery queries trigger 3.3x the entity injection rate of informational queries. Second, platforms exhibit distinct retrieval personalities: ChatGPT injects entities from training data on 32% of fan-outs, Gemini casts a wide net with 27% expansion queries, and Perplexity leads in evidence-seeking at 21%. Third, ChatGPT's search trigger rate varies dramatically by model tier: gpt-5.4 searches on only 29% of queries while gpt-5.4-nano searches on 100%. Fourth, platform-intent interaction effects explain fan-out variation better than either factor alone (two-way AIC=192 vs. main-effects AIC=937). Fifth, situation-first query phrasing produces significantly different fan-out distributions than standard phrasing (V=0.35, p<0.001). Sixth, no significant vertical effect was detected at this sample size (H=6.26, p=0.71). Seventh, replicate analysis reveals that the search trigger decision is highly deterministic (91.7% agreement) while fan-out strings are stochastic (98% zero overlap), but the structural type of fan-out is moderately stable (65% top-type agreement). These findings establish that AI search operates a two-layer retrieval system: a model-confidence layer that decides whether to search at all, and a query-decomposition layer that determines what to search for.
Keywords
Generative Engine Optimization, GEO, query fan-out, Large Language Models, search retrieval, agentic search, ChatGPT, Perplexity, Gemini, Google AI Mode
Citation
Lee, A. (2026). How AI platforms search: Fan-out query behavior across intent types, verticals, and platforms. Preprint v1.2, AI+Automation.