How AI Platforms Search

Fan-Out Query Behavior Across Intent Types, Verticals, and Platforms

Author

Anthony Lee — AI+Automation

Status

Preprint v1.2 — April 2026 | Not yet peer-reviewed

Key Findings

Two-Layer Retrieval Model

AI search is two decisions, not one. Layer 1 (whether to search) is highly deterministic: replicate runs agree 91.7% of the time. Layer 2 (what to search for) is stochastic at the string level (98% zero overlap between replicates), yet the structural type of fan-out is moderately stable (65% top-type agreement). Optimize for types, not strings.
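The two Layer 2 measurements can be illustrated with a minimal sketch. It assumes each replicate run is a list of (query string, type label) pairs, that "zero overlap" means two runs share no identical string, and that "top-type agreement" means two runs have the same most frequent type. The study's exact definitions may differ, the function names are hypothetical, and the toy data below is invented for illustration only.

```python
from collections import Counter
from itertools import combinations

def top_type(run):
    """Most frequent fan-out type in one run (ties go to the first label seen)."""
    return Counter(label for _, label in run).most_common(1)[0][0]

def zero_overlap_rate(runs):
    """Share of run pairs that have no fan-out string in common."""
    pairs = list(combinations(runs, 2))
    disjoint = sum(
        1 for a, b in pairs
        if not ({q for q, _ in a} & {q for q, _ in b})
    )
    return disjoint / len(pairs)

def top_type_agreement(runs):
    """Share of run pairs whose most frequent fan-out type matches."""
    pairs = list(combinations(runs, 2))
    same = sum(1 for a, b in pairs if top_type(a) == top_type(b))
    return same / len(pairs)

# Invented toy replicates of one parent query (not data from the study).
runs = [
    [("best password manager 2026 reviews", "expansion"),
     ("BrandA vs BrandB", "entity_injection"),
     ("BrandB pricing", "entity_injection")],
    [("top rated password managers", "expansion"),
     ("BrandC security audit", "entity_injection"),
     ("BrandA family plan", "entity_injection")],
    [("password manager comparison", "expansion"),
     ("password manager breach history", "evidence_seeking"),
     ("how password managers encrypt data", "expansion")],
]

print(zero_overlap_rate(runs))
print(top_type_agreement(runs))
```

In this toy set no string repeats across runs, while two of the three runs share a dominant type, which is the pattern the finding describes: strings vary, types recur.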

Intent Drives Fan-Out Composition

Discovery queries trigger 3.3x the entity injection rate of informational queries (χ²=299.6, p<0.001, V=0.24). When users shop, AI injects brand names from training data. When users learn, AI compresses to keywords. In this sample, intent rather than vertical predicts retrieval strategy.

Platform Retrieval Personalities

ChatGPT injects entities (32% of fan-outs). Gemini explores broadly (27% expansion + 21% tangential). Perplexity seeks evidence (21%) and compresses to keywords (19%). The platforms differ significantly in fan-out composition (χ²=386.9, p<0.001, V=0.38), so each needs a different optimization approach.
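The effect sizes quoted with the two chi-square statistics above can be sanity-checked with the standard Cramér's V formula, V = sqrt(χ² / (n · min(r−1, c−1))). The sketch below is not the author's analysis code. It assumes both tests used all 1,323 classified fan-out queries and that there are at least five fan-out type categories (five are named on this page; the true count is not stated here).

```python
import math

def cramers_v(chi2: float, n: int, rows: int, cols: int) -> float:
    """Cramér's V for an r x c contingency table."""
    k = min(rows - 1, cols - 1)  # smaller dimension minus one
    return math.sqrt(chi2 / (n * k))

# Assumed: n = 1323 fan-outs, 5 intent types, 3 platforms,
# and 5 fan-out type columns (an assumption, see above).
v_intent = cramers_v(299.6, n=1323, rows=5, cols=5)
v_platform = cramers_v(386.9, n=1323, rows=3, cols=5)

print(round(v_intent, 2), round(v_platform, 2))
```

Under these assumptions the results round to 0.24 and 0.38, consistent with the reported values. Any column count of five or more gives the same result, because the smaller table dimension sets the denominator.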

Model-Tier Search Behavior

ChatGPT's flagship model (gpt-5.4) searches the web on only 29% of queries. Smaller models (gpt-5.4-mini, gpt-5.4-nano) search on 100% of queries. The larger model answers from memory far more often, so content that is not in its training data may be invisible to the most capable model.

Format Sensitivity

Situation-first query phrasing ("I just got a data breach notification...") produces significantly different fan-out distributions than standard phrasing ("best password manager 2026") (V=0.35, p<0.001). Monitoring tools that track only keyword-style queries miss the retrieval paths that conversational prompts trigger.

Abstract

When users submit queries to AI search platforms, the platforms do not pass the user's text to web search verbatim. They decompose each prompt into multiple internal "fan-out queries": the actual strings sent to retrieval engines. These fan-out queries determine which pages get fetched, which enter the AI's context window, and which get cited in the response. This study classifies 1,323 fan-out queries generated by 540 parent queries across three AI platforms (ChatGPT, Gemini, Perplexity), ten commercial verticals, and five intent types. Seven findings emerge. First, user intent is a significant predictor of fan-out composition (χ²=299.6, p<0.001, V=0.24): discovery queries trigger 3.3x the entity injection rate of informational queries. Second, platforms exhibit distinct retrieval personalities: ChatGPT injects entities from training data on 32% of fan-outs, Gemini casts a wide net with 27% expansion queries, and Perplexity leads in evidence-seeking at 21%. Third, ChatGPT's search trigger rate varies dramatically by model tier: gpt-5.4 searches on only 29% of queries while gpt-5.4-nano searches on 100%. Fourth, platform-intent interaction effects explain fan-out variation better than either factor alone (two-way AIC=192 vs main-effects AIC=937). Fifth, situation-first query phrasing produces significantly different fan-out distributions than standard phrasing (V=0.35, p<0.001). Sixth, no significant vertical effect was detected at this sample size (H=6.26, p=0.71). Seventh, replicate analysis reveals that the search trigger decision is highly deterministic (91.7% agreement) while fan-out strings are stochastic (98% zero overlap), but the structural type of fan-out is moderately stable (65% top-type agreement). These findings establish that AI search operates a two-layer retrieval system: a model-confidence layer that decides whether to search at all, and a query-decomposition layer that determines what to search for.

Keywords

Generative Engine Optimization, GEO, query fan-out, Large Language Models, search retrieval, agentic search, ChatGPT, Perplexity, Gemini, Google AI Mode

Citation

Lee, A. (2026). How AI platforms search: Fan-out query behavior across intent types, verticals, and platforms. Preprint v1.2, AI+Automation.

DOI: https://doi.org/10.5281/zenodo.19555393

ORCID: 0009-0002-4815-6373

aiXiv: aixiv.260413.000006

Zenodo: Replication Data