Query Intent, Not Google Rank: What Best Predicts AI Citation Behavior
Anthony Lee — AI+Automation
Preprint — February 2026 (v4) | Not yet peer-reviewed
Abstract
The rapid integration of AI chatbots into consumer search behavior has spawned a cottage industry of Generative Engine Optimization (GEO) advice, much of it built on untested assumptions about how AI platforms select sources for citation. Industry practitioners widely assert that Google ranking determines AI visibility, that community-consensus platforms like Reddit confer citation advantages, and that AI recommendations are too inconsistent to warrant optimization efforts. We tested these claims empirically across four major AI platforms (ChatGPT, Claude, Perplexity, and Gemini) using a multi-study design that combined large-scale query intent classification (n = 19,556 queries across 8 verticals), Google rank cross-referencing (120 queries with 360 Top-3 results), server-side fetch verification via Vercel middleware logging, and page-level technical analysis of 479 cited and non-cited pages. Our results challenge all three prevailing claims. First, query intent, rather than Google rank or domain authority, emerged as the strongest predictor of citation source type at the aggregate level, with intent distributions varying significantly by vertical (χ²(28) = 5,195, p < .001, Cramér’s V = 0.258). Formal predictive modeling showed, however, that at the individual page level, technical page features (AUC = 0.594) outperformed intent (AUC = 0.462) for predicting citation status. Second, Google’s Top-3 organic results predicted AI citations poorly: ChatGPT matched only 7.8% of URLs, while Reddit, despite occupying 38.3% of Google Top-3 positions across our sample, received exactly zero AI citations from either platform’s API (binomial p = 3.43 × 10⁻²³ for Perplexity). A companion study (Lee, 2026b) subsequently demonstrated that this zero-citation finding is access-channel dependent: the web UIs of the same platforms cite Reddit at rates of 17–44%, suggesting that API-based research may systematically underestimate Reddit’s role in AI-generated responses.
Third, AI brand recommendations showed substantial within-platform consistency (ChatGPT mean Jaccard = 0.619, 95% CI [0.537, 0.701]), though cross-platform agreement was near-random (all-four-platform Jaccard = 0.036). We further discovered a previously unreported architectural divide: ChatGPT and Claude perform live page fetches during conversations, while Perplexity and Gemini rely exclusively on pre-built search indices—with divergent robots.txt compliance behavior between the fetching platforms. These findings suggest that effective GEO strategy requires intent-aware, platform-specific optimization rather than the one-size-fits-all approach currently advocated by industry practitioners.
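For readers who want to check the effect sizes quoted above, the following is a minimal sketch of how Cramér's V and the Jaccard index are conventionally computed. It is not the study's analysis code. The count of five intent categories is an inference from the reported degrees of freedom (28 = (8 − 1)(c − 1) gives c = 5), and the brand sets are hypothetical placeholders rather than study data.

```python
import math


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V for an n_rows x n_cols contingency table."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Chi-square and n as reported in the abstract; 8 verticals (rows) and
# 5 intent categories (columns, ASSUMED from df = 28).
v = cramers_v(chi2=5195, n=19556, n_rows=8, n_cols=5)
print(round(v, 3))  # 0.258

# HYPOTHETICAL brand lists from two runs of the same prompt (illustrative only).
run_1 = {"BrandA", "BrandB", "BrandC", "BrandD"}
run_2 = {"BrandA", "BrandB", "BrandC", "BrandE"}
print(jaccard(run_1, run_2))  # 3 shared / 5 distinct = 0.6
```

Under these assumptions the recomputed Cramér's V agrees with the reported 0.258. The Jaccard example shows that a within-platform mean near 0.6 corresponds roughly to two runs sharing three of five distinct brands.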
Keywords
Generative Engine Optimization, GEO, AI citation behavior, AI search, ChatGPT, Claude, Perplexity, Gemini, query intent, brand recommendations, live page fetch, robots.txt compliance
Citation
Lee, A. (2026). Query intent, not Google rank: What best predicts AI citation behavior. Preprint v4, AI+Automation.