Why AI Is Not Citing Your Website and How to Fix It: Platform-by-Platform Troubleshooting

2026-04-06

If AI search is not citing your website, the problem is almost never content quality. It is almost always a pipeline failure: something specific and fixable is preventing a platform from discovering, fetching, or selecting your pages. This guide walks you through every common failure point, platform by platform.

You rank on Google. Your content is thorough. Your domain has been around for years. But when someone asks ChatGPT, Perplexity, or Google AI Mode a question about your topic, your site is nowhere in the response. Your competitors show up instead.

This is not a fringe complaint. It is the single most common frustration we hear from site owners who understand traditional SEO but have never audited their visibility across AI search platforms. The reason it keeps happening is structural: each AI platform has a completely different citation pipeline, and each pipeline has its own failure modes.

We analyzed 19,556 queries across 8 verticals and crawled 4,658 pages across 3,251 websites to map how AI platforms decide what to cite (Lee, 2026a). A follow-up study tested 10,293 pages across 250 queries on 3 AI platforms, controlling for Google rank position (Lee, 2026c). Together, these datasets reveal that most citation failures trace back to a small number of diagnosable, fixable causes.

The Bottom Line: AI not citing your website is a diagnostic problem, not a quality problem. Work through this guide platform by platform. Most sites have a single root cause, and fixing it unlocks visibility across multiple AI platforms at once.

🚦 STEP 0: UNDERSTAND WHEN AI ACTUALLY SEARCHES THE WEB

Before troubleshooting, you need to know that AI platforms do not always search the web. ChatGPT only triggers a web search for about 42% of queries in its web UI. The other 58% are answered entirely from training data. If your query type rarely triggers search, the issue may be that the AI never looked for sources at all.

Web search trigger rates vary dramatically by query intent (Lee, 2026a):

| Query Intent | Web Search Trigger Rate | Example |
| --- | --- | --- |
| Discovery ("best X for Y") | ~73% | "best CRM for small teams" |
| Review-seeking ("X reviews") | ~70% | "Notion reviews 2026" |
| Comparison ("X vs Y") | ~65% | "Asana vs Monday" |
| Validation ("is X good") | ~40% | "is Webflow good for SEO" |
| Informational ("what is X") | ~10% | "what is project management" |

If your target queries are informational, the AI may never search at all, so the platform-specific checks below would not apply. Focus your troubleshooting on queries with high trigger rates: discovery, review, and comparison intents.

🔎 COMMON CAUSES AT A GLANCE

Here is the complete diagnostic table covering all platforms. Each row represents a failure point, how to detect it, and what to do about it.

| # | Failure Point | Platforms Affected | How to Detect | Fix |
| --- | --- | --- | --- | --- |
| 1 | Page not indexed by Bing | ChatGPT | site:yourdomain.com in Bing; Bing Webmaster Tools | Submit XML sitemap to Bing Webmaster Tools |
| 2 | PerplexityBot blocked or never crawled | Perplexity | Check robots.txt; check server logs for PerplexityBot | Allow PerplexityBot in robots.txt; add sitemap |
| 3 | Not in Google index | Google AI Mode | Google Search Console | Standard Google indexing practices |
| 4 | robots.txt blocking AI bots | All platforms | Review robots.txt for Disallow rules | Explicitly allow each AI bot |
| 5 | Content is client-side rendered | ChatGPT, Perplexity, Claude | curl -s yourpage and check for content | Implement SSR or SSG |
| 6 | Query intent mismatch | All platforms | Compare your format to what gets cited | Align content with discovery/comparison intent |
| 7 | Stale content (60+ days) | Perplexity (strongest), others | Check last update date | Refresh on a 60-to-90-day cycle |
| 8 | Missing date schema | Perplexity (strongest), others | View source for datePublished/dateModified | Add JSON-LD date fields and visible dates |
| 9 | Weak page-level signals | All platforms | Audit canonical tags, schema, link ratios | Fix specific predictors (see cross-platform section) |

Now let us work through each platform.

💬 CHATGPT-SPECIFIC TROUBLESHOOTING

ChatGPT does not have its own web index. It uses a two-layer retrieval system: Bing's API for URL discovery and a live fetch agent (ChatGPT-User) for real-time page retrieval.

Check 1: Is Bing Indexing Your Pages?

If Bing has not indexed your page, ChatGPT will never find it through standard queries, no matter how good your content is. Google indexing a page does not mean Bing has indexed it. They are completely separate systems.

Our research found that ChatGPT's top-3 Bing URLs matched actual citations only 6.8% to 7.8% of the time at the URL level, but domain-level overlap was 28.7% to 49.6% (Lee, 2026a). Bing is the gatekeeper that gets you into the candidate pool. Once in the pool, ChatGPT makes its own selection.

How to check:

  1. Go to Bing Webmaster Tools and verify your site
  2. Use the URL Inspection tool for your key pages
  3. Search site:yourdomain.com directly in Bing
  4. Compare results against your sitemap to find coverage gaps

How to fix:

  • Submit your XML sitemap through Bing Webmaster Tools
  • Use the Submit URL feature for high-priority pages
  • Verify that Bing is not hitting crawl errors on your server
  • Use Bing's IndexNow protocol to accelerate indexing of new or updated pages
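
The IndexNow step is easy to script. The sketch below builds the JSON payload the IndexNow protocol expects (host, key, and a URL list) and posts it to the shared api.indexnow.org endpoint with the standard library. The domain, key, and URLs are placeholders; your key must also be served as a plain-text file at https://yourdomain.com/{key}.txt so the endpoint can verify ownership.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"

def build_indexnow_payload(host, key, urls):
    """Build the JSON body the IndexNow protocol expects."""
    return {"host": host, "key": key, "urlList": list(urls)}

def submit_urls(host, key, urls):
    """POST new or updated URLs to IndexNow; returns the HTTP status."""
    payload = build_indexnow_payload(host, key, urls)
    req = urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 200/202 indicate the submission was accepted

if __name__ == "__main__":
    # Hypothetical values: replace with your domain, key, and changed URLs.
    print(build_indexnow_payload(
        "yourdomain.com",
        "your-indexnow-key",
        ["https://yourdomain.com/updated-page"],
    ))
```

Because Bing participates in IndexNow, one submission covers the index that feeds ChatGPT's URL discovery.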

Check 2: Understand the Three OpenAI Bots

OpenAI operates three separate bots, and each behaves differently with robots.txt:

| Bot | Purpose | Respects robots.txt | What Blocking Does |
| --- | --- | --- | --- |
| GPTBot | Training data collection | Yes | Prevents use in future model training |
| OAI-SearchBot | ChatGPT Search index | Yes | Reduces visibility in ChatGPT Search results |
| ChatGPT-User | Live page fetching during conversations | No (since Dec 2025) | Cannot be blocked via robots.txt |

ChatGPT-User ignores robots.txt entirely as of December 2025. OpenAI reclassified it as "a technical extension of the user" rather than an autonomous crawler. So blocking it in robots.txt does nothing. But if you block OAI-SearchBot, you reduce your presence in ChatGPT's search index, which directly reduces citation opportunities.

For the full ChatGPT optimization playbook, see our ChatGPT SEO Optimization Guide.

The Bottom Line: Bing Webmaster Tools is no longer optional. It is the front door to ChatGPT visibility. If you have been ignoring Bing because your traffic comes from Google, you have been ignoring the gatekeeper for every ChatGPT citation.

🔮 PERPLEXITY-SPECIFIC TROUBLESHOOTING

Perplexity is fundamentally different from ChatGPT. It does not search the web live. It retrieves answers from a pre-built index that its background crawler, PerplexityBot, compiled hours or days earlier. If PerplexityBot has not crawled your page before the query happens, you cannot be cited.

Check 1: Is PerplexityBot Blocked?

PerplexityBot respects robots.txt. If your robots.txt blocks it, your content is invisible to Perplexity. Not deprioritized. Invisible. Many sites added broad AI bot blocks in 2024 and 2025 without realizing they were blocking search citation, not just training data.

Cui et al. (2025) analyzed 582,281 robots.txt files and found that AI-specific blocks increased significantly between 2023 and 2024, with many sites inadvertently blocking search-index crawlers alongside training crawlers.

How to check: Open https://yourdomain.com/robots.txt and look for:

User-agent: PerplexityBot
Disallow: /

Or a wildcard block:

User-agent: *
Disallow: /

The fix: Allow PerplexityBot explicitly:

User-agent: PerplexityBot
Allow: /

Check 2: Is Your Sitemap Complete and Discoverable?

PerplexityBot uses XML sitemaps as a primary discovery mechanism. Unlike Googlebot, which discovers pages through decades of link-following, PerplexityBot is a newer crawler building an index from scratch. Your sitemap is how it finds pages it would otherwise miss.

| Sitemap Problem | Impact | Fix |
| --- | --- | --- |
| No sitemap exists | Slower discovery; PerplexityBot relies on link-following only | Generate and submit a sitemap |
| Sitemap not in robots.txt | PerplexityBot may never find it | Add a Sitemap: directive to robots.txt |
| Key pages missing from sitemap | Those pages may never be crawled | Include all citation-worthy pages |
| Stale <lastmod> dates | PerplexityBot deprioritizes pages it thinks have not changed | Update <lastmod> when content changes |
| All <lastmod> dates identical | Looks like auto-generated noise | Set accurate per-page dates |
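
To avoid the "all <lastmod> dates identical" problem, generate the sitemap from real per-page modification dates rather than a single build timestamp. A minimal sketch (the page URLs and dates are hypothetical; pull real dates from your CMS):

```python
from datetime import date
from xml.sax.saxutils import escape

def build_sitemap(pages):
    """Render a sitemap.xml string from (url, last_modified_date) pairs."""
    entries = []
    for url, lastmod in pages:
        entries.append(
            "  <url>\n"
            f"    <loc>{escape(url)}</loc>\n"
            f"    <lastmod>{lastmod.isoformat()}</lastmod>\n"
            "  </url>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>\n"
    )

# Hypothetical pages with genuinely different modification dates.
pages = [
    ("https://yourdomain.com/pricing", date(2026, 3, 18)),
    ("https://yourdomain.com/blog/ai-citations", date(2026, 4, 2)),
]
print(build_sitemap(pages))
```

The key design point is that lastmod is a per-page input, so two pages never share a date unless they really changed on the same day.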

Check 3: Is Your Content Fresh Enough?

Perplexity exhibits the strongest freshness bias of any major AI search platform. Our data shows Perplexity cites sources that are 3.3x fresher than Google's top results for medium-velocity topics (Lee, 2026a):

| Topic Velocity | Perplexity Median Source Age | Google Median Source Age | Freshness Gap |
| --- | --- | --- | --- |
| High (news, finance) | 1.8 days | 28.6 days | 16x fresher |
| Medium (SaaS, tech) | 32.5 days | 108.2 days | 3.3x fresher |
| Low (evergreen, education) | 84.1 days | 1,089.7 days | 13x fresher |

If your content is older than 60 to 90 days on a medium-velocity topic, it is actively losing ground to fresher competitors. Implement a 60-to-90-day refresh cycle: days 1 to 30 are your peak freshness window, freshness decay begins around days 30 to 60, and by days 60 to 90 a substantive refresh is needed.
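
The refresh cycle is simple to turn into an automated report. A sketch that buckets pages by days since their last substantive update, using the windows above as thresholds (the page URLs and dates are illustrative):

```python
from datetime import date

def freshness_bucket(last_updated, today):
    """Classify a page against a 60-to-90-day refresh cycle."""
    age = (today - last_updated).days
    if age <= 30:
        return "peak freshness"
    if age <= 60:
        return "decay beginning"
    if age <= 90:
        return "refresh needed"
    return "overdue"

today = date(2026, 4, 6)
# Hypothetical pages and last-update dates.
for url, updated in [
    ("https://yourdomain.com/guide", date(2026, 3, 20)),
    ("https://yourdomain.com/review", date(2026, 1, 2)),
]:
    print(url, "->", freshness_bucket(updated, today))
```

Run something like this against your sitemap's lastmod dates weekly and the "refresh needed" bucket becomes your editorial queue.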

Check 4: Are Your Date Signals Parseable?

Freshness only works as a signal if Perplexity can read your dates. Missing date metadata means your content is treated as undated, and undated content underperforms in a freshness-biased system.

PerplexityBot looks for dates in this priority order:

  1. JSON-LD schema: datePublished and dateModified in Article, BlogPosting, or WebPage types
  2. Open Graph meta tags: article:published_time and article:modified_time
  3. Visible on-page date: A human-readable "Last updated" date near the top of content

You need all three, and they should agree with each other. Conflicting dates between schema and visible text reduce confidence in the signal.
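
Put together, the three agreeing date signals on one page might look like the following sketch (the article name and dates are illustrative):

```html
<!-- 1. JSON-LD schema -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example Article",
  "datePublished": "2026-01-15",
  "dateModified": "2026-04-01"
}
</script>

<!-- 2. Open Graph meta tags -->
<meta property="article:published_time" content="2026-01-15" />
<meta property="article:modified_time" content="2026-04-01" />

<!-- 3. Visible on-page date -->
<p>Last updated: April 1, 2026</p>
```

Note that dateModified, article:modified_time, and the visible date all say the same thing; that agreement is the point.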

Check 5: Is PerplexityBot Actually Crawling You?

Even with clean configuration, newer or previously blocked sites may not be in PerplexityBot's crawl queue. Check your server logs for the PerplexityBot user agent. If you see zero visits, your domain may not have been discovered yet. Ensure your sitemap is in robots.txt and build inbound links from sites PerplexityBot already crawls. For monitoring tools and setup instructions, see our AI Bot Tracking Guide.
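
Checking for PerplexityBot visits only takes one pass over your access logs. A minimal sketch that counts hits per AI crawler by user-agent substring; the log lines below are fabricated examples in combined log format, and the bot list should be adjusted to whatever you want to monitor:

```python
from collections import Counter

AI_BOTS = ["PerplexityBot", "GPTBot", "OAI-SearchBot", "ClaudeBot", "Googlebot"]

def count_bot_hits(log_lines):
    """Count requests per AI crawler by matching user-agent substrings."""
    hits = Counter()
    for line in log_lines:
        for bot in AI_BOTS:
            if bot in line:
                hits[bot] += 1
    return hits

# Hypothetical access-log lines; in practice, read your real log file.
sample = [
    '1.2.3.4 - - [06/Apr/2026] "GET /guide HTTP/1.1" 200 "-" '
    '"Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)"',
    '5.6.7.8 - - [06/Apr/2026] "GET /pricing HTTP/1.1" 200 "-" '
    '"Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)"',
]
print(count_bot_hits(sample))
```

Zero PerplexityBot entries over several weeks is the signal that your domain has not been discovered, which points back to the sitemap and inbound-link fixes above.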

Pages with FAQPage schema receive approximately 2x more recrawl visits from AI bots compared to standard blog posts, because each FAQ contains multiple citable question-answer pairs.

For the complete Perplexity playbook, see our Perplexity Optimization Guide.

The Bottom Line: Perplexity citation failures are index failures. If PerplexityBot cannot access your pages, has not discovered them, or finds stale content with no date signals, no amount of content quality will earn citations.

🌐 GOOGLE AI MODE-SPECIFIC TROUBLESHOOTING

Google AI Mode uses the same infrastructure as traditional Google Search. This makes it both the easiest and the hardest AI platform to troubleshoot.

Why Google AI Mode Is Different

Google AI Mode pulls from the same index that powers Google Search. Googlebot crawls your pages, renders JavaScript (using headless Chrome), and indexes the rendered content. This means:

  • Client-side rendered pages work. Unlike every other AI platform, Google AI Mode can see JavaScript-rendered content because Googlebot runs a full rendering pipeline.
  • If you are indexed in Google Search, you are available to Google AI Mode. No separate submission process is required.
  • Standard Google Search Console practices apply. Submit your sitemap, ensure Googlebot is not blocked, and monitor for indexing issues.

Why Google AI Mode Might Still Skip You

Being indexed is necessary but not sufficient. AI Mode selects citations based on content features, not just ranking signals. Our position-controlled study found 6 page features that predicted citation across all Google position bands (Lee, 2026c). A page ranked #1 may still be skipped if it lacks comparison structure, FAQ sections, or structured data.

| Google AI Mode Factor | What to Check | Why It Matters |
| --- | --- | --- |
| Google index status | Google Search Console | Required: no index = no AI Mode |
| Content format | Comparison tables, FAQ sections, structured answers | AI Mode prefers structured, scannable content |
| Schema markup | Product (3.09x citation odds), Review (2.24x) | Strongest schema types for AI citation |
| Domain breadth | Number of distinct queries your domain ranks for | Sites ranking for more queries get cited more often |

The Bottom Line: If your pages are in Google's index, the technical barriers are already cleared. Google AI Mode troubleshooting is about content format and structure, not crawl access. For the full guide, see ChatGPT SEO Optimization Guide (which covers cross-platform page features).

🔧 CROSS-PLATFORM CHECKS

These issues affect visibility on every AI platform simultaneously. Fix them once, and you improve your position everywhere.

Server-Side Rendering (The Silent Killer)

Of the major AI platforms, only Google AI Mode can execute JavaScript. If your site uses client-side rendering (React, Vue, or Angular without SSR), every other platform sees an empty HTML shell.

| Rendering Method | Google AI Mode | ChatGPT / Perplexity / Claude |
| --- | --- | --- |
| Server-side rendered (SSR) | Full content visible | Full content visible |
| Static site generation (SSG) | Full content visible | Full content visible |
| Client-side rendered (CSR) | Full content visible (Googlebot renders JS) | Empty shell |
| Hybrid (SSR + hydration) | Full content visible | Full content visible |

How to test: Run curl -s https://yoursite.com/page and check if the body contains your actual content. If the HTML is mostly script tags and empty divs, AI bots (except Googlebot) cannot see your page.
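
The same check can be scripted across many pages. The sketch below estimates what share of a page's raw, unrendered HTML is visible text after scripts and styles are removed, which approximates what non-Googlebot AI crawlers see. The threshold you act on is a judgment call, not a platform rule; a ratio near zero means the page is an empty shell to those crawlers.

```python
import re

def visible_text_ratio(html):
    """Rough share of raw HTML that is visible text (scripts/styles removed)."""
    stripped = re.sub(r"<(script|style)[^>]*>.*?</\1>", " ", html,
                      flags=re.DOTALL | re.IGNORECASE)
    text = re.sub(r"<[^>]+>", " ", stripped)  # drop remaining tags
    text = " ".join(text.split())
    return len(text) / max(len(html), 1)

# A CSR shell versus a server-rendered page (both fabricated examples).
csr_shell = ('<html><body><div id="root"></div>'
             '<script src="/app.js"></script></body></html>')
ssr_page = ("<html><body><article><h1>Pricing</h1>"
            "<p>Our plans start at $9/month...</p></article></body></html>")

print(f"CSR shell: {visible_text_ratio(csr_shell):.3f}")  # ~0.0
print(f"SSR page:  {visible_text_ratio(ssr_page):.3f}")
```

Feed it the output of curl -s for each key URL; any page scoring near zero needs the SSR or prerendering fixes below.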

| Framework | Recommended Fix |
| --- | --- |
| React (Create React App) | Migrate to Next.js with SSR or SSG |
| Vue (standard SPA) | Migrate to Nuxt.js with SSR |
| Angular | Use Angular Universal for SSR |
| WordPress | Already SSR by default (check for JS-dependent themes) |
| Custom SPA | Implement prerendering at the server level |

robots.txt Configuration (The Universal Gatekeeper)

A single misconfigured robots.txt can block every AI crawler at once. The recommended configuration explicitly allows each AI bot:

User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

Sitemap: https://yoursite.com/sitemap.xml

If you want citation visibility but not training data contribution, allow OAI-SearchBot and PerplexityBot while blocking GPTBot. For the complete robots.txt reference, see our robots.txt for AI Bots guide. For crawler monitoring and log analysis, see our AI Bot Tracking Guide.

Schema Markup (Types Matter More Than Presence)

Generic schema presence is not a statistically significant predictor of AI citation (p = 0.78). But specific schema types are powerful signals:

| Schema Type | Citation Odds Ratio | What It Signals |
| --- | --- | --- |
| Product | 3.09x | E-commerce, software, tools |
| Review | 2.24x | Evaluations, assessments |
| FAQPage | ~2x recrawl rate | Structured Q&A pairs |
| Article / BlogPosting | Baseline | Standard content |
| No schema | 0.6x | Missing structured signals |

The lesson: do not just add schema for the sake of having it. Choose the type that accurately describes your content and complete its properties thoroughly.

The 7 Page-Level Predictors

Lee (2026a) identified 7 statistically significant predictors that determine which pages win citations within a given intent pool:

| Predictor | What Matters | Direction |
| --- | --- | --- |
| Self-referencing canonical | <link rel="canonical"> pointing to itself | Present = 1.92x odds |
| Content-to-HTML ratio | Proportion of text vs. boilerplate HTML | Higher = better (0.086 vs. 0.065) |
| External vs. internal link ratio | Balance of outbound links | External-heavy = 0.47x penalty |
| Internal link count | Navigation links (menus, sidebars, breadcrumbs) | Fewer in-content links correlate with citation |
| Schema attribute count | Completeness of schema properties | More complete = 1.21x |
| Word count | Total content length | Cited pages median ~1,800 words, not-cited ~2,100 |
| Schema type | Product, Review, FAQPage | Type-dependent (see table above) |

Two findings stand out. First, a missing self-referencing canonical tag is the easiest fix at 1.92x odds improvement. Second, heavy external linking is the strongest negative signal: pages with many outbound links and few internal navigation links get cited roughly half as often.
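
The canonical check is easy to automate across your key pages. The sketch below looks for a canonical link tag in raw HTML and tests whether it points back at the page's own URL. The regex assumes rel appears before href in the tag and ignores relative URLs, so treat it as a quick audit aid, not a production parser.

```python
import re

CANONICAL_RE = re.compile(
    r'<link[^>]+rel=["\']canonical["\'][^>]*href=["\']([^"\']+)["\']',
    re.IGNORECASE,
)

def has_self_canonical(html, page_url):
    """True if the page declares a canonical URL pointing to itself."""
    m = CANONICAL_RE.search(html)
    if not m:
        return False
    # Normalize trailing slashes before comparing.
    return m.group(1).rstrip("/") == page_url.rstrip("/")

# Fabricated example page.
page = '<head><link rel="canonical" href="https://example.com/page"></head>'
print(has_self_canonical(page, "https://example.com/page"))   # self-canonical
print(has_self_canonical(page, "https://example.com/other"))  # points elsewhere
```

Pages returning False either lack the tag entirely (the 1.92x-odds fix) or canonicalize to a different URL, which deserves a manual look.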

The Bottom Line: Run through each predictor against your key pages. The most common fixable issues are missing canonical tags, thin navigation architecture, and external-heavy link profiles. Notably, popups, author bios, page load time, and file size showed no significant effect on citation.

📊 PLATFORM COMPARISON: HOW EACH AI ENGINE FINDS SOURCES

| Feature | ChatGPT | Perplexity | Google AI Mode | Claude |
| --- | --- | --- | --- | --- |
| Index source | Bing | Own (PerplexityBot) | Google (Googlebot) | Brave Search |
| Live page fetch | Yes (ChatGPT-User) | No | No (cached index) | Yes (Claude-User) |
| JavaScript rendering | None | None | Full (headless Chrome) | None |
| robots.txt compliance | GPTBot/OAI-SearchBot: Yes. ChatGPT-User: No | Yes | Yes | ClaudeBot: Yes. Claude-User: Yes |
| Freshness bias | Moderate | Strong (3.3x vs. Google) | Moderate | Moderate |
| Submission tool | Bing Webmaster Tools | None | Google Search Console | None |
| Google rank correlation | Near zero (rho = -0.02 to 0.11) | Near zero | Inherits Google signals | Zero (flat across positions 1-20) |

❓ FREQUENTLY ASKED QUESTIONS

Why does ChatGPT show my competitors but not me?

The two most common reasons are query intent mismatch and a Bing indexation gap. ChatGPT classifies queries by intent (discovery, comparison, informational) and cites content that matches the format. If your competitor has a comparison page with tables and you have an educational article, their content matches the intent pool while yours does not. Run through the ChatGPT-specific checks above, starting with Bing indexation.

Does my Google ranking help me appear in any AI platform?

Not directly. Lee (2026a) found near-zero correlation between Google rank and AI citation (Spearman rho = -0.02 to 0.11, all non-significant across 19,556 queries). Google ranking indirectly helps with Google AI Mode because they share the same index. But for ChatGPT (which uses Bing), Perplexity (which uses its own index), and Claude (which uses Brave Search), Google ranking has no measured effect.

Perplexity used to cite me but stopped. What happened?

The most likely cause is content staleness. Perplexity exhibits a 3.3x freshness bias compared to Google for medium-velocity topics. If your content has not been updated in 60+ days while competitors have published fresher alternatives, Perplexity will shift citations to the newer sources. Check your dateModified schema and refresh your content with substantive updates.

How long does it take for fixes to show up across AI platforms?

It varies by platform and fix type. ChatGPT live fetches can reflect content changes within minutes, but Bing indexation takes days to weeks. Perplexity depends on PerplexityBot's crawl schedule, typically 1 to 7 days after a robots.txt fix. Google AI Mode inherits Google's indexing timeline. Robots.txt changes take effect on the next crawl visit. For a complete timeline, see our AI Bot Tracking Guide.

Should I create separate content for each AI platform?

No. The page-level features that predict citation (canonical tags, schema markup, balanced link ratios, SSR) benefit all platforms equally. The one area where strategies diverge is freshness: Perplexity rewards frequent updates more aggressively than other platforms. Focus on creating well-structured, regularly updated content that serves all platforms, then use platform-specific submission tools (Bing Webmaster Tools for ChatGPT, Google Search Console for Google AI Mode) to ensure discovery.

Can I pay for placement in AI search results?

No. There is no paid placement, no submission form, and no direct way to guarantee citation on any AI platform. What you can do is remove the technical barriers that prevent citation, align your content with the right query intents, and strengthen the page-level signals that predict selection. For a hands-on check of your pages against these factors, try our AI Visibility Quick Check. For professional optimization, see our AI SEO services.

📚 REFERENCES

  • Lee, A. (2026a). "Query Intent, Not Google Rank: What Best Predicts AI Citation Behavior." Preprint v5, A.I. Plus Automation. DOI: 10.5281/zenodo.18653093. 19,556 queries, 4,658 pages, 3,251 websites, 4 AI platforms.

  • Lee, A. (2026c). "I Rank on Page 1: What Gets Me Cited by AI? Position-Controlled Analysis of Page-Level and Domain-Level Predictors of AI Search Citation." A.I. Plus Automation. Paper. Dataset DOI: 10.5281/zenodo.19398158. 10,293 pages, 66 features, 250 queries, 3 AI platforms.

  • Aggarwal, P., Murahari, V., Rajpurohit, T., Kalyan, A., Narasimhan, K., & Deshpande, A. (2024). "GEO: Generative Engine Optimization." Proceedings of KDD 2024. DOI: 10.48550/arXiv.2311.09735. Note: the Princeton lab's 40% visibility boost has not replicated on production AI platforms; see our replication analysis.

  • Cui, H., Wang, Z., Saad, A., & Li, A. (2025). "The Web's Gatekeepers: A Systematic Analysis of LLM Bots and robots.txt Compliance." ACM Web Conference 2025. DOI: 10.1145/3719027.3765063.

  • OpenAI. (2025). "ChatGPT crawler documentation update." OpenAI Platform Docs. December 2025.