โ† Back to Blog

AI SEO EXPERIMENTS

Is GEO Just Repackaged SEO? We Tested 4,375 Pages to Find Out

2026-03-27


Correction (2026-03-30): Our follow-up study, Experiment K (n=4,350 pages, 2,510 domains), proved that the "page speed = 65.8% of model importance" finding below was a domain identity proxy. The Gradient Boosting model was fingerprinting publisher domains by their CDN speed signatures, not learning that fast pages get cited. Within the same domain, cited pages are actually slower (r=-0.221, p<0.000001). Domain features alone predict at AUC=0.975, while all page features combined reach only 0.773. Adding domain identity drops load time importance by 92.4%. The only valid speed threshold: pages above 5 seconds see citation rates drop from ~60% to 35%. Below that, speed has no effect. The GEO vs SEO comparison and content quality findings in this article remain valid. The page speed conclusions do not.

Google's John Mueller called GEO a likely scam. Agencies charge $50K for it. We tested 4,375 pages across ChatGPT, Claude, Perplexity, and Google AI Mode to find out who's right. The answer is more nuanced than either side wants to hear.

This is Part 2 of our replication series. In Part 1, we tested the Princeton GEO paper's three specific claims (statistics, citations, quotations). Here, we ask the bigger question: is GEO a real discipline, or just traditional SEO with a new label and a higher price tag?

🔢 KEY NUMBERS AT A GLANCE

| Metric | Value | What It Means |
|---|---|---|
| Total pages tested | 4,375 | 2,520 cited, 1,855 not-cited across 4 platforms |
| Traditional SEO score direction | Wrong | Higher SEO scores predict LOWER AI citation (r = -0.041) |
| GEO-specific score direction | Correct | Higher GEO scores predict HIGHER AI citation (r = +0.178) |
| GEO marginal value | +22.2 percentage points | Among SEO-optimized pages, adding GEO features lifts citation rate from 42.8% to 65.0% |
| Combined model improvement | +0.35% AUC | Real but tiny; page speed alone is 65.8% of the model |
| Schema markup importance | 0.12% | Nearly worthless for AI citation |
| Author attribution importance | 0.36% | Nearly worthless for AI citation |
| Top GEO feature | Technical terminology (9.9%) | Second only to page speed in overall importance |

The Bottom Line: GEO is not a scam. The features marketed as "GEO" carry real predictive signal (+22 percentage points). But most of what agencies actually sell as GEO (schema optimization, E-E-A-T signals, structured data) is repackaged SEO that doesn't work for AI citation. The GEO features that do work (technical depth, readability, statistics) are just good writing, rebranded.

🧪 STUDY DESIGN

The Debate

Google's John Mueller warned: "The higher the urgency and the stronger the push of new acronyms, the more likely it's a scam." TheAdSpend documented a Toronto e-commerce company that paid $50,000 for GEO consulting with zero AI traffic after six months. Critics argue GEO/AEO/LLMO tactics are fundamentally the same as SEO best practices since 2011.

On the other side, agencies selling GEO services argue that AI citation requires distinct optimizations beyond traditional SEO: statistics-rich content, authoritative tone, structured formatting for extractability, and readability optimization.

We designed an experiment to settle this empirically.

Dataset

4,375 pages merged from four data sources:

| Source | Pages | Content |
|---|---|---|
| Princeton Phase 1 | 430 | Perplexity + ChatGPT, cited and not-cited |
| Princeton Phase 2 | 913 | Google AI + ChatGPT, cited and not-cited (expanded dataset recrawl) |
| VPS database fresh export | 1,862 | All 4 platforms, cited (March 16-26, 2026) |
| Not-cited expansion crawl | 1,170 | Additional not-cited pages from expanded dataset |

Final composition: 2,520 cited vs 1,855 not-cited (1.36:1 ratio, near-balanced). All pages had full visible text extracted via Playwright headless Chromium. User-generated content platforms excluded.

Two Composite Scores

We computed two independent composite scores for every page. Each component feature was normalized to 0-1, then the components were averaged.

Traditional SEO Score (7 features) -- things every SEO guide since 2012 recommends:

| Feature | What It Measures |
|---|---|
| Heading score | H2 count (content structure) |
| Internal links | Internal link count (site architecture) |
| Page speed | Load time, inverted so faster is better (Core Web Vitals) |
| Schema markup | Has any schema markup (structured data) |
| Content-to-HTML ratio | How much of the page is actual content vs code |
| Word count | Content depth (capped at 5,000 words) |
| Author attribution | Has an author bio (E-E-A-T signal) |

GEO-Specific Score (8 features) -- things marketed specifically as GEO/AEO/LLMO optimizations:

| Feature | What It Measures |
|---|---|
| Statistics density | Numbers, percentages, data points per 1k words |
| Citation density | Source attributions per 1k words |
| Quotation density | Quoted text per 1k words |
| List formatting | Bullet and numbered list items per 1k words |
| Authoritative tone | Authority language markers per 1k words |
| Technical terminology | Domain-specific vocabulary per 1k words |
| Heading density | H2+H3 headings per 1k words (extractability) |
| Readability | Flesch-Kincaid grade, inverted so simpler is better |

The correlation between the two scores is only 0.133. They measure different things.
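For readers who want to reproduce the scoring, here is a minimal sketch of the normalize-and-average approach. The column names are hypothetical stand-ins; the real pipeline used the 7 SEO and 8 GEO features listed above.

```python
import numpy as np
import pandas as pd

def composite_score(df, feature_cols):
    """Min-max normalize each feature to [0, 1], then average across features."""
    cols = df[feature_cols]
    normed = (cols - cols.min()) / (cols.max() - cols.min())
    return normed.mean(axis=1)

# Toy data with hypothetical column names, standing in for real page metrics.
rng = np.random.default_rng(0)
pages = pd.DataFrame({
    "h2_count": rng.integers(0, 20, 200),
    "internal_links": rng.integers(0, 100, 200),
    "stats_per_1k": rng.uniform(0, 30, 200),
    "headings_per_1k": rng.uniform(0, 10, 200),
})

pages["seo_score"] = composite_score(pages, ["h2_count", "internal_links"])
pages["geo_score"] = composite_score(pages, ["stats_per_1k", "headings_per_1k"])
print(pages[["seo_score", "geo_score"]].corr())
```

With random inputs the two scores are near-uncorrelated; on the real data the correlation was 0.133.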

Statistical Methods

  • Mann-Whitney U for each composite score vs citation status
  • 2x2 quadrant analysis: High/Low SEO crossed with High/Low GEO
  • Nested model comparison using Logistic Regression, Random Forest, and Gradient Boosting (5-fold cross-validation, balanced at 1,855 vs 1,855)
  • Likelihood ratio test: does adding GEO features to an SEO-only model significantly improve fit?
  • Feature importance ranking across all 15 features
  • Stability verification across 10 random balanced downsamples
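As an illustration of the first check, a Mann-Whitney U test on a composite score looks like this in scipy. The scores below are synthetic beta-distributed stand-ins (only the group sizes match the study), so the statistics will not match ours.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(42)
# Synthetic composite scores: 2,520 cited vs 1,855 not-cited pages.
cited_scores = rng.beta(2.2, 12, 2520)
not_cited_scores = rng.beta(2.0, 12, 1855)

stat, p = mannwhitneyu(cited_scores, not_cited_scores, alternative="two-sided")
print(f"U = {stat:.0f}, p = {p:.4g}")

# A rank-biserial effect size is a common companion to the U statistic.
n1, n2 = len(cited_scores), len(not_cited_scores)
rank_biserial = 1 - 2 * stat / (n1 * n2)
print(f"rank-biserial r = {rank_biserial:+.3f}")
```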

📊 RESULTS: SEO vs GEO

The Headline Finding

| Score | Cited Median | Not-Cited Median | r | p | Direction |
|---|---|---|---|---|---|
| SEO score | 0.354 | 0.379 | -0.041 | 0.020 | Wrong: not-cited pages score higher |
| GEO score | 0.134 | 0.128 | +0.178 | < 0.0001 | Correct: cited pages score higher |

Pages that look well-optimized by traditional SEO standards are less likely to be cited by AI than pages that don't. The GEO-specific score is the one that predicts citation.

The Quadrant That Changes Everything

We split all 4,375 pages into four groups based on whether they scored above or below median on each composite score:

| Quadrant | n | Citation Rate |
|---|---|---|
| Low SEO, Low GEO | 1,134 | 57.1% |
| High SEO, Low GEO | 1,053 | 42.8% |
| Low SEO, High GEO | 1,053 | 65.0% |
| High SEO, High GEO | 1,135 | 65.0% |

The most important row is the second one. Pages that follow traditional SEO best practices but lack GEO-specific features have the lowest citation rate of any group (42.8%). These are the pages with schema markup, author bios, strong internal linking, and keyword-optimized headings, but without substantive data, technical depth, or readable prose. Traditional SEO optimization without content substance actively hurts your chances with AI.

The punchline: Low SEO / High GEO pages (65.0%) perform identically to High SEO / High GEO pages (65.0%). GEO features alone are sufficient. Adding SEO optimization on top of good content adds nothing.

Among pages that already score high on SEO, adding GEO features increases the citation rate from 42.8% to 65.0%, a 22.2 percentage point improvement. That is statistically significant (p < 0.0001) and practically meaningful.
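Mechanically, the quadrant analysis is just two median thresholds and a groupby. A sketch on synthetic scores (the citation labels here are simulated, so the rates will not match the table above):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 4000
df = pd.DataFrame({
    "seo_score": rng.uniform(0, 1, n),
    "geo_score": rng.uniform(0, 1, n),
})
# Simulated labels: higher citation odds when the GEO score is above median.
df["cited"] = rng.random(n) < (0.45 + 0.2 * (df["geo_score"] > 0.5))

df["seo_half"] = np.where(df["seo_score"] > df["seo_score"].median(),
                          "High SEO", "Low SEO")
df["geo_half"] = np.where(df["geo_score"] > df["geo_score"].median(),
                          "High GEO", "Low GEO")

quadrants = df.groupby(["seo_half", "geo_half"])["cited"].agg(["size", "mean"])
quadrants.columns = ["n", "citation_rate"]
print(quadrants)
```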

๐Ÿ‹๏ธ MODEL COMPARISON

We trained three different machine learning models on balanced data (1,855 vs 1,855) to see how well each feature set predicts AI citation.

All values are 5-fold cross-validated AUC.

| Model | SEO Only | GEO Only | SEO + GEO |
|---|---|---|---|
| Logistic Regression | 0.604 | 0.654 | 0.682 |
| Random Forest | 0.813 | 0.725 | 0.816 |
| Gradient Boosting | 0.941 | 0.700 | 0.935 |

Two things jump out:

  1. GEO beats SEO on linear models. When every feature gets equal treatment (Logistic Regression), GEO features carry more signal than SEO features (0.654 vs 0.604).

  2. SEO dominates on tree models because of one feature: page speed. The Gradient Boosting model achieves 0.941 with SEO features alone because page speed is a nonlinear powerhouse. It's so dominant that adding GEO features actually introduces noise and drops performance to 0.935. (Correction: Experiment K proved this dominance was domain fingerprinting. The GBM was using speed signatures to identify high-citation domains, not learning that fast pages get cited.)

The likelihood ratio test (Logistic Regression) confirms GEO carries real independent signal: chi-squared(8) = 268.3, p < 0.0001. But the practical improvement in tree-based models is negligible.

📋 WHAT ACTUALLY MATTERS: FEATURE IMPORTANCE

Here's every feature ranked by importance in the Gradient Boosting model:

| Rank | Feature | Importance | Category |
|---|---|---|---|
| 1 | Page speed | 65.8% | SEO |
| 2 | Technical terminology | 9.9% | GEO |
| 3 | Content-to-HTML ratio | 4.3% | SEO |
| 4 | Readability | 2.6% | GEO |
| 5 | Statistics density | 2.4% | GEO |
| 6 | Heading density | 2.3% | GEO |
| 7 | Word count | 2.2% | SEO |
| 8 | Quotation density | 2.2% | GEO |
| 9 | Internal links | 2.2% | SEO |
| 10 | List formatting | 1.8% | GEO |
| 11 | Heading score | 1.6% | SEO |
| 12 | Authoritative tone | 1.3% | GEO |
| 13 | Citation density | 0.9% | GEO |
| 14 | Author attribution | 0.36% | SEO |
| 15 | Schema markup | 0.12% | SEO |

Total importance: SEO features = 76.6%, GEO features = 23.4%.

But that 76.6% is almost entirely page speed. Remove speed and the remaining SEO features contribute only 10.8%, less than GEO's 23.4%.

The two most-recommended traditional SEO tactics, author attribution and schema markup, are dead last. Combined they account for 0.48% of importance. These are features that SEO guides have pushed since 2012 and that GEO agencies carry over into their AI optimization packages. For AI citation, they are almost completely irrelevant.

The GEO features that actually matter are content-quality proxies: technical terminology (is this page written by someone who knows the subject?), readability (can a reader actually understand it?), statistics density (does it contain real data?), and heading structure (is it organized clearly?).
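Extracting an importance ranking like the one above from a fitted Gradient Boosting model is a one-liner in scikit-learn. A toy sketch with hypothetical feature names and synthetic labels:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(3)
n = 2000
# Hypothetical stand-ins for a few of the 15 features.
feature_names = ["page_speed", "tech_terms", "readability", "schema_markup"]
X = rng.normal(size=(n, len(feature_names)))
# Simulated labels driven mostly by the first feature.
y = (rng.random(n) < 1 / (1 + np.exp(-(1.5 * X[:, 0] + 0.5 * X[:, 1])))).astype(int)

gbm = GradientBoostingClassifier(random_state=0).fit(X, y)
for name, imp in sorted(zip(feature_names, gbm.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:15s} {imp:.1%}")
```

Importances sum to 1 and are relative to this model, which is exactly why a dominant feature like page speed can crowd out everything else.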

๐Ÿ˜ THE PAGE SPEED ELEPHANT IN THE ROOM

Correction: This section's hypothesis was directionally correct but undersold. Our follow-up Experiment K (n=4,350 pages, 2,510 domains) confirmed what we suspected here: speed was proxying for domain identity. But the effect is even stronger than we anticipated. Domain features alone predict at AUC=0.975, and adding domain identity to the model drops load time importance by 92.4%. The "65.8%" was not just a proxy for site quality; it was the GBM directly fingerprinting domains by their CDN speed signatures. See the correction at the top of this article for full details.

Page speed accounts for 65.8% of feature importance. That number warrants scrutiny.

At this scale, page speed is not measuring "how fast your server responds." It is measuring "is this a well-built site run by a real organization?" Sites with fast load times tend to be professionally built, well-funded, and well-maintained. Sites with slow load times tend to be WordPress blogs with heavy plugins, ad-loaded content farms, or poorly maintained legacy pages.

Speed is a proxy for domain quality and institutional investment, something closer to what the SEO industry calls "domain authority." We are not claiming that shaving 200ms off your load time will get you cited by AI. We are observing that the kind of site that loads fast is also the kind of site AI platforms choose to cite. The causal mechanism is site quality, not milliseconds.

This actually strengthens the overall finding: the single most important factor for AI citation is "be a well-built, well-resourced website." That is an infrastructure investment, not an SEO tactic or a GEO tactic.
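The "speed is a domain proxy" argument can be tested directly by comparing the pooled correlation against the within-domain correlation after demeaning by domain. This synthetic sketch builds in a domain-quality confound on purpose (all names and coefficients are made up), so the two correlations disagree the same way Experiment K found:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n_domains, pages_per = 300, 12
quality = rng.normal(size=n_domains)                  # latent "well-run site" factor
domain_speed = -0.8 * quality + rng.normal(0, 0.3, n_domains)

df = pd.DataFrame({"domain": np.repeat(np.arange(n_domains), pages_per)})
df["load_time"] = domain_speed[df["domain"]] + rng.normal(0, 0.3, len(df))
# Citation depends only on domain quality, never on page-level speed.
df["cited"] = (rng.random(len(df)) <
               1 / (1 + np.exp(-2 * quality[df["domain"]]))).astype(float)

pooled_r = df["load_time"].corr(df["cited"])

# Demean within each domain: what remains is page-level variation only.
resid = (df[["load_time", "cited"]]
         - df.groupby("domain")[["load_time", "cited"]].transform("mean"))
within_r = resid["load_time"].corr(resid["cited"])

print(f"pooled r = {pooled_r:+.2f}, within-domain r = {within_r:+.2f}")
```

The pooled correlation comes out strongly negative (fast domains get cited) while the within-domain correlation is near zero, even though speed plays no causal role in this simulation.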

๐Ÿ” WHY TRADITIONAL SEO PREDICTS AGAINST AI CITATION

This is counterintuitive, but it makes sense when you look at what traditional SEO actually optimizes for.

Pages that score high on traditional SEO tend to be marketing-optimized content. Corporate blog posts. SEO agency output. Pages built for Google's crawler, not for substantive information delivery. They have schema markup because someone checked the "add schema" box. They have author bios because E-E-A-T guidelines say to. They have lots of internal links because site architecture audits recommend it.

AI platforms don't read schema markup to decide what to cite. They don't look for author bios. They don't count internal links. They read the content. And marketing-optimized pages tend to be thin on substance. They answer the question at a surface level, padded with SEO-friendly formatting.

What AI platforms reward is substantive content: pages with real technical depth, backed by data, written clearly enough to be useful, and organized with a structure that makes information easy to extract. These are the features in the GEO bucket, and they predict citation because they proxy for the kind of content that actually answers questions well.

🤔 IS GEO A SCAM?

No. The features marketed as "GEO" carry real predictive signal. The likelihood ratio test is significant. The effect is stable across 10 random samples. Pages with high GEO scores are cited 22 percentage points more often than SEO-optimized pages without GEO features.

But GEO consulting as typically sold is a different story. If an agency's GEO package focuses on schema optimization, E-E-A-T signals, llms.txt implementation, and keyword optimization for AI, our data shows those tactics are either irrelevant or counterproductive for AI citation. Schema markup is 0.12% of importance. Author attribution is 0.36%. These are not going to move the needle.

If the consulting focuses on page speed, technical content depth, and readability, it might actually work. The problem is not the category "GEO." The problem is which specific tactics consultants implement.

The genuinely useful GEO advice (write with technical depth, include real data, structure clearly, optimize readability) is indistinguishable from "write well." It has always been good advice. What is new is that AI platforms reward it more purely than Google does, because AI doesn't need backlinks, schema, or author bios to assess quality. AI reads the content directly.

The $50,000 Question

If a business pays $50,000 for GEO consulting that focuses on schema markup, author bios, and structured data, our data predicts zero improvement in AI citation. Those features have zero or negative association with being cited.

If the same money went toward page speed optimization, hiring subject-matter experts to write technically substantive content, and improving content readability, it could produce real results. The investment is the same. The tactics determine the outcome.

✅ THE PRACTICAL SPLIT

Here is where your optimization budget should go, based on 4,375 pages of data:

1. Page speed and site infrastructure (65.8% of importance)

Be a well-built site. Fast load times, clean HTML, professional infrastructure. This is the single biggest factor and it is traditional web engineering, not SEO or GEO.

2. Content substance (23.4% of importance)

Write with technical depth. Include real data. Use clear headings. Optimize for readability. This is what the GEO industry calls "GEO." It is also what writing teachers call "good writing."

3. Everything else (10.8% of importance)

Internal links, word count, heading optimization. Traditional SEO that barely matters for AI. Do it for Google if you want, but don't expect it to help with AI citation.

What to stop doing

  • Stop optimizing schema markup for AI citation. It's 0.12% of importance.
  • Stop adding author bios for AI citation. It's 0.36% of importance, and the direction is negative (OR = 0.81).
  • Stop treating E-E-A-T signals as AI optimization. AI platforms don't read your schema or your author page. They read your content.
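The OR = 0.81 above is an odds ratio. Given a 2x2 table of counts, it can be computed with scipy; the counts below are invented for illustration, not our data.

```python
import numpy as np
from scipy.stats import fisher_exact

# Hypothetical counts: rows = has author bio (yes/no), cols = cited (yes/no).
table = np.array([[450, 550],
                  [480, 420]])
odds_ratio, p = fisher_exact(table)
print(f"OR = {odds_ratio:.2f}, p = {p:.3g}")
```

An odds ratio below 1 means the feature in the first row is associated with lower odds of citation.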

What to keep doing

  • Fix your page speed. Or more precisely: invest in being a well-built site.
  • Write substantive content. Technical terminology, real data, clear structure. Not because AI detects these features, but because substantive content is what AI platforms cite.
  • Focus on the technical features that predict citation from our prior research.

🔬 ROBUSTNESS CHECKS

Stability Across Random Samples

We resampled 10 times with different random seeds (balanced at 1,855 vs 1,855):

| Model | Mean AUC | Standard Deviation |
|---|---|---|
| SEO-only RF | 0.818 | 0.005 |
| GEO-only RF | 0.722 | 0.004 |
| SEO+GEO RF | 0.822 | 0.002 |
| GEO improvement | +0.0035 | — |

The +0.35% AUC improvement from GEO features is small but consistent across all 10 seeds. Not a sampling artifact.
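The resampling loop is straightforward to sketch. This version uses synthetic data and a bootstrap resample per seed rather than our exact balanced downsampling, so the numbers are illustrative only:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

base = np.random.default_rng(0)
n = 1200
X = base.normal(size=(n, 15))                 # stand-in for the 15 page features
y = (base.random(n) < 1 / (1 + np.exp(-1.5 * X[:, 0]))).astype(int)

aucs = []
for seed in range(10):
    rs = np.random.default_rng(seed)
    idx = rs.choice(n, size=n, replace=True)  # fresh resample per seed
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    aucs.append(cross_val_score(clf, X[idx], y[idx], cv=5,
                                scoring="roc_auc").mean())

print(f"mean AUC = {np.mean(aucs):.3f}, sd = {np.std(aucs):.3f} across 10 seeds")
```

A small standard deviation across seeds is what distinguishes a stable effect from a sampling artifact.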

The Score Partition Problem

A fair critique: the boundary between "SEO" and "GEO" is subjective. Why is heading density (H2+H3 per 1k words) a GEO feature while heading score (raw H2 count) is SEO?

We partitioned based on how features are marketed, not on their underlying nature. H2 count is in every traditional SEO audit. Heading density (per 1k words, optimized for extractability) is language specific to GEO guides. Similarly, "has schema markup" is classic SEO; "statistics density per 1k words" is Princeton GEO paper advice.

A different partition could shift the percentage split. But it wouldn't change three things: (1) speed dominates everything, (2) schema and author features are near-zero regardless of which bucket they're in, (3) content-substance features carry real signal regardless of which bucket they're in. The partition determines the label, not the signal.

Content Quality as the True Mechanism

The GEO features that work (technical terminology, readability, statistics density, heading structure) are all content-quality proxies. Pages written by subject-matter experts naturally use technical vocabulary. Pages written clearly score well on readability. Pages backed by real research contain statistics.

This means the +22.2 percentage point "GEO marginal value" is better described as a content quality effect. We labeled it "GEO" because that's how the industry markets these features, but the underlying mechanism is the same one that has made good writing effective since long before SEO existed.

This does not invalidate the finding. It reframes it. The question "Is GEO real?" becomes "Do AI platforms reward content quality beyond what Google rewards?" The answer is yes. But calling this "GEO" and charging a premium for it is like calling "write well" a new discipline.

Vertical Skew

The VPS citation data (42% of total dataset) is heavily weighted toward skincare/beauty and marketing agency queries. High-trust verticals (health, finance, legal) are underrepresented. Citation density may matter more in high-trust verticals, but sample sizes are insufficient to confirm. This is a limitation worth noting.


🎯 THE BOTTOM LINE

GEO is not a scam. But most of what is sold as GEO doesn't work.

Traditional SEO predicts against AI citation. Pages optimized for Google's crawler (schema, author bios, internal link volume) are the least likely to be cited by AI when they lack content substance.

GEO features predict for AI citation. Technical terminology, readability, statistics density, and heading structure add 22 percentage points of citation probability. This is real and stable.

But the mechanism is content quality, not GEO. The features that work are proxies for substantive, well-written content. The features that don't work (schema, author bios, structured data) are the ones agencies most commonly sell.

Page speed is 65.8% of the story. (Corrected: Experiment K proved this was domain identity fingerprinting. Domain authority, not page speed, is the real infrastructure signal. The only valid speed advice: avoid being extremely slow, above 5 seconds.)

The GEO industry is not selling snake oil. But most of what it sells is the wrong part of the shelf. The useful part (write substantively, include data, structure clearly) has always been good advice. What AI platforms have done is strip away the SEO scaffolding (backlinks, schema, E-E-A-T signals) and reward content quality more directly.

If you want AI to cite your pages, write pages worth citing. Build domain authority over time. Everything else is a rounding error.