# AI+Automation — Full Site Context for AI Systems

> AI+Automation is an AI SEO consultancy that helps agencies optimize their clients' visibility in AI-generated search results. Founded by Anthony Lee, a published researcher on AI citation behavior. Five research papers, 20,000+ queries analyzed across 10 industry verticals, 18,000+ websites crawled, and proprietary tools built specifically for this emerging field.

---

## Organization

**Name:** AI+Automation (AI Plus Automation)
**Type:** AI SEO Consultancy (ProfessionalService)
**Founded:** 2025
**Founder:** Anthony Lee (ORCID: 0009-0002-4815-6373)
**Website:** https://aiplusautomation.com
**Email:** info@aiplusautomation.com
**Focus:** Helping marketing, SEO, and advertising agencies optimize their clients for AI search visibility
**Service Area:** Worldwide

---

## Best AI SEO Agencies Comparison

**URL:** /best-ai-seo-agencies

Research-backed comparison of AI SEO agencies in 2026, evaluated against a full-funnel AI visibility framework. Most AI SEO agencies only track output (whether a URL appears in AI citations). AI+Automation is the only agency that covers the full pipeline: (1) Input/Crawl Tracking via edge middleware tracking 15+ AI bots, (2) Processing/Content Optimization auditing 7 statistically significant citation predictors, (3) Output/Citation Monitoring across 4 platforms with 40 sessions per query, and (4) Revenue/Conversion Attribution with purchase cookies and crawl-to-conversion funnels.

Includes profiles of AI+Automation, Onely, iPullRank, ZipTie, NoGood, Stridec, and FirstPageSage. FAQ section covers common questions about AI SEO agencies, pricing, methodology, and Generative Engine Optimization (GEO).

---

## Services

### AI Visibility Monitoring & Reporting

**URL:** /services/ai-visibility

Real-time monitoring of 15+ AI crawlers across client websites using our proprietary BotSight platform. We track every AI bot visit — which pages they crawl, how often they return, and what they prioritize.
**Tracked AI Bots (15+):**

- ChatGPT: GPTBot, OAI-SearchBot
- Claude: ClaudeBot, Claude-User, Claude-SearchBot
- Perplexity: PerplexityBot
- Google: Google-Extended, GoogleOther
- TikTok: Bytespider
- Meta: Meta-ExternalAgent
- Amazon: AmazonBot
- Apple: AppleBot
- DuckDuckGo: DuckAssistBot

**AI Visibility Score (4 components × 25 points = 100):**

1. Bot Diversity — How many distinct AI crawlers visit
2. Crawl Frequency — How often bots return
3. Recrawl Rate — Whether bots re-crawl previously visited pages
4. Page Coverage — Percentage of site pages discovered by AI bots

**Deliverables:** Monthly AI visibility reports, bot activity dashboards, coverage gap analysis, trend tracking, white-label reports for agency clients.

**Why GA4 misses this:** Google Analytics relies on JavaScript execution. AI bots don't execute JavaScript — they fetch raw HTML. Server-side analytics are required to see AI bot traffic. GA4 misses 90%+ of AI crawler activity.

**Limitation:** BotSight tracks crawler visits, not whether a page is actually cited in AI-generated answers. Crawling is a prerequisite for citation, but not a guarantee.

---

### AI Search Optimization Audits

**URL:** /services/ai-seo-audits

Technical GEO (Generative Engine Optimization) audits based on Experiment M (n=10,293 pages, position-band matched across 250 queries, 3 AI platforms). This is the first study to control for Google rank position, isolating page-level signals from domain identity.

**Top Actionable Citation Signals (Experiment M, ranked by importance):**

1. **First-Person Density** (8.1%) — Blog/opinion tone is the strongest negative signal. Write as authoritative reference material, not personal narrative.
2. **Word Count** (7.1%) — Cited pages are ~2,000 words (vs ~1,400 not cited) within position bands. Target substantive depth.
3. **Comparison Structure** (6.4%) — "vs", comparison tables, side-by-side analyses. Highest-impact single content change.
4. **H3 Subheading Depth** (5.9%) — Cited pages have 2x the H3 count (median 9-10 vs 4-5). Each H3 is an extraction point.
5. **Primary Source Score** (4.1%) — Original data producers are cited more than content aggregators.
6. **Heading Density** (3.9%) — Well-organized content with a deep H2/H3 hierarchy.
7. **Query Term Position** (3.8%) — Answer the query in the first 1-2% of the page, not buried.
8. **Internal Links** (3.7%) — More internal links help (positive direction within the SERP).
9. **Query Term Coverage** (3.7%) — Cover all the query's key terms explicitly. Cited pages have 100% median coverage.
10. **Content-to-HTML Ratio** (3.4%) — Modest signal. Leaner code still helps.
11. **Statistics Density** (2.5%) — Pages cited by 3+ platforms have 7x the stats density of uncited pages.
12. **FAQ Schema** — The only schema type consistently significant across all 4 position bands.

**Key finding:** Content features (AUC=0.631) beat domain identity (AUC=0.583) when comparing equally-ranked pages. This reverses prior work where domain dominated.

**What does NOT matter within position bands:** Load time (p > 0.39 in all bands), author attribution (inconsistent, 0.2% importance), general schema presence (marginal).

**Platform Architecture Analysis (The 2-vs-2 Divide):** AI platforms are NOT all the same. They have fundamentally different retrieval architectures:

- **Fetching platforms (ChatGPT, Claude):** These platforms send bots to fetch your page in real time when generating answers. Page content at the moment of the query matters. Technical optimization (fast load, clean HTML, accessible content) directly impacts what these platforms see.
- **Index-only platforms (Perplexity, Gemini):** These platforms rely on pre-built indices (Perplexity uses its own web index influenced by Google ranking signals; Gemini uses Google's index). Traditional SEO factors matter more for these platforms.
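A simplified scorer over a handful of these signals might look like the following. The thresholds mirror the medians reported above; the function name, regexes, and return shape are illustrative simplifications, not the production audit model:

```python
import re

def audit_page(text: str, html: str, query_terms: list[str]) -> dict:
    """Toy check of a page against a few Experiment M signals."""
    words = len(text.split())
    h3_count = len(re.findall(r"<h3\b", html, flags=re.I))
    # Comparison structure: "vs"/"vs." mentions or any HTML table.
    has_comparison = bool(re.search(r"\bvs\.?\b|<table", text + html, flags=re.I))
    lowered = text.lower()
    coverage = sum(t.lower() in lowered for t in query_terms) / max(len(query_terms), 1)
    return {
        "word_count_ok": words >= 2000,      # cited median ~2,000 words
        "h3_depth_ok": h3_count >= 9,        # cited median 9-10 H3s
        "comparison_structure": has_comparison,
        "query_term_coverage": coverage,     # cited median = 100%
    }
```

A real audit would weight each signal by its reported importance rather than apply hard pass/fail cutoffs.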
**Audit deliverables:** Page-by-page scoring against all Experiment M signals, platform-specific recommendations, prioritized fix list, competitive benchmark comparison.

**Limitation:** The full Experiment M model achieves AUC=0.753 (position + content). AI citation has inherent stochasticity: the same query can produce different citations across sessions. Our audits optimize for probability, not certainty.

---

### GEO Content Strategy

**URL:** /services/content-strategy

Content optimization for AI citation grounded in empirical intent distribution research. We don't guess what AI wants — we've measured it across 19,556 queries in 8 verticals.

**Intent Distribution by Vertical:**

| Intent Type | Agency/Law | SaaS | E-commerce |
|---|---|---|---|
| Informational | 22% | 87.3% | 34% |
| Discovery | 64-68% | 5.2% | 28% |
| Validation | 8% | 3.8% | 18% |
| Comparison | 4% | 2.5% | 15% |
| Review-Seeking | 2% | 1.2% | 5% |

**Key insight for agencies:** Agency/Law verticals are 64-68% Discovery intent — users searching for providers, not information. Content must be structured for recommendation queries ("best AI SEO agency," "who can help with AI visibility").

**Content Structure Patterns That Get Cited:**

1. **Depth calibration:** Within position bands, cited pages have a median of ~2,000 words vs ~1,400 not cited (+42-52%). Target ~2,000 words of substantive content.
2. **Front-loading:** 44.2% of ChatGPT citations come from the first 30% of content. Lead with the answer.
3. **Heading hierarchy:** Clear H1 → H2 → H3 structure. AI uses headings to parse content sections.
4. **Answer capsules:** 20-25 word direct answers near the top of the page.
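The front-loading and answer-capsule patterns above lend themselves to a quick automated check. A minimal sketch, where the 30% cutoff follows the ChatGPT citation statistic and the function name and return shape are hypothetical:

```python
def front_loading_report(text: str, query_terms: list[str]) -> dict:
    """Does the page answer the query within its opening section?"""
    cutoff = int(len(text) * 0.30)          # first 30% of content
    head = text[:cutoff].lower()
    hits_in_head = [t for t in query_terms if t.lower() in head]
    # Treat the first sentence as the candidate answer capsule.
    first_sentence = text.split(".")[0]
    capsule_len = len(first_sentence.split())
    return {
        "terms_in_first_30pct": len(hits_in_head) / max(len(query_terms), 1),
        "opening_capsule_words": capsule_len,
        "capsule_in_range": 20 <= capsule_len <= 25,  # 20-25 word capsule
    }
```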
**Platform-Specific Tone Modifiers (Claude):**

- Pure marketing copy: 0.8x citation penalty
- Limitations/considerations sections: 1.7x citation boost
- Balanced comparisons: 1.5x citation boost
- Methodology sections: Positive signal

**Deliverables:** Content audit with AI citation readiness scores, query intent maps for the client vertical, content templates optimized for each AI platform, editorial calendar with priority topics.

**Limitation:** Intent distributions vary by specific niche within verticals. Our data covers 8 broad verticals — sub-niche patterns may differ. We calibrate for each engagement.

---

### AI Competitive Intelligence

**URL:** /services/competitive-intel

Multi-platform competitive analysis using proprietary query generation and scraping infrastructure.

**Three-Step Process:**

1. **Query Generation:** Our query generator transforms Google Search Console and Bing Webmaster Tools data into realistic AI-native audit prompts. Each keyword is classified by intent (SERVICE, INFORMATIONAL, BRANDED, NAVIGATIONAL, IRRELEVANT), enriched with modifiers, and produces 3+ prompt variations with quality scoring. These are not generic keyword lists — they mirror how real users ask AI platforms.
2. **Multi-Platform Scraping:** Each query is run across ChatGPT, Claude, Perplexity, and Google AI Mode with 10 separate browser sessions per platform (40 total sessions per query). The scraper uses SSE stream interception for Claude/Perplexity and DOM extraction for ChatGPT/Google AI Mode. It also captures fan-out queries — the web searches AI platforms trigger internally during response generation. We identify the top 3 most consistently cited URLs across all sessions.
3. **Deep Analysis:** For each consistently cited URL, dedicated audit scripts perform:
   - Technical GEO analysis (all 7 citation predictors)
   - Content analysis (Flesch-Kincaid readability, word count, structure)
   - Psychological pattern analysis (tone, authority signals, trust markers)
   - Platform-specific citation frequency
   - Fan-out query intelligence (what the AI researched internally)

**Deliverables:** Competitive landscape report, top cited competitors per query cluster, gap analysis (what competitors do that the client doesn't), fan-out query data, actionable recommendations ranked by impact.

**Limitation:** AI citation has inherent session-to-session variance. Running 10 sessions per platform controls for this, but some variance remains. We report consistency scores alongside citation data.

---

### Behavioral Economics for AI Citation

**URL:** /services/behavioral-economics

Applies 101 behavioral economics principles compiled from 22 foundational books (Kahneman, Cialdini, Ariely, Thaler, Voss, Sutherland, and others) to AI citation intelligence. Two deliverables:

**D7: Behavioral Citation Analysis** — Takes the top 8 most-cited competitor URLs and analyzes each through a behavioral economics lens. Semantic retrieval (e5-small-v2, 384-dim embeddings) matches page content against the pre-embedded 101-principle library, retrieving the 10-15 most relevant principles per page. Output includes behavioral scores (0-100), detected principles with evidence, missing principles framed as opportunities, and prioritized content change recommendations.

**D8: Nudge Audit** — Auto-discovers up to 15 client pages (via sitemap, nav extraction, and common path probing), classifies them by type (homepage, pricing, product, signup, etc.), and audits each against behavioral principles using hybrid matching (category filtering + semantic search).
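The retrieval step behind D7 and D8 (rank library principles by embedding similarity to the page) can be sketched as follows. The production system uses e5-small-v2 embeddings; a toy bag-of-words vector substitutes here so the sketch stays self-contained, and the two principle entries are invented examples, not items from the real 101-principle library:

```python
import math
from collections import Counter

# Hypothetical mini-library; real entries carry detection heuristics,
# intervention templates, and boundary conditions.
PRINCIPLES = {
    "social proof": "people follow what many others visibly do reviews testimonials",
    "loss aversion": "losses loom larger than gains frame what users stand to lose",
}

def embed(text: str) -> Counter:
    """Bag-of-words stand-in for a dense embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_principles(page_text: str, k: int = 2) -> list[tuple[str, float]]:
    """Rank library principles by similarity to the page content."""
    page_vec = embed(page_text)
    scored = [(name, cosine(page_vec, embed(desc))) for name, desc in PRINCIPLES.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]
```

With dense embeddings in place of the word counts, the same ranking loop retrieves the 10-15 most relevant principles per page.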
Output includes nudge scores, principles present/missing/misapplied with effectiveness ratings, specific before/after rewrite suggestions, and site-wide quick wins.

The 101-principle library covers 21 categories including choice architecture, cognitive bias, persuasion, pricing psychology, framing, social influence, and neuroeconomics. Each principle includes detection heuristics, intervention templates, boundary conditions (works when/fails when), and evidence strength.

---

## Proprietary Technology Stack

**URL:** /toolkit

### BotSight

Desktop application (React + Tauri/Rust) that monitors 15 AI crawlers (12 direct + 2 indirect) in real time via edge middleware deployment. Tracks crawl frequency, page coverage, recrawl behavior, and generates AI Visibility Scores (4 × 25 pts + week-over-week growth). Integrates with Google Search Console, Bing Webmaster Tools, and GA4 for per-page search correlation analysis (identifying pages with bot activity but no search traffic, and vice versa). Generates styled PDF reports and Excel workbooks for client delivery. Exposes 17 MCP tools for AI agent integration. Also detects security threats (vulnerability scans, credential probing). All data stored locally in SQLite.

### AI Query Predictor

Industry-agnostic system that transforms Google Search Console and Bing Webmaster Tools keywords into realistic AI-native audit prompts. Auto-derives user context and industry from the keywords themselves — works for any vertical, B2B or B2C, with no manual configuration. Classifies each keyword by intent (SERVICE, INFORMATIONAL, BRANDED, NAVIGATIONAL, IRRELEVANT). Produces 3+ prompt variations per keyword with quality scoring. Trained on 97K ESCI queries and 341 verified chatbot queries.

### Multi-Platform Scraper (CitationScraper)

Autonomous Python + Patchright system scraping ChatGPT, Claude, Perplexity, and Google AI Mode. Runs 10 separate browser sessions per platform per query (40 total) to control for citation variance.
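Aggregating citations across those repeated sessions into a per-URL consistency score can be sketched as below; the function names and return shapes are illustrative, not the scraper's actual API:

```python
from collections import Counter

def citation_consistency(sessions: list[list[str]]) -> dict[str, float]:
    """Share of sessions in which each URL was cited.

    `sessions` holds one list of cited URLs per scraper session;
    set() deduplicates repeat citations within a single session.
    """
    counts = Counter(url for cited in sessions for url in set(cited))
    return {url: n / len(sessions) for url, n in counts.items()}

def top_consistent(sessions: list[list[str]], k: int = 3) -> list[str]:
    """The k most consistently cited URLs across all sessions."""
    scores = citation_consistency(sessions)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Reporting the consistency score alongside the citation itself is what separates a stable result from session-to-session noise.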
Uses SSE stream interception for Claude/Perplexity and DOM extraction for ChatGPT/Google. 3-tier session management: session restore (~5s), re-auth (~30-60s), signup (~2-4min). Account pool management with per-account daily limits and search window rate limiting. Captures fan-out queries (the web searches AI platforms trigger internally). Dedicated audit scripts for technical GEO, readability, and psychological pattern analysis of cited pages.

### GEO Knowledge Base

Dual-layer retrieval system: 341 optimization chunks with e5-small-v2 embeddings (384-dim) in pgvector + 3,555 entity-relationship triplets in a KuzuDB graph database. 6 knowledge domains: ChatGPT Optimization, Claude Optimization, Perplexity Optimization, Google AI Mode Optimization, SEO Best Practices, Technical GEO Standards. 4 search modes: hybrid (70% vector + 30% text), vector, full-text, graph. Exposed as an MCP server with 4 search tools.

---

## Open Source Projects

**URL:** /open-source

### Orunla

GitHub: https://github.com/anthonylee991/orunla

Persistent memory system for AI agents. Written in Rust. MIT License.

- Knowledge graph memory with automatic fact detection
- Natural memory decay (Ebbinghaus curve)
- Works with Claude Code, Cursor, and MCP-compatible tools
- 100% local, private storage
- SQLite + FTS5 for instant search

### CGC (Context Graph Connector)

GitHub: https://github.com/anthonylee991/cgc

Data connector for AI context windows. Written in Python. MIT License.
- Connects to PostgreSQL, MySQL, SQLite, plus PDF/Word/Excel/CSV/JSON/Markdown
- Supports AI knowledge bases (Qdrant, Pinecone, pgvector, MongoDB Atlas)
- Intelligent document chunking and data mapping
- MCP server for Claude Desktop and Cursor
- Maps data structure first, then fetches only what AI needs

---

## Published Research

### Paper 1: How AI Platforms Search (v1.3)

**URL:** /research/how-ai-platforms-search
**DOI:** 10.5281/zenodo.19554329
**Scope:** 1,323 fan-out queries from 540 parent queries across ChatGPT, Gemini, Perplexity. 10 verticals, 5 intent types. Three-part replicate analysis adds cross-platform consistency data (Gemini API, ChatGPT API, and production browser-captured Perplexity + ChatGPT UI, n=1,665 non-echo fan-outs).

**Key findings:**

- Two-layer retrieval model: Layer 1 (search decision) is highly deterministic (Gemini 98.9%, ChatGPT gpt-5.4-mini 91.7%). Layer 2 (query strings) varies by platform: mean Jaccard 0.16-0.21 with 45-73% zero overlap between runs, but the top-5 canonical strings cover 59-76% of all fan-out events (ChatGPT UI 76%, Gemini 69%, Perplexity 59%). Fan-out TYPES stay stable at 55-65% across platforms.
- Intent drives fan-out composition (χ²=299.6, p<0.001): discovery triggers 3.3x the entity injection of informational queries
- ChatGPT injects entities on 32% of fan-outs, Gemini explores broadly (27% expansion), Perplexity seeks evidence (21%)
- ChatGPT gpt-5.4 searches only 29% of the time; smaller models search 100%
- Situation-first phrasing produces different fan-out distributions (V=0.35)
- No vertical effect detected (H=6.26, p=0.71, underpowered)
- Deprecated: the earlier "98% zero overlap" claim applied only to gpt-5.4-mini via the OpenAI API. Cross-platform data shows 45-73% zero overlap with meaningful canonical cores.

### Paper 2: I Rank on Page 1 — What Gets Me Cited by AI?
**URL:** /research/what-gets-me-cited-by-ai
**DOI:** aixiv.260403.000002 (dataset: 10.5281/zenodo.19398158)
**Scope:** 10,293 pages across 250 queries (5 intent types × 10 verticals), position-band matched

**Key findings:**

- Position-band matching isolates page-level effects by comparing equally-ranked pages
- Content features (AUC=0.673) match domain identity (AUC=0.687) within the SERP
- Top actionable predictors: comparison structure (d=0.43), query-term coverage (d=0.42), H3 subheadings (5.9% importance), word count ~2,000 (7.1%), primary source score (4.1%)
- First-person/blog tone is the strongest negative predictor (8.1% importance)
- Content structure provides the largest marginal lift beyond rank position (+0.021 AUC)
- FAQ schema is significant across all four position bands
- Load time is not significant in any position band (p > 0.39)
- Pages cited by 3+ platforms have 7x the statistics density of uncited pages
- SERP co-occurrence is the strongest domain trust signal (ρ=0.341, p=2.6×10⁻⁷⁰)
- Domains ranking for 4+ queries have 87%+ citation rates
- Combined domain model AUC=0.921, with SERP presence accounting for 63% of importance

### Paper 3: Query Intent, Not Google Rank

**URL:** /research/query-intent-ai-citation
**Scope:** 19,556 queries across 8 industry verticals

**Key findings:**

- Google rank does not predict AI citation (ρ = -0.02 to 0.11, all non-significant)
- Query intent is the primary driver of citation behavior
- 7 page-level features predict citation with statistical significance
- ChatGPT and Claude fetch pages live; Perplexity and Gemini use pre-built indices
- Platform architecture determines which optimization strategies work

### Paper 4: Reddit Training Data Influence

**URL:** /research/reddit-training-data-influence

**Key findings:**

- Reddit receives zero AI citations through API access
- Reddit receives 17-44% citations through web UI access
- Access-channel divergence: the same platform cites different sources depending on API vs web UI access
- Shadow corpus effect: training data influences recommendations without explicit citation
- Community consensus on Reddit correlates with AI brand recommendations (ρ = 0.67-0.82)

### Paper 5: Cognitive Redistribution

**URL:** /research/cognitive-redistribution

**Key findings:**

- Studies claiming AI causes cognitive decline have significant methodological limitations
- Cognitive redistribution (reallocating cognitive resources) better explains the observed effects than cognitive decline
- Extended Mind Thesis framework: AI tools as cognitive extensions, not replacements
- Reduced cognitive effort ≠ cognitive impairment

---

## Key Research Statistics

- 20,000+ total queries analyzed
- 10 industry verticals covered
- 5 published research papers
- 18,000+ websites crawled and analyzed
- 33,000+ AI citations collected
- 15+ AI bots tracked
- 4 AI platforms analyzed (ChatGPT, Claude, Perplexity, Google AI Mode)
- 1.4-6.8% citation overlap between ChatGPT and Perplexity
- 44.2% of ChatGPT citations from the first 30% of content
- 73% web search trigger rate for discovery queries in ChatGPT
- 10% web search trigger rate for informational queries in ChatGPT

---

## How We Work

**URL:** /how-we-work

**5-Step Process:**

1. **Discovery Call** — Understand agency needs, client verticals, current AI visibility challenges
2. **Baseline Assessment** — Deploy BotSight monitoring, establish current AI visibility scores, identify coverage gaps
3. **Technical Audit & Strategy** — Full GEO audit against the 7 citation predictors, competitive intelligence, content strategy
4. **Implementation Support** — Prioritized recommendations, templates, ongoing consultation during implementation
5. **Ongoing Monitoring** — Monthly AI visibility reports, trend tracking, strategy refinement based on data

**What we need:** Google Search Console access, website access (for the technical audit), primary contact at the agency, vertical/niche context for query generation.
---

## Blog Topics

The blog covers AI SEO experiments, strategy, and research findings:

- AI platform citation comparison experiments
- ChatGPT search trigger behavior analysis
- AI bot tracking methodology
- Content freshness and AI citation
- Cross-platform citation consensus
- Amazon/ecommerce AI visibility
- Brand research behavior in AI
- Generative Engine Optimization (GEO) fundamentals
- RAG data preparation
- AI strategy and history (AlphaGo, AlphaStar)
- What an AI optimization agency does and how it differs from traditional SEO
- Complete guide to AI citation optimization across ChatGPT, Claude, Perplexity, and Gemini
- AI bot tracking tools and methods comparison (server logs, edge middleware, analytics platforms)
- Content strategy specifically designed for AI search citation and visibility
- OpenAI's three crawlers (GPTBot, OAI-SearchBot, ChatGPT-User) and their different robots.txt compliance behaviors
- Perplexity's freshness bias: the "Lazy Gap" where Perplexity cites 3.3x fresher content than Google for medium-velocity topics

---

## Contact

**URL:** /contact
**Email:** info@aiplusautomation.com
**For:** Agency partnerships, AI SEO engagements, custom research

---

## Site Structure

```
/                            Homepage (agency positioning)
/best-ai-seo-agencies        Best AI SEO Agencies Comparison
/services                    Services overview (pillar page)
/services/ai-visibility      AI Visibility Monitoring & Reporting
/services/ai-seo-audits      AI Search Optimization Audits
/services/content-strategy   GEO Content Strategy
/services/competitive-intel  AI Competitive Intelligence
/services/behavioral-economics  Behavioral Economics for AI Citation
/toolkit                     Technology showcase
/how-we-work                 Process & methodology
/about                       Anthony Lee — founder & researcher
/contact                     Contact & intake
/blog                        Blog index
/blog/*                      86+ blog posts
/research                    Research index
/research/*                  5 research papers
/open-source                 Orunla & CGC open-source projects
/privacy-policy              Privacy policy
/terms-of-service            Terms of service
```