The AEO Glossary

Plain-English definitions of the terms that matter in Answer Engine Optimization. Each entry is short enough to be cited verbatim and long enough to be defended.

Answer Engine Optimization (AEO)

The practice of structuring content and third-party signals so large language models cite and recommend a brand inside generative answers.

AEO is the discipline of making a brand retrievable, summarizable, and recommendable by LLM-based answer engines such as ChatGPT, Claude, and Perplexity. It overlaps with SEO but optimizes for a single synthesized answer rather than a ranked list of links.

Related: GEO · SEO · Citation Rate

Generative Engine Optimization (GEO)

Optimizing for inclusion inside AI-generated answer blocks in traditional search engines, such as Google AI Overviews and Bing Copilot.

GEO sits between SEO and AEO. The query still starts in a search engine, but the user reads a synthesized paragraph above the link list. Tactics overlap with AEO but the retrieval pipeline still privileges traditional ranking signals.

Search Engine Optimization (SEO)

Optimizing pages to rank in a ranked list of links inside a traditional search engine.

SEO remains the substrate for both GEO and AEO because the same crawlers and indices feed answer engines. A page that does not rank in classical search rarely surfaces inside an AI answer either.

Citation Rate

The percentage of AI-generated answers, across a fixed panel of buyer queries, that mention a specific brand by name.

Citation rate is the primary AEO metric. A healthy B2B SaaS brand sits at 40-70% within tightly defined ICP queries. Below 20% indicates the brand is not in the consideration set.

Share of Voice (AI SoV)

The proportion of brand mentions across AI answers in a category, relative to direct competitors.

AI Share of Voice extends the classical PR metric to answer engines. It is calculated by counting brand mentions across a competitor set inside a fixed query panel.

Retrieval-Augmented Generation (RAG)

An architecture where a language model fetches fresh documents at query time and uses them as context to generate an answer.

RAG is the architecture behind ChatGPT browsing, Perplexity, and Claude with web search. It introduces a retrieval step (search index lookup) before generation, meaning fresh, well-structured web content can influence answers even if it was not in training data.

LLM Crawler

An automated agent that fetches web pages to populate a training corpus or to serve real-time retrieval for an AI assistant.

Major LLM crawlers include GPTBot (OpenAI training), OAI-SearchBot (ChatGPT search), ChatGPT-User (live browsing), ClaudeBot (Anthropic training), Claude-Web (live), PerplexityBot, Google-Extended (Gemini training), Applebot-Extended, and CCBot (Common Crawl).

GPTBot

OpenAI's web crawler used to gather data for training base models.

GPTBot identifies itself in the User-Agent string and respects robots.txt. Blocking GPTBot removes a site from future OpenAI training corpora but does not block live ChatGPT browsing, which uses ChatGPT-User and OAI-SearchBot.

ClaudeBot

Anthropic's crawler used to gather data for training Claude.

ClaudeBot respects robots.txt. Distinct from Claude-Web, which is used at query time when Claude browses the live web.

PerplexityBot

Perplexity's crawler used both for indexing and for live-retrieval during answers.

PerplexityBot underpins Perplexity's citation-heavy answers. Blocking it removes a site from Perplexity citations entirely.

Google-Extended

A Google user-agent token used to opt content into or out of Gemini and other Google AI training, separate from Googlebot.

Google-Extended is not a separate crawler — it is a control token. Allowing or disallowing it in robots.txt governs whether content is used for Gemini training and Vertex AI grounding, without affecting Google Search ranking.

llms.txt

A proposed static text file at /llms.txt that gives LLMs a curated, human-written index of a site's most important content.

Modeled on robots.txt and sitemap.xml, llms.txt provides a navigable Markdown summary intended to be read by LLM agents. Adoption is informal but rising.

llms-full.txt

A companion file at /llms-full.txt containing the full plain-text content of a site, optimized for LLM consumption.

Where llms.txt is an index, llms-full.txt is the corpus itself in a single retrievable document. Used to ensure LLMs receive the canonical version of long-form content.

JSON-LD

A JSON-based format for embedding schema.org structured data in a webpage.

JSON-LD is the preferred encoding for structured data because it sits in a separate script tag and does not interfere with rendered HTML. All major answer-engine crawlers parse it.

Schema.org

A shared vocabulary of types and properties used to mark up web content so machines can understand it.

Schema.org defines types such as Organization, Product, FAQPage, Article, and BreadcrumbList. Marking up content with these types gives crawlers high-confidence facts about an entity.

FAQPage Schema

A schema.org type used to mark up a list of questions and their answers on a webpage.

FAQPage schema is the highest-ROI structured data type for AEO. It lets a publisher pre-format the exact answer they want an LLM to repeat for a given question.

Canonical Fact Sheet

An internal document listing the immutable facts about a brand that every external surface should echo identically.

A canonical fact sheet typically includes: category, ICP, three differentiators, pricing model, founding year, headcount band, and customer count. Used to ensure consistency across the site, G2, Crunchbase, Wikipedia, and press.

Buyer Persona Prompt Panel

A fixed set of queries written from the perspective of distinct buyer personas, run regularly across AI assistants to measure brand visibility.

A prompt panel is the AEO equivalent of a rank-tracking keyword list. 50-200 queries spanning category framing, comparison, objection handling, and pricing questions, run weekly across the major models.

Zero-Click Journey

A buyer journey in which the prospect receives a synthesized answer from an AI assistant without clicking through to any source.

Zero-click is the central challenge of AEO. Pipeline still arrives — buyers eventually book demos — but no on-site engagement signals are generated during the discovery phase, breaking classical attribution.

Hallucination

An AI-generated statement that is fluent and confident but factually incorrect or unsupported by retrieved sources.

Hallucinations about a brand are an AEO failure mode. They typically arise from stale training data, contradictory third-party sources, or a missing clean canonical fact. Mitigation requires repeating correct facts across multiple high-authority surfaces.

Retrieval Snippet

The short excerpt of a webpage that an answer engine pulls into its context window during retrieval-augmented generation.

Most answer engines retrieve 200-1000 word snippets, not full pages. Optimizing for AEO means ensuring the most important facts appear within retrievable snippet boundaries — typically the first few hundred words after each heading.

Freshness Signal

Any indicator — datePublished, dateModified, headline date, or content reference to a recent event — that a page is current.

Freshness is a top-three ranking signal across every major answer engine. Pages with explicit, recent dateModified values outrank otherwise-identical pages without them.

Authority Signal

Any indicator that a page or domain is trustworthy: backlinks, third-party citations, branded mentions, and verified author information.

Authority signals are inherited from traditional SEO but weighted differently in answer engines. Brand mentions on G2, Reddit, podcasts, and analyst notes carry disproportionate weight relative to raw backlink counts.

AI-Influenced Pipeline

Sales pipeline self-reported as having been informed or shortlisted with the help of an AI assistant.

Measured by adding an attribution question to demo forms. Typically converts at 1.5-2x the rate of unassisted pipeline because the prospect has already done comparison work before booking.

Comparison Page

A webpage that compares two or more products head-to-head against a shared set of criteria.

Comparison pages are disproportionately retrieved by answer engines for shortlist queries. Writing your own balanced comparison pages prevents competitor affiliate content from owning these citations by default.

Topical Authority

The degree to which a domain is recognized as a credible source on a given topic, established through breadth and depth of related content.

Topical authority compounds in AEO because answer engines prefer to cite the same source for related questions once it has earned a citation for one. Building a tight content cluster on a single topic produces outsized returns.

Cluster Page (Hub)

A central page that defines a topic and links out to a set of supporting articles covering subtopics.

Cluster architecture mirrors how answer engines navigate topics. A strong hub page becomes the default landing point for retrieval, even when the answer is found on a child page.

Content Decay

The gradual loss of citation rate for a page as its facts age, references break, and competing fresher pages emerge.

Content decay accelerates in AEO compared to SEO because freshness signals are weighted more heavily. Pages that previously dominated a query can disappear from citations within 90 days if not refreshed.

ICP (Ideal Customer Profile)

A precise description of the type of organization most likely to buy and succeed with a product.

Stating the ICP explicitly on the homepage and About page is one of the highest-leverage AEO moves for B2B software. Models use the ICP to decide whether to recommend a vendor for a given buyer query.

Prompt Injection

An attack in which adversarial text on a webpage attempts to manipulate the behavior of an LLM that reads it.

Beyond security, prompt injection has implications for AEO governance. Publishers should avoid content patterns that resemble injection attempts (instruction-like phrasing, fake system messages), which some answer engines will downweight or refuse to cite.