
AI search glossary.

The terms behind generative engine optimisation, written plainly so a small business owner can use them in conversation. Definitions cover the models, the schema types, and the technical bits that decide whether ChatGPT, Perplexity, Claude or Google AI Overviews cite your site. Updated 20 April 2026.

A B C D E F G H I J K L N O P R S T V W

A

AEO (Answer Engine Optimisation)

AEO is the practice of structuring content so an answer engine, the part of an AI tool that returns a single response, picks your page as the source. It overlaps with traditional SEO but prioritises clear question and answer pairs, schema markup, and machine-readable facts over keyword density and link volume.

See also: GEO, FAQPage schema, GEO explained.

AI Overviews

AI Overviews are the generated answers Google places above the blue links on many search results pages. They quote sources from the regular Google index and link out to those pages. For small businesses, being cited inside an AI Overview is now a separate visibility goal from ranking on the list below.

See also: Google AI Overviews, Google AI Overviews for small business.

Agent (AI agent)

An AI agent is a model wrapped in tools, so it can browse the web, fill forms, run code, or call APIs on its own. Agents matter for SEO because they read pages directly, follow links, and complete tasks without a human reading the page, so structured data and clean navigation count more than visual design.

See also: Crawler, API.

Anthropic

Anthropic is the AI research company behind Claude. Founded in 2021 by former OpenAI staff, it sells Claude through a chat product, an API, and integrations such as Claude in Chrome. For citation work, Anthropic matters because Claude uses its own crawler, ClaudeBot, and its own ranking choices.

See also: Claude, How to get cited by Claude.

API

An API, application programming interface, is the machine entry point to a service. AI models expose APIs so other software can send a prompt and receive a response without a chat window. For SEO, APIs matter because retrieval tools, agents, and analytics scripts use them to fetch and reuse your content programmatically.

See also: Agent, RAG.

Authority signals

Authority signals are the cues a model uses to decide whether to trust a source. They include domain age, third-party citations, named author bios, consistent business data across the web, and presence in editorial publications. AI models combine many weak signals rather than relying on any single one.

See also: E-E-A-T, Source diversity.

B

Bing Copilot

Bing Copilot is Microsoft's AI-powered search experience, built on the Bing index and OpenAI models. The Bing index also feeds ChatGPT search results. To appear in either, your site must be indexed in Bing, which means submitting a sitemap to Bing Webmaster Tools, not only to Google Search Console.

See also: ChatGPT Search, Sitemap.xml.

BlogPosting schema

BlogPosting is a schema.org type that marks up a blog post with author, headline, date published, date modified, and image. It is a sub-type of Article. Adding BlogPosting JSON-LD to your posts gives AI models clean metadata, which raises the chance of an accurate citation.

See also: Schema.org, JSON-LD, Schema markup for AI search.

C

Citation (AI citation)

An AI citation is the moment a model names your business or links to your page inside a generated answer. It is the AI search equivalent of ranking on Google. Citations may appear as inline numbers, footnotes, or named recommendations, depending on the model and the query.

See also: Citation rate, Why citations are not moving.

Citation rate

Citation rate is the percentage of test prompts in which an AI model cites your business by name or links to your URL. Most audits use a fixed prompt set, run it across ChatGPT, Perplexity, Claude and Google AI Overviews, then track the rate over time as the core visibility metric.
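
The arithmetic is simple enough to sketch. The example below is a minimal illustration, assuming you have already recorded by hand whether each test prompt produced a citation; the function name and the audit data are made up for the example.

```python
def citation_rate(results):
    """Return the percentage of test prompts that produced a citation.

    `results` holds one boolean per prompt: True if the model named the
    business or linked to its URL, False otherwise.
    """
    if not results:
        raise ValueError("need at least one prompt result")
    return 100 * sum(results) / len(results)


# Hypothetical audit: the same 10 prompts run against two models.
audit = {
    "ChatGPT": [True, False, False, True, False, False, False, True, False, False],
    "Perplexity": [True, True, False, True, False, False, True, False, False, False],
}

rates = {model: citation_rate(hits) for model, hits in audit.items()}
print(rates)
```

Keeping the prompt set fixed between runs is what makes the rate comparable over time.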

See also: Prompt, Citation.

Claude

Claude is the AI assistant built by Anthropic. It includes web search, can read uploaded files, and runs from chat, API, and the Claude in Chrome extension. Claude tends to cite a narrower set of sources than ChatGPT, so consistent entity data and editorial mentions matter for being named.

See also: Anthropic, How to get cited by Claude.

Crawler

A crawler is an automated program that fetches web pages so a search engine or AI model can index them. Major AI crawlers include GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, and Googlebot. If you block them in robots.txt or with a firewall, they cannot fetch your pages, so your content is unlikely to be cited.
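
One way to check a robots.txt file against the crawlers named above is Python's standard robots.txt parser. This is a rough sketch: the robots.txt content and the URL are invented for illustration, and a real check would read the file from your own site. It does not detect firewall blocks.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt: blocks GPTBot everywhere, allows every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Googlebot"]


def crawler_access(robots_txt, url):
    """Return {crawler name: True/False} for whether each may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}


access = crawler_access(ROBOTS_TXT, "https://example.com/services")
for bot, allowed in access.items():
    print(bot, "allowed" if allowed else "blocked")
```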

See also: Agent, Sitemap.xml.

D

DefinedTerm

DefinedTerm is a schema.org type for a single defined word or phrase. Used on glossary pages, it lets AI models extract a clean term and definition pair. A DefinedTermSet groups many DefinedTerm objects together, which is how this glossary marks itself up.
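
As an illustration, a glossary could build its DefinedTermSet markup along these lines. This is a sketch, not the markup this page actually uses; the function name, the URL, and the sample terms are assumptions.

```python
import json


def defined_term_set(name, url, terms):
    """Build a schema.org DefinedTermSet from a {term: definition} dict."""
    return {
        "@context": "https://schema.org",
        "@type": "DefinedTermSet",
        "name": name,
        "url": url,
        "hasDefinedTerm": [
            {"@type": "DefinedTerm", "name": term, "description": definition}
            for term, definition in terms.items()
        ],
    }


# Hypothetical glossary with two entries.
glossary = defined_term_set(
    "AI search glossary",
    "https://example.com/glossary",
    {
        "NAP": "Name, Address, Phone.",
        "RAG": "Retrieval augmented generation.",
    },
)
print(json.dumps(glossary, indent=2))
```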

See also: Schema.org, JSON-LD.

Direct answer paragraph

A direct answer paragraph is a short block of prose, usually 40 to 60 words, that answers a question in full at the start of a section. AI models lift these paragraphs verbatim into generated answers, so writing one under each H2 raises the chance of being quoted.

See also: FAQPage schema, Fan-out.

E

E-E-A-T

E-E-A-T stands for Experience, Expertise, Authoritativeness, and Trustworthiness. It is Google's framework for judging content quality, used by human raters and reflected in algorithm signals. AI models that draw on Google's index inherit this bias, so author bios, real credentials, and named sources help citations.

See also: Authority signals, Entity.

Embedding

An embedding is a numeric representation of a piece of text, captured as a long list of numbers. Models use embeddings to compare meaning, so a query about kitchen fitters can match a page about bespoke joinery even with no shared words. Embeddings sit behind retrieval, search, and recommendation.
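
A common way to compare two embeddings is cosine similarity. The sketch below uses made-up three-number vectors purely to show the idea; real embeddings come from a model and have hundreds or thousands of numbers.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy embeddings, not output from any real model.
kitchen_fitters = [0.9, 0.8, 0.1]   # the query
bespoke_joinery = [0.8, 0.9, 0.2]   # a related page
vegan_bakery = [0.1, 0.2, 0.9]      # an unrelated page

related = cosine_similarity(kitchen_fitters, bespoke_joinery)
unrelated = cosine_similarity(kitchen_fitters, vegan_bakery)
print(related > unrelated)
```

The related page scores higher even though its text shares no words with the query, because the comparison is made on the numbers rather than the wording.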

See also: Vector database, RAG.

Entity

An entity is a thing the model recognises as a distinct concept: a business, a person, a product, or a place. Entity clarity means the model has consistent data about you across schema, Google Business Profile, directories, and editorial mentions, so it can cite you with confidence.

See also: NAP, Knowledge graph.

F

FAQPage schema

FAQPage is a schema.org type that marks up a list of question and answer pairs on a single page. AI models use FAQPage data to extract answers verbatim, often quoting them inside generated responses. It works best when each question is one a real customer would type.
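
A minimal sketch of FAQPage markup, built from a list of question and answer pairs. The helper name and the sample question are assumptions for the example.

```python
import json


def faq_page(pairs):
    """Build schema.org FAQPage data from (question, answer) tuples."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical question a real customer might type.
faq = faq_page([
    ("Do you fit bathrooms in Edinburgh?", "Yes, we cover Edinburgh and the Lothians."),
])
print(json.dumps(faq, indent=2))
```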

See also: Schema.org, JSON-LD, SEO for AI FAQ.

Fan-out (query fan-out)

Query fan-out is a technique where the model breaks one user question into several sub-questions, runs each as a separate search, then merges the results. It is how Google AI Overviews handles complex queries. To benefit, publish pages that answer narrow sub-questions, not only broad topics.
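
The split, search, and merge steps can be sketched as below. This is a toy illustration only: in a real product the model writes the sub-questions and the search runs against a full index, whereas here the sub-questions are hard-coded keywords and the index is an invented dictionary of example URLs.

```python
# Toy index: keyword -> URLs. Everything here is hypothetical.
PAGES = {
    "cost": ["https://example.com/bathroom-costs", "https://example.com/pricing"],
    "timeline": ["https://example.com/how-long-it-takes"],
    "permits": ["https://example.com/pricing", "https://example.com/building-regs"],
}


def toy_search(keyword):
    """Stand-in for a real search call."""
    return PAGES.get(keyword, [])


def fan_out(sub_questions, search):
    """Run each sub-question as its own search, then merge without duplicates."""
    merged = []
    seen = set()
    for sub_question in sub_questions:
        for url in search(sub_question):
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


results = fan_out(["cost", "timeline", "permits"], toy_search)
print(results)
```

Each narrow page gets its own chance to be retrieved, which is why pages answering one sub-question can be cited for a broader query.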

See also: Google AI Overviews, Direct answer paragraph.

G

GEO (Generative Engine Optimisation)

GEO, generative engine optimisation, is the umbrella term for making content discoverable and quotable inside AI-generated answers. It includes schema markup, entity consistency, FAQ-shaped content, and presence in third-party sources. Some practitioners use GEO and AEO interchangeably; others reserve AEO for answer-only experiences.

See also: AEO, GEO explained.

Gemini

Gemini is Google's family of AI models, used inside Google Search AI Overviews, the standalone Gemini chat product, and Google Workspace. For citation work, Gemini matters because it powers most of Google's generative answers, so the same signals that help in regular Google search apply here.

See also: Google AI Overviews, LLM.

Google AI Overviews

Google AI Overviews are the generated answer blocks at the top of many Google search results. They draw on Google's regular index and cite source URLs. To appear, a page must be indexed by Googlebot, have clear structure, and answer the underlying user intent in a single short passage.

See also: AI Overviews, Google AI Overviews for small business.

Grounding

Grounding is the practice of forcing a model to cite real sources rather than invent answers from training data. It works by giving the model live search results or a document set at query time. AI search products use grounding to reduce hallucinations and to surface verifiable URLs.

See also: Hallucination, RAG.

H

Hallucination

A hallucination is a confident but false statement produced by a language model. In AI search, hallucinations show up as wrong opening hours, made-up phone numbers, or businesses that do not exist. Clean schema, consistent entity data, and grounding all reduce the rate.

See also: Grounding, NAP.

HowTo schema

HowTo is a schema.org type that marks up step-by-step instructions, with optional tools, materials, time, and cost. It maps cleanly onto how AI models present procedural answers, which makes it useful for trade businesses, repair guides, and any page that walks a user through a task.

See also: Schema.org, JSON-LD.

I

IndexNow

IndexNow is an open protocol that lets a site notify search engines the moment a page is added, updated, or removed. Bing, Yandex and several smaller engines support it. Since ChatGPT uses Bing, pinging IndexNow on every publish shortens the gap between an edit and an AI citation.
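
A ping can be prepared with the standard library alone. The sketch below builds the request but does not send it. It assumes the shared endpoint api.indexnow.org and the JSON fields host, key, keyLocation, and urlList from the IndexNow documentation; the host, key, and URL are placeholders, and the key file must really exist at keyLocation on your site for the ping to be accepted.

```python
import json
from urllib.request import Request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Prepare (but do not send) an IndexNow POST for the given URLs."""
    payload = {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder values; replace with your own host and key.
request = build_indexnow_request(
    "example.com",
    "0123456789abcdef0123456789abcdef",
    ["https://example.com/new-post"],
)
print(request.get_method(), request.full_url)
```

To send it, pass the request to urllib.request.urlopen from your publish script.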

See also: Bing Copilot, Sitemap.xml.

Information retrieval

Information retrieval is the field of finding the right document for a given query. AI search bolts a language model on top of a classic retrieval system, so the model can rephrase results into a single answer. Good retrieval still depends on indexing, ranking, and clean source data.

See also: RAG, Re-ranking.

J

JSON-LD

JSON-LD, JavaScript Object Notation for Linked Data, is the format Google and most AI models prefer for structured data. It sits in a script tag, usually in the page head, separate from the visible copy. JSON-LD lets you state facts such as business name, address, and services in a form a machine can read without ambiguity.
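
To show what that script tag looks like, here is a small sketch that wraps any schema.org data in one. The helper name and the sample business are assumptions for the example.

```python
import json


def jsonld_script_tag(data):
    """Wrap a dict of schema.org data in a JSON-LD script tag."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value containing "</script>" cannot close the tag early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical business data.
tag = jsonld_script_tag({
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Joinery",
    "url": "https://example.com",
})
print(tag)
```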

See also: Schema.org, Schema markup for AI search.

K

Knowledge graph

A knowledge graph is a database of entities and the relationships between them. Google's Knowledge Graph powers the panel on the right of branded search results. AI models use similar structures to keep facts straight, so consistent entity data across your site and the wider web keeps the graph accurate.

See also: Entity, NAP.

L

LLM (Large Language Model)

An LLM is a model trained on a large body of text to predict the next word (strictly, the next token) in a sequence. ChatGPT, Claude, Gemini, and Perplexity's response engine are all built on LLMs. They generate answers from patterns in their training data, combined with live search results when grounding is enabled.

See also: Token, Grounding.

LocalBusiness schema

LocalBusiness is a schema.org type for any business with a physical or service location. It carries name, address, phone, hours, and geo coordinates. For small businesses, it is the single highest-leverage schema block, since it gives every AI model the facts it needs to answer "near me" queries.
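
A minimal sketch of a LocalBusiness block covering the fields listed above. The business details are invented, and the helper name is an assumption.

```python
import json


def local_business(name, phone, street, city, postcode, country, lat, lon, opens, closes):
    """Build schema.org LocalBusiness data with address, geo, and weekday hours."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "telephone": phone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "postalCode": postcode,
            "addressCountry": country,
        },
        "geo": {"@type": "GeoCoordinates", "latitude": lat, "longitude": lon},
        "openingHoursSpecification": [
            {
                "@type": "OpeningHoursSpecification",
                "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
                "opens": opens,
                "closes": closes,
            }
        ],
    }


# Hypothetical business; every value is a placeholder.
business = local_business(
    "Example Joinery", "+44 131 496 0000",
    "1 Example Street", "Edinburgh", "EH1 1AA", "GB",
    55.9533, -3.1883, "09:00", "17:00",
)
print(json.dumps(business, indent=2))
```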

See also: NAP, ProfessionalService schema.

llms.txt

llms.txt is a proposed text file at the root of a site that tells AI models which pages to read and how to interpret them. It is not yet a formal standard, and adoption by major models is partial. Publishing one costs nothing and may help with retrieval-style products.
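
For illustration, a file could be generated like this. The layout (a title line, a short quoted summary, then headed lists of links) follows the public llms.txt proposal as commonly described, but since the format is not a formal standard, treat the details as an assumption. The site name, URLs, and notes are placeholders.

```python
def build_llms_txt(site_name, summary, sections):
    """Return llms.txt content: title, summary, then one link list per section.

    `sections` maps a heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content.
llms_txt = build_llms_txt(
    "Example Joinery",
    "Bespoke joinery and kitchen fitting in Edinburgh.",
    {
        "Services": [
            ("Kitchen fitting", "https://example.com/kitchens", "What is included and how to book"),
        ],
    },
)
print(llms_txt)
```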

See also: Sitemap.xml, Crawler.

N

NAP (Name, Address, Phone)

NAP stands for Name, Address, Phone. Consistent NAP across your website, Google Business Profile, directories, and review platforms is the baseline for local AI citations. A mismatch between sources causes models to either pick one at random or skip your business in favour of a clearer competitor.
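
A basic consistency check can be sketched as follows. It is deliberately simple: it ignores case, spacing, and phone punctuation, and treats anything else as a mismatch, so it will not recognise that an international and a national format of the same number are equal. The listings are invented.

```python
import re


def normalise_nap(name, address, phone):
    """Lower-case and collapse spaces in name and address; keep phone digits only."""
    def clean(text):
        return re.sub(r"\s+", " ", text).strip().lower()
    return (clean(name), clean(address), re.sub(r"\D", "", phone))


def nap_mismatches(listings, reference="website"):
    """Return the sources whose NAP differs from the reference listing."""
    expected = normalise_nap(*listings[reference])
    return [
        source
        for source, nap in listings.items()
        if source != reference and normalise_nap(*nap) != expected
    ]


# Hypothetical listings copied from three places.
listings = {
    "website": ("Example Joinery", "1 Example Street, Edinburgh", "0131 496 0000"),
    "google_business_profile": ("Example Joinery ", "1 Example  Street, Edinburgh", "0131-496-0000"),
    "directory": ("Example Joinery Ltd", "1 Example Street, Edinburgh", "0131 496 0000"),
}

mismatches = nap_mismatches(listings)
print(mismatches)
```

Here the directory listing is flagged because of the extra "Ltd", which is the kind of small difference that can split one business into two entities.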

See also: LocalBusiness schema, Entity.

O

Organization schema

Organization is a schema.org type used on the homepage of any company that is not strictly a local business. It carries name, logo, sameAs links to social profiles, contact details, and founder. AI models read it to confirm who runs the site, which feeds entity recognition and trust signals.

See also: Schema.org, LocalBusiness schema.

P

Perplexity

Perplexity is an AI search engine that returns a written answer with inline numbered citations. It uses its own crawler, PerplexityBot, plus partnerships with editorial publishers. For small businesses, Perplexity tends to cite specific URLs rather than brand names, so deep pages with clear answers do well.

See also: Citation, How to get cited by Perplexity.

Prompt

A prompt is the text a user sends to an AI model. For SEO work, the prompts that matter are the ones a real customer would type, such as "who fits bathrooms in Edinburgh" or "best vegan bakery near Bath". Test prompts are the unit of measurement for citation rate audits.

See also: Citation rate.

ProfessionalService schema

ProfessionalService is a sub-type of LocalBusiness for service providers such as accountants, solicitors, consultants, and clinics. Use it instead of generic LocalBusiness when the business sells expertise rather than goods. The properties are the same (name, address, phone, hours), plus services offered.

See also: LocalBusiness schema, Schema.org.

R

RAG (Retrieval Augmented Generation)

RAG, retrieval augmented generation, is the pattern where a model fetches documents from a search index, then uses them to write an answer. AI search products are large RAG systems. For your site to be the retrieved document, it needs to be indexed, structured, and answerable in short passages.
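
The fetch-then-write pattern can be sketched without any AI service. In this toy version, retrieval is plain word overlap rather than a real search index, and the final step only builds the prompt a model would receive; the pages and the wording of the instruction are invented.

```python
import re


def words(text):
    """Split text into a set of lower-case words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query."""
    query_words = words(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & words(doc["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble the grounded prompt: numbered sources first, then the question."""
    sources = "\n".join(
        f"[{number}] {doc['url']}: {doc['text']}"
        for number, doc in enumerate(passages, start=1)
    )
    return f"Answer using only these sources and cite them by number.\n{sources}\n\nQuestion: {query}"


# Hypothetical pages from one site.
documents = [
    {"url": "https://example.com/opening-hours", "text": "We are open Monday to Friday."},
    {"url": "https://example.com/bathroom-costs", "text": "A bathroom refit cost depends on size, fittings and labour."},
    {"url": "https://example.com/kitchens", "text": "Kitchen fitting and bespoke joinery."},
]

question = "How much does a bathroom refit cost?"
passages = retrieve(question, documents)
prompt = build_prompt(question, passages)
print(prompt)
```

Only the retrieved page reaches the model, which is why a short passage that answers the question directly matters so much.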

See also: Information retrieval, Vector database.

Re-ranking

Re-ranking is the second step in a retrieval system, where an initial list of candidate pages is reordered by a more careful model. It decides which sources actually get quoted in an AI answer. Authority signals, freshness, and entity match all influence re-ranking outcomes.
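
As a rough illustration, re-ranking can be pictured as re-scoring each candidate on the three factors named above and sorting again. Real systems use a trained model for this step; the weighted sum, the weights, and the scores below are all assumptions made up for the example.

```python
# Hypothetical weights; real re-rankers learn their own scoring.
WEIGHTS = {"authority": 0.5, "freshness": 0.2, "entity_match": 0.3}


def rerank(candidates):
    """Reorder candidates by a weighted sum of their 0-to-1 signal scores."""
    def score(candidate):
        return sum(weight * candidate[signal] for signal, weight in WEIGHTS.items())
    return sorted(candidates, key=score, reverse=True)


# First-pass order from retrieval, with invented signal scores.
candidates = [
    {"url": "https://example.com/a", "authority": 0.4, "freshness": 0.9, "entity_match": 0.2},
    {"url": "https://example.com/b", "authority": 0.8, "freshness": 0.5, "entity_match": 0.9},
    {"url": "https://example.com/c", "authority": 0.6, "freshness": 0.3, "entity_match": 0.5},
]

reranked_urls = [candidate["url"] for candidate in rerank(candidates)]
print(reranked_urls)
```

The page that came first in retrieval drops to last here, because freshness alone does not outweigh weak authority and a poor entity match.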

See also: Authority signals, Information retrieval.

S

Schema.org

Schema.org is the shared vocabulary for structured data on the web, maintained by Google, Microsoft, Yahoo and Yandex. It defines types such as LocalBusiness, FAQPage, and Article, plus the properties each type carries. Almost all AI search citation work starts with picking the right schema.org types.

See also: JSON-LD, Schema markup for AI search.

Sitemap.xml

Sitemap.xml is a file at the root of a site that lists every page you want crawled, with last modified dates. Submitting it to Google Search Console and Bing Webmaster Tools is the simplest way to get new pages discovered. Without indexing, no AI model can cite the page.
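
A small site can generate the file with the standard library. This sketch assumes you already have a list of page URLs with their last modified dates; the URLs and dates are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for a list of (url, last_modified_date) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")


# Hypothetical pages with dates in YYYY-MM-DD form.
sitemap_xml = build_sitemap([
    ("https://example.com/", "2026-04-20"),
    ("https://example.com/glossary", "2026-04-20"),
])
print(sitemap_xml)
```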

See also: Crawler, IndexNow.

Source diversity

Source diversity is the spread of independent sources that mention your business. AI models prefer to cite an entity that appears in several distinct places, such as your website, a directory, an editorial article, and a review platform, over one that appears only on its own domain. It works against thin or astroturfed signals.

See also: Authority signals, Backlink.

T

Token

A token is the unit a language model reads, usually a short word or part of a word. AI providers charge for API usage by token, measure context windows in tokens, and cap response length the same way. For SEO, tokens matter because shorter, denser pages waste fewer tokens and get summarised more accurately.

See also: LLM, Embedding.

V

Vector database

A vector database stores embeddings and lets you search them by similarity rather than by keyword. RAG systems use vector databases to find passages that match a user query in meaning. You will not run one for your own site, but it is the engine behind most AI search products.

See also: Embedding, RAG.

W

Web crawl tier

Web crawl tier is the band a search engine or AI model assigns to your site, based on freshness, authority, and prior usefulness. Top-tier sites get crawled in minutes, lower tiers in days or weeks. Submitting sitemaps, pinging IndexNow, and earning citations all push a site up.

See also: Crawler, IndexNow.

Want this audited on your site?

Get an AI Visibility Audit, $197, delivered in 5 working days. Or run a free check first.
