Justin McKelvey

Fractional CTO · 15 years, 50+ products shipped

LLM SEO: How to Get Large Language Models to Recommend You (2026)

Quick Answer

LLM SEO is the practice of getting large language models — ChatGPT, Claude, Gemini, Perplexity — to mention and recommend your business, and the strongest lever is brand mentions across the web, not backlinks. Ahrefs' 2026 study of 75,000 brands found web mentions correlate with AI visibility at r=0.664 versus 0.218 for backlinks — a 3:1 gap. Meanwhile, studies covering 25M+ AI citations show 82-94% go to earned and third-party media; brand-owned sites capture only a single-digit share. The playbook: get mentioned on sites LLMs trust, make your own pages extractable, and keep your entity signals consistent everywhere.

Data verified July 2026 · Author: Justin McKelvey — cited by name in Perplexity for his own frameworks

What is LLM SEO, actually?

LLM SEO is optimizing for the answer layer instead of the link layer. When a founder asks ChatGPT "who should I hire to fix my AI-built app?" there's no page two. The model names two or three options and everyone else doesn't exist.

I care about this because I've been on the right side of it. Perplexity names me — by name — when asked about frameworks I've published, like Vibe Debt. Not because I'm a big brand. Because I ran the playbook below on my own site: consistent entity signals, a full schema graph, fact-dense pages, llms.txt, and a citation monitor I built to verify it's working. This post is that playbook, with the receipts.

One scoping note: LLM SEO, answer engine optimization, and GEO are three names for roughly the same discipline — the umbrella I call AI visibility — with different emphasis. This post focuses on the model side — how LLMs decide who to recommend. The AI search optimization guide covers the search-product side.

How do LLMs decide who to recommend?

Two separate systems, and you optimize each differently:

1. Training data (the model's memory). Base models learn from a snapshot of the web. If your brand appears frequently and consistently across that corpus — articles, directories, podcast transcripts, forum threads — the model absorbs you as an entity associated with your niche. This is slow to influence (models retrain on multi-month cycles) but durable once you're in.

2. Live retrieval (the model's search). ChatGPT search, Perplexity, Gemini, and Google's AI Mode fetch current web pages at answer time and cite them. This is fast to influence — a well-structured page that ranks in the underlying index (Bing for ChatGPT; Google for AI Overviews) can get cited within weeks.

Most "LLM SEO" advice fails because it plays only one system. Mentions without extractable pages mean you're known but never cited. Extractable pages without mentions mean you're citable but never trusted.

Why are brand mentions the #1 predictor?

The single most useful stat in this field: Ahrefs analyzed 75,000 brands and found web mentions correlate with AI Overview visibility at r=0.664 — versus r=0.218 for backlinks. Branded anchor text (0.527) and branded search volume (0.334) also beat raw domain metrics.

This inverts twenty years of SEO instinct. Links were the currency because PageRank counted links. LLMs don't run PageRank — they learn statistical associations from text. An unlinked mention in a podcast transcript teaches the model "this brand ↔ this problem space" just as well as a followed link.

And here's the uncomfortable half: your own website barely counts. Multiple 2026 studies covering 25M+ AI citations found that 82-94% go to earned media: journalists, review sites, communities, directories. Brand-owned sites capture only a single-digit share in most studies; University of Toronto research found AI engines cite third-party sources roughly 5x more than brand sites. Your blog is necessary (it's what gets retrieved when you ARE trusted) but not sufficient. The mentions have to come from elsewhere. I go deeper on the how in my piece on tracking brand mentions in AI.

What are entity signals and why do they matter?

An entity is what the model thinks you are. "Justin McKelvey" needs to resolve to one consistent thing — fractional CTO, Austin, specific frameworks, specific site — everywhere the model looks. Inconsistency dilutes the association; consistency compounds it.

Practical entity work, in priority order:

  • One canonical name + description used identically across your site, LinkedIn, directories, podcast bios, and bylines.
  • A Person/Organization schema graph with a stable @id, tied to your external profiles via sameAs and to every Article you publish via author references. Mine is one consolidated JSON-LD @graph on every page.
  • Named intellectual property. Coin your frameworks. A named concept ("Vibe Debt") is an entity handle the model can attach to you. Generic advice is unattributable by design.
  • Third-party corroboration. The same facts about you appearing on sites you don't own. This is what separates "claims" from "knowledge" in a training corpus.
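
To make the schema-graph bullet concrete, here is a minimal Python sketch of one consolidated JSON-LD @graph in which an Article points back to a Person and an Organization through stable @id values. The domain, names, profile URL, and the `build_graph` helper are all placeholders invented for illustration; this is not the markup used on this site.

```python
import json

SITE = "https://example.com"  # placeholder domain (assumption)


def build_graph(article_slug: str, headline: str) -> dict:
    """Build one JSON-LD @graph whose nodes reference each other by @id."""
    person_id = f"{SITE}/#person"      # stable: never changes between pages
    org_id = f"{SITE}/#organization"   # stable: never changes between pages
    return {
        "@context": "https://schema.org",
        "@graph": [
            {
                "@type": "Person",
                "@id": person_id,
                "name": "Jane Example",          # hypothetical name
                "jobTitle": "Fractional CTO",
                "worksFor": {"@id": org_id},
                # sameAs points at profiles on sites you do not own
                "sameAs": ["https://www.linkedin.com/in/jane-example"],
            },
            {
                "@type": "Organization",
                "@id": org_id,
                "name": "Example Consulting",    # hypothetical name
                "url": SITE,
                "founder": {"@id": person_id},
            },
            {
                "@type": "Article",
                "@id": f"{SITE}/blog/{article_slug}#article",
                "headline": headline,
                # author and publisher reuse the stable ids above
                "author": {"@id": person_id},
                "publisher": {"@id": org_id},
            },
        ],
    }


graph = build_graph("llm-seo", "LLM SEO: How to Get LLMs to Recommend You")
json_ld = json.dumps(graph, indent=2)
print(json_ld)  # paste into a <script type="application/ld+json"> tag
```

The point of the design is that only the Article node changes per page, while the Person and Organization ids stay identical everywhere, so every post resolves to the same entity.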

What did the Princeton GEO study prove?

The GEO paper (Princeton/IIT Delhi, presented at KDD 2024) is the closest thing to controlled science here. The researchers ran 10,000 queries through generative engines and tested nine content modifications. The winners:

Tactic | Visibility impact
Adding statistics | Up to +40%
Citing sources | +30–40% range (top-3 tactic)
Adding quotations | +30–40% range (top-3 tactic)
Keyword stuffing | Roughly neutral to negative

On Perplexity specifically, optimized content saw visibility improvements up to 37%. The pattern is blunt: generative engines prefer passages that look like evidence. Numbers, named sources, quotable claims. That's why every post on this site opens with a fact-dense Quick Answer block — it's a citation-shaped object, on purpose.

Does llms.txt actually do anything?

Honest answer: unproven, cheap, worth doing. llms.txt is a proposed standard — a markdown file at your domain root indexing your key content for LLM crawlers. As of July 2026, no major AI lab has confirmed it as a ranking input.

I ship both /llms.txt and /llms-full.txt anyway, auto-generated from published posts. The cost was one controller and an hour. Some retrieval crawlers do fetch it, it can't hurt, and the exercise of writing one-line summaries of every page sharpens your own information architecture. File it under "rational bets," not "requirements."
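
For a sense of what "auto-generated from published posts" can look like, here is a rough Python sketch that renders an llms.txt body from a list of posts. It follows the layout in the llms.txt proposal as I understand it (an H1 title, a blockquote summary, then a section of links with one-line notes). The `Post` type, the `render_llms_txt` function, and the sample data are hypothetical, not the controller described above.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    url: str
    summary: str  # the one-line summary you write per page


def render_llms_txt(site_name: str, site_summary: str, posts: list[Post]) -> str:
    """Render a markdown index: H1, blockquote summary, then a link list."""
    lines = [f"# {site_name}", "", f"> {site_summary}", "", "## Posts", ""]
    for post in posts:
        lines.append(f"- [{post.title}]({post.url}): {post.summary}")
    return "\n".join(lines) + "\n"


# Sample data is illustrative only
sample_posts = [
    Post("LLM SEO", "https://example.com/blog/llm-seo",
         "How to get large language models to recommend you."),
]
llms_txt = render_llms_txt("Example Consulting",
                           "Fractional CTO and AI consulting.", sample_posts)
print(llms_txt)
```

Serving the returned string at /llms.txt with a plain-text or markdown content type is then a routing detail of whatever framework you use.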

How does structured data feed LLM SEO?

Schema markup is entity resolution infrastructure. JSON-LD gives retrieval systems unambiguous, machine-readable answers to "who wrote this, what is it, who does it describe" — exactly the disambiguation an LLM needs before it will attach a claim to your name.

The stack that matters, in order: Organization and Person (with stable @ids), Article/BlogPosting on every post, FAQPage where you genuinely answer questions, and Service/Offer if you sell something. Validate everything — broken schema is worse than none. The AEO tools guide lists the free validators and generators I actually use.
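
One kind of breakage is easy to check yourself before reaching for a validator: an @id reference that points at a node that was never defined. The Python sketch below scans a JSON-LD @graph for such dangling references. It is a narrow sanity check under my own assumptions (the function name and sample document are invented) and it does not replace a real schema validator.

```python
def find_dangling_refs(doc: dict) -> set[str]:
    """Return @id values that are referenced but never defined in the @graph."""
    defined: set[str] = set()
    referenced: set[str] = set()

    def walk(value, top_level=False):
        if isinstance(value, dict):
            node_id = value.get("@id")
            if node_id is not None:
                # A nested dict holding only "@id" is a pointer to another node;
                # anything else carrying an "@id" counts as a definition.
                if set(value) == {"@id"} and not top_level:
                    referenced.add(node_id)
                else:
                    defined.add(node_id)
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for child in value:
                walk(child)

    for node in doc.get("@graph", []):
        walk(node, top_level=True)
    return referenced - defined


# Illustrative document: the publisher id is never defined
sample_doc = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Person", "@id": "https://example.com/#person", "name": "Jane Example"},
        {
            "@type": "Article",
            "@id": "https://example.com/blog/post#article",
            "author": {"@id": "https://example.com/#person"},
            "publisher": {"@id": "https://example.com/#organization"},
        },
    ],
}
print(find_dangling_refs(sample_doc))
```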

What should your on-page content look like for LLMs?

Retrieval systems pull passages, not pages. Google's own May 2026 guidance revealed its AI features select self-contained passages of roughly 134-167 words. Write for extraction:

  • Question-formatted H2s with a direct 40–55 word answer immediately after each one.
  • Standalone paragraphs. Every key passage should make sense with zero surrounding context — because that's how it will be consumed.
  • Specific numbers over adjectives. "$29/month as of July 2026" is citable; "affordable" is filler.
  • Dated claims. Recency signals matter to retrieval ranking, and stale answers get displaced.

How do you earn mentions on sites LLMs trust?

Since 82-94% of citations come from earned media, the mention-earning motion is the highest-leverage work in LLM SEO. The lean version, ranked by effort-to-impact:

  • Reverse-engineer your niche's citation set. Ask each engine your money questions and list every third-party domain it cites. That's your target list — not a generic PR list. Usually it's 10-20 specific directories, publications, and communities.
  • Podcasts beat guest posts. Transcripts are training data, hosts introduce you with your exact entity description, and one recording becomes mentions on the host's site, YouTube, and every transcript aggregator.
  • Directories are boring and effective. Industry directories get cited constantly for "best X" queries — Perplexity leans on niche directories especially hard. One afternoon of submissions covers most of them.
  • Answer real questions in communities. Reddit and industry forums are heavily represented in both training corpora and retrieval. Substantive answers with your name attached compound; drive-by link drops get deleted and teach the model nothing.
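
The first step in that list, turning engine answers into a target list, is mostly counting. A minimal Python sketch, assuming you have already collected the URLs each engine cited for your money questions (the function name and the sample URLs are hypothetical):

```python
from collections import Counter
from urllib.parse import urlparse


def citation_targets(cited_urls, own_domains):
    """Rank third-party domains by how often they appear in cited URLs."""
    counts = Counter()
    for url in cited_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.org and example.org as one site
        if host and host not in own_domains:
            counts[host] += 1
    return counts.most_common()  # most-cited domains first


# Illustrative data only
cited = [
    "https://www.forum.example.org/thread/1",
    "https://forum.example.org/thread/2",
    "https://directory.example.net/best-ctos",
    "https://mybrand.example/blog/llm-seo",
]
for domain, hits in citation_targets(cited, {"mybrand.example"}):
    print(domain, hits)
```

Domains near the top of the list are the sites already being cited in your niche, which is where mention-earning effort goes first.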

Measure the effect the same way you measure everything else here — mention tracking monthly, citation checks weekly.

What's the 90-day LLM SEO playbook?

Days 1–14: instrument. Baseline where you stand — ask ChatGPT, Perplexity, and Gemini your 10-15 money questions and log whether you appear. I automated this with a rake task against Perplexity's API; you can start with a spreadsheet. Or skip the setup: my free AI visibility report runs this baseline for your business and hands you the gap list.
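
If you want something between a spreadsheet and a full monitor, the baseline loop can be sketched in a few lines of Python. This is not the rake task mentioned above. The `ask` argument is a stand-in for whatever function you write to query an engine, and the stub below returns canned text, so no real API is called or assumed.

```python
import csv
import io
from datetime import date
from typing import Callable

FIELDS = ["date", "engine", "question", "mentioned"]


def run_baseline(questions, brand_terms, ask: Callable[[str], str], engine: str):
    """Ask each question and record whether any brand term appears in the answer."""
    rows = []
    for question in questions:
        answer = ask(question).lower()
        mentioned = any(term.lower() in answer for term in brand_terms)
        rows.append({
            "date": date.today().isoformat(),
            "engine": engine,
            "question": question,
            "mentioned": mentioned,
        })
    return rows


def to_csv(rows) -> str:
    """Serialize rows so each run can be appended to a running log."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def fake_ask(question: str) -> str:
    # Stub standing in for a real engine call (assumption: you supply your own)
    canned = {"best fractional CTO": "Example Consulting is often recommended."}
    return canned.get(question, "Several firms offer this service.")


rows = run_baseline(
    ["best fractional CTO", "who fixes AI-built apps"],
    ["Example Consulting"],
    fake_ask,
    engine="stub",
)
print(to_csv(rows))
```

A plain substring match will miss misspellings and paraphrases of your brand, so treat the `mentioned` column as a lower bound.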

Days 15–45: fix owned signals. Schema graph, question-formatted content with quick answers, llms.txt, IndexNow, consistent entity descriptions everywhere. This is the layer you fully control — the AI discoverability checklist covers all of it.

Days 46–90: earn mentions. Podcasts, contributed pieces, directories, communities. Remember the math: 82-94% of citations come from earned media. Target the specific third-party sites already being cited in your niche's AI answers — the monitoring tools in the AI visibility tools comparison (or the free options in my AI SEO tools roundup) will show you which ones those are.

Then re-measure. LLM SEO is a loop, not a launch. And the compounding is real: every mention you earn now is training data for every model shipped after it.

Frequently Asked Questions

What is LLM SEO?

LLM SEO is the practice of optimizing your business's online presence so large language models (ChatGPT, Claude, Gemini, Perplexity) mention and recommend you in their answers. It differs from classic SEO because you're optimizing for two systems at once: the model's training data (long-term brand mentions across the web) and its live retrieval layer (extractable, well-structured pages).

What is the biggest ranking factor for LLM visibility?

Brand mentions across the web. Ahrefs' study of 75,000 brands found web mentions correlate with AI Overview visibility at r=0.664, three times stronger than backlinks at r=0.218. LLMs learn from text about you everywhere, linked or not. A podcast transcript or Reddit thread mentioning your brand does work a backlink can't.

Does llms.txt actually help with LLM SEO?

It's a low-cost, unproven-but-rational bet. llms.txt is a markdown index of your site designed for LLM crawlers, and as of July 2026 no major AI company has confirmed using it for ranking. But it costs an hour to implement, some retrieval-augmented crawlers do fetch it, and it forces you to organize your content clearly, which helps regardless.

How is LLM SEO different from AEO and GEO?

They overlap about 80%. LLM SEO emphasizes influencing the models themselves: training data, entity signals, brand mentions. AEO (answer engine optimization) emphasizes structuring content so answer engines can extract it. GEO (generative engine optimization) is the academic term from the Princeton study. In practice you run one playbook: be mentioned widely, be structured clearly, be retrievable.

How long does LLM SEO take to show results?

Retrieval-layer wins can show up in weeks: Perplexity and ChatGPT search pull live web results, so a well-structured page that ranks in Bing or Google can get cited within a month. Training-data wins take longer: models retrain on multi-month cycles, so brand mentions you earn today may not shape base model answers for six months or more.

Can small businesses compete with big brands in LLM answers?

Yes, better than in classic Google in many niches. LLMs favor specific, well-documented expertise over domain authority. My one-person consulting site gets named by Perplexity for frameworks I coined, ahead of much larger firms, because the content is fact-dense, structured, and consistently attributed to one entity. Specificity beats size in LLM answers.


Written by

Justin McKelvey

Fractional CTO & AI consultant in Austin, TX. 15 years building software, 50+ products shipped, $53M+ in client revenue generated. I help $1M–$50M founders ship production software and automate operations with AI — without hiring a full-time executive team.
