AI agents do not execute JavaScript, so SPA-based Adobe Experience Manager Sites can be invisible in ChatGPT, Perplexity, and Google AI Overviews. This is a practitioner playbook for measuring the citation gap with the AI Content Visibility Checker Chrome extension, and operationalizing the lift at scale with Adobe LLM Optimizer.
Introduction
Ask ChatGPT, Perplexity, or Google's AI Overview about a topic your brand should own, and you may notice something uncomfortable: competitors show up in the answer, and you do not. For many Adobe Experience Manager Sites implementations, particularly Single Page Applications, this is not a content problem. It is an architectural one. AI crawlers do not browse the web the way humans do, and the gap between what they read and what your site publishes is widening fast.
Google's AI Overviews now appear on roughly 25% of queries (Conductor, mid-2025). Similarweb measured 1.1 billion AI-platform referral visits in June 2025 alone — up 357% year over year — while traditional Google referral traffic declined about 6.7% over the same window. Zero-click search reached 69% by May 2025. The teams that move first will own their category in AI answers; the teams that wait will spend the next two years trying to catch up.
Why your pages are invisible — the symptom
You publish a thoughtful page on a topic that your brand should own. It ranks well on traditional Google. The internal SEO dashboard looks healthy. Then someone asks ChatGPT or Perplexity the same question and the answer cites three competitors and a discussion forum. Your page is nowhere.
That mismatch is the symptom. The cause is not that your content is bad. The cause is that the model never read it.
The reason — AI crawlers do not execute JavaScript
In late 2024, Vercel published an analysis of more than half a billion AI crawler requests across its edge network. The headline finding was unambiguous: GPTBot, ClaudeBot, and PerplexityBot do not execute JavaScript at all. ChatGPT's crawler fetches JS files in roughly 11.5% of requests and Claude's in 23.8%, but neither runs them. Of the major AI crawlers, only Googlebot renders JavaScript via headless Chrome — and even that path is slower and less reliable than the initial HTML response.
For an AEM Sites property running an SPA front end, this means that the model often sees a shell: a navigation skeleton, a few meta tags, and placeholder containers where the real content should be. The page that ranks beautifully on traditional Google can be effectively blank to an LLM. Any JSON-LD schema injected by client-side JavaScript is invisible for the same reason. The model has no idea what your page is about because the bytes that describe it never reach it.
Verify it on one of your own pages:
- Run: curl -A "GPTBot/1.0" https://yoursite.com/your-page and inspect the raw HTML response.
- Compare it to the rendered DOM in your browser developer tools.
- Anything in the DOM but missing from the curl response is invisible to GPTBot, ClaudeBot, and PerplexityBot.
If the curl output is largely empty, you have an AI visibility gap and the rest of this article is for you.
Step 1: Measure first with the AI Content Visibility Checker
Before you change anything, get a baseline. Install the AI Content Visibility Checker Chrome extension, powered by Adobe LLM Optimizer. It is published by Adobe Inc. on the Chrome Web Store and installs in under a minute.
What it does: it loads the page the way an AI agent would — without your JavaScript, without your hydration, without your client-side schema — and reports a Citation Readability Score. The score is the percentage of the page actually accessible to LLM crawlers. The extension also shows you a side-by-side of the AI agent's view versus a human user's view, so you can see exactly which blocks of content the model is missing.
Important framing: the AI Content Visibility Checker measures readiness; it is the diagnostic. The fixes in Steps 2 through 6 are the editorial and engineering work that has to happen inside Adobe Experience Manager. Once those foundations are healthy, Adobe LLM Optimizer (Step 7) is where you operationalize and scale the lift across an enterprise estate.
Actionable steps:
- Install the AI Content Visibility Checker, powered by Adobe LLM Optimizer, from the Chrome Web Store.
- Run it on your top 10 highest-value pages — product, solution, thought-leadership, pricing, FAQ.
- Record the Citation Readability Score for each page. This is your baseline.
- Capture the AI-agent view side-by-side with the human view. Share it with your content team — it is the most persuasive artifact you will ever produce for a GEO investment conversation.
Step 2: Add a noscript AI summary block in your AEM page template
The single fix that delivered the largest measurable Citation Readability Score lift on a live AEM Sites SPA in our experience was the simplest: a server-rendered <noscript> block on every page template containing an AI-readable summary of the page.
The pattern is straightforward. Inside <noscript>, render the H1, a 2 to 3 sentence abstract, a JSON-LD block for the relevant schema type, and the most important internal links. Browsers ignore it because JavaScript is enabled. Non-rendering crawlers — GPTBot, ClaudeBot, PerplexityBot — read it as the page's primary content and use those tokens to predict what the page is about.
Why it works at the model level: LLM retrieval is dominated by the first few hundred tokens the crawler sees under each heading. When a page is a JavaScript shell, the first tokens are usually navigation chrome and a "loading" placeholder. Replacing that with a curated summary, schema, and internal links means the model's very first impression of the page is an authoritative, on-topic, entity-rich block — exactly the shape it is trained to cite.
Critical implementation note for AEM teams: author the noscript summary as an Adobe Experience Manager Content Fragment, not as a hardcoded HTL template snippet. Treat it as a first-class authoring surface, the same way you treat the meta description. The Content Fragment model gives you typed fields, version history, and translation workflows out of the box. If editors can curate it through the AEM authoring UI, it stays accurate. If it lives in code, it goes stale within a quarter.
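As a concrete illustration, here is a minimal sketch of what the noscript block might look like in the rendered page output. The headline, abstract copy, link URLs, and schema values are illustrative placeholders, not an Adobe-defined model; in production they would be rendered by HTL from the Content Fragment fields described above:

```html
<noscript>
  <!-- AI-readable summary, server-rendered from a Content Fragment.
       All copy, links, and schema values below are illustrative. -->
  <h1>Headless Content Delivery with AEM Sites</h1>
  <p>This page explains how to deliver AEM Sites content to both human
     visitors and non-rendering AI crawlers. It covers the noscript
     summary pattern, CDN user-agent routing, and server-side schema.</p>
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Headless Content Delivery with AEM Sites",
    "description": "How to make SPA-based AEM Sites pages readable by non-rendering AI crawlers."
  }
  </script>
  <ul>
    <li><a href="/solutions/aem-sites">AEM Sites overview</a></li>
    <li><a href="/resources/geo-checklist">GEO checklist</a></li>
    <li><a href="/pricing">Pricing</a></li>
  </ul>
</noscript>
```

Because every element here is emitted server-side, the block is present in the raw HTML response that GPTBot, ClaudeBot, and PerplexityBot actually fetch.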
Actionable steps:
- Add a <noscript> block to your base AEM Sites page template (HTL).
- Inside it, render: H1, 2–3 sentence abstract, JSON-LD for the page type, and 3–5 high-value internal links.
- Model the abstract as a Content Fragment field so authors can curate it per page.
- Re-run the AI Content Visibility Checker and re-record the Citation Readability Score. Expect a meaningful jump on this single change alone.
Step 3: Serve user-agent–specific pages through the CDN
For higher-value templates, take it a step further and use CDN edge logic to branch on the request user agent. Known AI crawlers are routed to a fully pre-rendered, semantically rich variant of the page; humans continue to receive the SPA. On Adobe Experience Manager as a Cloud Service Managed CDN this is implemented as Fastly VCL through the Cloud Manager Edge configuration; on AWS CloudFront it is Lambda@Edge or CloudFront Functions; on Akamai it is EdgeWorkers; on Cloudflare it is a Worker. The pattern is the same.
Why it works: the noscript pattern bolts a summary onto the existing page. User-agent routing replaces the entire response for AI crawlers with a server-rendered HTML document — full body copy, full schema, full link graph. The model gets a complete page instead of a summary, which lets it answer multi-part questions about your content rather than just classifying the topic.
A note on cloaking, because every reviewer will ask. Google's guidelines distinguish "different content for different users" (cloaking, against guidelines) from "different rendering of the same content" (acceptable). Keep both variants semantically equivalent. The HTML variant should say the same things as the SPA — it just says them in a form a non-rendering crawler can read. Document this internally and you stay on the right side of the line.
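To make the branching pattern concrete, here is a minimal CloudFront Functions-style sketch. The allowlist contents and the "/ai" path convention for the pre-rendered variant are assumptions for this example, not Adobe or AWS defaults; Fastly VCL, Akamai EdgeWorkers, and Cloudflare Workers express the same logic in their own idioms:

```javascript
// Vetted allowlist of AI crawler user-agent tokens (extend as needed).
const AI_CRAWLERS = [
  "GPTBot", "ClaudeBot", "PerplexityBot",
  "Google-Extended", "Meta-ExternalAgent", "Bytespider",
];

// Match against specific tokens -- never a bare substring check on
// "bot", which over-matches ordinary browsers and tools.
function isAiCrawler(userAgent) {
  return AI_CRAWLERS.some((token) => userAgent.includes(token));
}

// Viewer-request handler: rewrite the URI so allowlisted crawlers are
// served the pre-rendered HTML variant published under /ai at the origin.
function handler(event) {
  const request = event.request;
  const uaHeader = request.headers["user-agent"];
  const ua = uaHeader ? uaHeader.value : "";
  if (isAiCrawler(ua)) {
    request.uri = "/ai" + request.uri; // assumed path for the variant
  }
  return request;
}

console.log(handler({
  request: { uri: "/pricing", headers: { "user-agent": { value: "GPTBot/1.0" } } },
}).uri); // prints "/ai/pricing"
```

Humans fall through untouched and continue to receive the SPA; only requests whose user agent matches a vetted token are rerouted, which keeps the audit surface small.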
Actionable steps:
- Maintain a vetted user-agent allowlist (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Meta-ExternalAgent, Bytespider, etc.). Do not branch on the substring "bot" alone.
- At the CDN edge, route allowlisted user agents to a pre-rendered HTML variant of the page.
- Log every branched response so you can monitor for spoofed user agents and audit cloaking risk.
- Keep both variants semantically equivalent. Run a periodic diff so the SPA does not drift from the HTML variant.
Step 4: FAQ blocks targeted at real prompts, with Schema.org FAQPage JSON-LD
FAQs are the highest-leverage content shape for LLM citation, and most teams build them wrong. The mistake is using SEO keyword tools to source the questions. LLM retrieval is driven by cosine similarity between the user's prompt embedding and on-page text. The lever is matching prompt phrasing — full, conversational, multi-clause sentences — not keyword phrasing.
Source your FAQs from real prompts:
- Mine your support tickets and sales call transcripts for verbatim customer questions.
- Search Perplexity and ChatGPT shared chats for prompts in your category.
- Best of all: ask your sales engineers what the top three questions on every prospect call are. Those are your highest-intent prompts.
Then write each answer in the inverted-pyramid shape that LLMs reward. The Princeton "Generative Engine Optimization" paper (Aggarwal et al., arXiv 2311.09735) tested nine content modifications across thousands of queries on a generative engine and reported visibility lifts of up to 40% for three specific changes: citing sources, adding direct quotations, and adding statistics. Apply all three to every FAQ answer.
And, critically, emit the Schema.org FAQPage JSON-LD server-side. Google deprecated the FAQ rich result in 2023, but LLMs still parse the markup. It tells the model "this is a question and this is its answer," which is the exact retrieval shape it is trying to match against the user's prompt. In Adobe Experience Manager, this is best implemented as a Core Components-style FAQ component backed by a Content Fragment model — typed Question and Answer fields that the HTL template renders into both visible markup and the JSON-LD block in a single pass.
Below is a single FAQ pair as it should appear in your AEM page output. Note the inverted pyramid in the answer (direct answer first, then context, then statistic, then citation), and the server-rendered JSON-LD that mirrors the visible Q&A.
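A sketch of one such pair follows. The question and answer copy are illustrative; the statistic restates the Vercel finding cited earlier in this article:

```html
<!-- Visible Q&A, rendered server-side by the FAQ component's HTL. -->
<section class="cmp-faq__item">
  <h3>Do AI crawlers like GPTBot execute JavaScript?</h3>
  <p>No. GPTBot, ClaudeBot, and PerplexityBot read only the initial HTML
     response and never run client-side scripts. Vercel's analysis of more
     than half a billion AI crawler requests found zero JavaScript
     execution by these crawlers, so content rendered client-side is
     invisible to them (source: Vercel, late 2024).</p>
</section>
<!-- JSON-LD mirror of the same pair, emitted in the same server pass. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "Do AI crawlers like GPTBot execute JavaScript?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "No. GPTBot, ClaudeBot, and PerplexityBot read only the initial HTML response and never run client-side scripts. Vercel's analysis of more than half a billion AI crawler requests found zero JavaScript execution by these crawlers."
    }
  }]
}
</script>
```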
Three things this example does that most FAQ implementations miss: the answer leads with the direct answer (not background), it includes a verifiable statistic with a named source, and the JSON-LD is server-side rendered so non-JavaScript crawlers actually see it.
Actionable steps:
- Build a reusable AEM FAQ component that emits FAQPage JSON-LD server-side from a Content Fragment model.
- Source 5–10 FAQs per high-value page from real customer prompts.
- Write every answer in the order: direct answer → context → statistic → cited source.
- Validate the JSON-LD with the Schema.org Validator before shipping.
Step 5: Apply the Princeton GEO triad to every page body
The Princeton GEO paper is the only peer-reviewed study of what actually moves the needle on generative engine visibility. Out of nine content modifications tested, three drove the headline 40% lift and the rest were noise: cite sources, add quotations, add statistics. Keyword stuffing and fluency optimization had no measurable effect.
Why these three: LLMs preferentially cite passages that look verifiable. Numerals, named experts in quotes, and inline source citations are the textual fingerprints of trustworthy content in the training data. When the model is choosing between five candidate passages to cite, the one with a stat and a quote wins.
Apply the triad as an editorial standard, not a checklist. In an Adobe Experience Manager workflow, this is a content governance change: update your editorial guidelines, add the triad to your component-level help text in the authoring UI, and bake it into the review checklist for every Content Fragment.
Actionable steps:
- Every section claim should cite a primary source by name and link.
- Every key argument should include a direct quotation from a named expert.
- Every assertion should be backed by a numeral (a percentage, a count, a measurement, a date).
Step 6: Entity grounding via Schema.org sameAs
LLMs are trained on Wikipedia and Wikidata. The fastest way to tell a model "this brand is the same brand you already know about" is to emit Schema.org Organization and Person types with a sameAs array linking your canonical entities to their corresponding profiles on the public knowledge graph. This is entity grounding, and it is one of the most underused signals in enterprise GEO.
Why it works: when the model encounters your brand on a page, the sameAs links let it resolve the mention to a known node in its parametric memory instead of treating it as a new, ungrounded string. Resolved entities get cited; unresolved strings get dropped.
In Adobe Experience Manager, the natural place for this is the site-wide page template (for Organization) and the author profile component (for Person). Both can be backed by a single Content Fragment so the entity data is authored once and reused across every page.
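A sketch of the Organization block as the site-wide template might emit it. The company name, description, and profile URLs are placeholders to be replaced with your canonical entities:

```html
<!-- Server-rendered on every page from the site-wide template, sourced
     from a single Organization Content Fragment. All values below are
     illustrative placeholders. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Corp",
  "url": "https://www.example.com",
  "description": "Example Corp builds digital experience software.",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Example_Corp",
    "https://www.wikidata.org/wiki/Q0000000",
    "https://www.linkedin.com/company/example-corp"
  ]
}
</script>
```

The Person block on author bylines follows the same shape with "@type": "Person" and the author's own sameAs profiles.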
Actionable steps:
- Author Organization and Person Content Fragments containing canonical name, description, and a sameAs URL list.
- Render Schema.org Organization JSON-LD on every page from the site-wide template.
- Render Schema.org Person JSON-LD on every author byline from the author Content Fragment.
Step 7: Re-measure and prove the lift
Close the loop. After each fix above, re-run the AI Content Visibility Checker on the same 10 pages you baselined in Step 1 and record the new Citation Readability Score. The goal is not a perfect score on every page — it is a measurable, repeatable improvement after each change, so you can prove to your stakeholders that GEO investment moves the number that matters.
Build a simple before/after table — page URL, baseline score, score after each fix — and share it with your steering committee. It is the artifact that converts skeptics and unlocks the budget for Step 8.
Step 8: Operationalise and scale with Adobe LLM Optimizer
Once your Citation Readability Scores are healthy across your top templates, you have proven the patterns. The next step is operationalizing them at scale across an enterprise Adobe Experience Manager estate. That is what Adobe LLM Optimizer (announced generally available October 14, 2025) is built for.
Adobe LLM Optimizer is the enterprise control plane for Generative Engine Optimization. It complements the manual playbook in this article in three ways. First, it audits an entire AEM estate against the same readiness signals the Chrome extension checks on a single page — at scale, automatically, on a continuous schedule. Second, its Optimize at Edge capability serves AI-friendly modifications to LLM user agents at the CDN layer, with no authoring changes required in the origin CMS, across Adobe Experience Manager as a Cloud Service Managed CDN, AWS CloudFront, and other major CDNs. Third, it monitors AI agent traffic continuously, so you can read exactly which AI surfaces are pulling content from your CDN edge, which pages are being cited, and how brand visibility is trending against competitors across ChatGPT, Perplexity, Google AI Overviews, and others.
In other words: the Chrome extension tells you whether one page is ready. Adobe LLM Optimizer tells you whether your business is winning the AI answer market — and gives you the levers to shift that number. Adobe Experience Manager Sites customers have a native integration path; the platform is also available standalone.
Actionable steps:
- Validate the patterns from Steps 2–6 on a representative sample of pages and capture the Citation Readability Score lift.
- Engage your Adobe team to scope an Adobe LLM Optimizer rollout against your AEM estate.
- Start with continuous monitoring on your top revenue templates; expand Optimize at Edge once the operating model is in place.
- Use the platform's competitive benchmarking to set quarterly targets for AI citation share by category.
A note for Adobe Experience Manager Edge Delivery Services teams
If you are on Adobe Experience Manager Edge Delivery Services (aem.live), you have a head start. Adobe's official documentation is explicit: "Edge Delivery websites are search engines optimized (SEO) and generative engine optimized (GEO) for LLMs." Edge Delivery Services renders plain HTML at the edge with no JavaScript hydration required for content, which means most of the rendering gap that traditional AEM SPAs suffer from simply does not exist. Steps 2 and 3 (noscript and CDN user-agent routing) become unnecessary on EDS. Steps 4 through 6 — FAQ schema, the Princeton triad, entity grounding — are still the work, and Adobe LLM Optimizer integrates with EDS the same way it integrates with AEM Cloud Service Managed CDN.
Final thoughts
The shift from traditional search to generative answer engines is not a future scenario. It is the current state for a quarter of Google queries and a rapidly growing share of high-intent research. If your Adobe Experience Manager Sites property is an SPA, the safe assumption is that LLMs see a blank page until you prove otherwise. The good news is that the fix does not require a re-platform. Measure with the AI Content Visibility Checker. Ship the noscript pattern in a sprint. Route AI crawlers at the CDN for high-value templates. Build FAQ blocks from real prompts with server-rendered FAQPage schema. Apply the Princeton triad to every body section. Ground your entities with sameAs. Re-measure after each change so you can prove the lift. Then bring in Adobe LLM Optimizer to operationalize it across your estate.
Early movers in GEO compound the same way early movers in SEO did fifteen years ago — visibility today becomes citation authority tomorrow. The Adobe Experience Manager teams that start measuring this quarter will be the ones that own their category in AI answers next year.
Key takeaways
- AI crawlers do not execute JavaScript — Vercel found zero JavaScript execution across more than half a billion AI crawler requests. SPA-based Adobe Experience Manager Sites properties are usually invisible to GPTBot, ClaudeBot, and PerplexityBot.
- Measure first with the AI Content Visibility Checker, powered by Adobe LLM Optimizer. The Citation Readability Score is your baseline and your lift metric.
- The fastest single fix is a server-rendered <noscript> block containing H1, abstract, JSON-LD, and key links — authored as an AEM Content Fragment, not hardcoded.
- CDN user-agent routing closes the gap for high-value templates without re-platforming the SPA. Keep variants semantically equivalent to stay on the right side of cloaking guidelines.
- FAQ blocks targeted at real customer prompts, written in the inverted pyramid, with server-rendered Schema.org FAQPage JSON-LD, are the highest-leverage citation surface.
- The Princeton GEO paper measured up to +40% visibility from three modifications: cite sources, add quotations, add statistics. Apply them as editorial standards in your AEM authoring workflow.
- Ground entities with Schema.org Organization and Person types using sameAs. Resolved entities get cited; unresolved strings get dropped.
- Re-measure after every fix. The before/after Citation Readability Score is the artifact that converts skeptics.
- Once your foundations are healthy, scale with Adobe LLM Optimizer. The Chrome extension tells you whether one page is ready; Adobe LLM Optimizer tells you whether your business is winning the AI answer market.
- Adobe Experience Manager Edge Delivery Services teams skip Steps 2 and 3 — EDS is GEO-optimized for LLMs out of the box per Adobe's official documentation.
Additional resources
- AI Content Visibility Checker, powered by Adobe LLM Optimizer (Chrome Web Store)
- Introducing the LLM Optimizer Chrome extension (Adobe blog)
- Adobe newsroom — Adobe LLM Optimizer general availability announcement (October 14, 2025)