First Controlled Study of Its Kind
AI Agents Can’t See 87.6% of Your Website’s Commercial Content
We embedded 15 types of ads, affiliate links, and product data in live web pages. Then we asked 12 AI agents to read them. Almost nothing got through.
When someone asks ChatGPT, Claude, Gemini, or Copilot to read a webpage, the AI decides what the user sees — not your layout, not your ad tags, and not your structured data. Most of what you put on the page doesn’t survive.
This page is a plain-language summary of When Bots Browse: Advertising in an AI-Agent Internet — the first empirical study measuring commercial signal visibility in AI-mediated web reading. Full preprint, data, and methodology are open-access under CC BY 4.0.
What We Actually Tested
We weren’t guessing. We built a controlled experiment with real pages, real AI agents, and traceable links.
🔬 The Setup
- 3 live WordPress articles on countryofclubs.com
- 15 embedding formats: Microdata, RDFa, JSON-LD, OpenGraph, Dublin Core, Twitter Cards, HTML comments, data-* attributes, and more
- 12 AI agents: ChatGPT (Free & Plus), Claude (Free, Paid, Chrome), Copilot, Perplexity, Gemini, Google AI Mode, Cursor
- Tracked shortlinks through aitests.short.gy with a monitored landing page
📏 What We Measured
For every format × agent × page combination:
- Surfacing — Did the agent mention the product?
- Link output — Did it show the tracked URL?
- Follow-through — Did it actually click the link?
170 combinations in the core matrix, plus Google-stack tests. Three prompt conditions: unprimed reading, commercial framing, and follow-through verification.
The Results
Commercial Darkness Rate — Round 1
87.6% of format × agent combinations produced zero commercial output
Out of 170 tested combinations, only 21 produced any surfacing event. 149 were completely dark. The AI read the page. The commercial signal simply didn’t exist in its output.
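The headline rate follows directly from those two counts. A minimal check of the arithmetic, using only the figures quoted above (the function name is ours, for illustration):

```python
def darkness_rate(total_combinations, surfaced_combinations):
    """Share of format x agent combinations that produced no commercial output."""
    dark = total_combinations - surfaced_combinations
    return dark, dark / total_combinations

# Counts reported for Round 1: 170 combinations tested, 21 with a surfacing event.
dark, rate = darkness_rate(170, 21)
print(f"{dark} of 170 combinations were dark: {rate:.1%}")
# prints "149 of 170 combinations were dark: 87.6%"
```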
These formats were invisible to every single agent:
JSON-LD ❌
OpenGraph ❌
Twitter/X Cards ❌
Dublin Core ❌
Custom Meta Tags ❌
data-* Attributes ❌
<script> Blocks ❌
<link rel> ❌
HTML Comments ❌
Meta Desc + URL ❌
JSON-LD is Google’s own recommended structured data format. It’s what most SEO tools tell you to implement. Not one agent read it.
Only 3 formats survived — and only for half the agents
Microdata ✓ (50%)
RDFa ✓ (50%)
Microformats2 ✓ (50%)
All three are inline, attribute-level markup baked into the HTML body. They surfaced together in every run — they behave as a single extraction pathway. ChatGPT, Claude, and Gemini read them. Copilot, Perplexity, Google AI Mode, and Cursor did not.
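One plausible reading of this split is that some agents work from the text of the page body and discard everything else. The sketch below is an assumption for illustration, not the pipeline of any agent in the study: a toy extractor built on Python's standard `html.parser` that skips `<head>`, `<script>`, and `<style>` and ignores comments and attribute values. The page and its product markers are hypothetical.

```python
from html.parser import HTMLParser

class BodyTextExtractor(HTMLParser):
    """Toy reader: keeps text nodes, drops head/script/style content."""
    SKIPPED = {"head", "script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Attribute values (data-*, itemprop names, meta content) are never read.
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

# Hypothetical test page: each marker lives in exactly one embedding format.
PAGE = """<html><head>
<meta property="og:title" content="OG-ONLY-PRODUCT">
<script type="application/ld+json">{"@type": "Product", "name": "JSONLD-ONLY-PRODUCT"}</script>
</head><body>
<!-- COMMENT-ONLY-PRODUCT -->
<p data-product="DATA-ATTR-ONLY-PRODUCT">A review of this season's gear.</p>
<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="name">MICRODATA-PRODUCT</span>
</div>
</body></html>"""

def extract_text(html):
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

text = extract_text(PAGE)
print(text)
```

In this toy setup only the Microdata marker reaches the extracted text, because its value is an ordinary text node inside the body. The OpenGraph, JSON-LD, comment, and data-* markers never appear.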
Your visible ads? Even worse.
5 of 10 agents missed a plain-text editorial product recommendation, a sentence any human reader would see
9 of 10 agents missed a JavaScript-rendered sponsored paragraph, the exact mechanism your ad network uses
This isn’t ad-blocking. The agents aren’t refusing to show your ad. They never see it. They fetch raw HTML. JavaScript doesn’t run. Your ad tag never fires. Your impression pixel never loads.
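To make the mechanism concrete, here is a small sketch of what a fetch-only reader has to work with. It assumes a hypothetical page whose sponsored paragraph is written into an empty slot by a script, and it uses Python's standard `html.parser` to separate text present in the markup from text that exists only as script source. The class name, page, and marker are illustrative, not taken from the study's fixtures.

```python
from html.parser import HTMLParser

# Hypothetical page: the sponsored copy exists only as a string inside a
# script that a browser would run to fill the empty ad slot.
RAW_HTML = """<html><body>
<p>Editorial text about golf clubs.</p>
<div id="ad-slot"></div>
<script>
document.getElementById("ad-slot").textContent =
  "Sponsored: try the EXAMPLE-SPONSORED-PRODUCT";
</script>
</body></html>"""

class SourceClassifier(HTMLParser):
    """Separate text present in the markup from text locked inside scripts."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.markup_text = []
        self.script_source = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        bucket = self.script_source if self.in_script else self.markup_text
        bucket.append(data)

def classify(html):
    parser = SourceClassifier()
    parser.feed(html)  # parses only; nothing here executes JavaScript
    parser.close()
    return "".join(parser.markup_text), "".join(parser.script_source)

markup_text, script_source = classify(RAW_HTML)
MARKER = "EXAMPLE-SPONSORED-PRODUCT"
print(MARKER in markup_text)    # False: the ad slot is still empty
print(MARKER in script_source)  # True: the copy is only unexecuted code
```

Without a JavaScript engine the slot stays empty, so a reader that summarizes the markup has no sponsored paragraph to report.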
Five Findings That Should Worry You
1. “Visible to humans” ≠ “visible to AI”
On one test page, an agent surfaced hidden <head>-injected markup that no human would ever see — while missing a visible product recommendation in the article body of a different page.
The agent followed document order in the raw HTML source, not what a human eye would find on the rendered page.
“Visible to humans” and “early in the fetched document” are not the same thing. For some agents, the latter matters more than the former.
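The difference between source order and visual order is easy to see on a toy page. The sketch below assumes a hypothetical page that carries both a head-injected product marker and a visible recommendation (in the study these appeared on different pages), and ranks the markers by their character offset in the raw source. The `source_order` helper is ours.

```python
# Hypothetical page: one marker injected into <head>, one in visible body copy.
PAGE = """<html><head>
<span itemprop="name">HEAD-INJECTED-PRODUCT</span>
</head><body>
<h1>Club review</h1>
<p>Intro paragraph.</p>
<p>We recommend the BODY-RECOMMENDED-PRODUCT for beginners.</p>
</body></html>"""

def source_order(html, markers):
    """Return the markers sorted by first character offset in the raw source."""
    found = [(html.find(marker), marker) for marker in markers]
    return [marker for offset, marker in sorted(found) if offset >= 0]

order = source_order(PAGE, ["BODY-RECOMMENDED-PRODUCT", "HEAD-INJECTED-PRODUCT"])
print(order)
# prints ['HEAD-INJECTED-PRODUCT', 'BODY-RECOMMENDED-PRODUCT']
```

A human sees only the body recommendation. A reader that walks the raw source top to bottom meets the head-injected marker first.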
2. Google’s own AI products disagree with each other
This was the sharpest result in the study. Same pages. Same content. Same prompts.
Gemini: 100%, surfaced and followed all links
Google AI Mode: 0%, surfaced nothing on any page
Same company. Same model family. Opposite behavior. “Google AI” is not one visibility channel; it’s at least two, with opposite outcomes for your embedded content.
3. How the user asks changes everything
One agent surfaced zero formats under an unprimed prompt (“read this page”) and four distinct formats under a commercially framed prompt (“find products on this page”) — on the identical page.
Your commercial visibility can swing from 0% to 100% based on how the user phrases their question. And you don’t control the prompt.
4. When AI can’t reach your page, it invents product descriptions
ChatGPT Free tried to follow three tracked shortlinks. All three returned server errors. Instead of reporting failure, it generated “probable item matches” — fabricated product descriptions for a page it never accessed.
The landing page contained zero product content. It was a blank research tracking page.
“A broken affiliate link and a fabricated product description look the same to the user — unless they click through themselves.”
5. One AI figured out it was being tested — and quit
Claude Free followed one shortlink, read the landing page disclosure, and stopped. It told the user the products were fabricated test fixtures, the links were planted to measure AI behavior, and continuing would only generate tracking events.
It understood the experiment. It made a judgment call to stop participating. No other agent did this.
Worse Than Ad-Blockers
Ad blockers suppressed 15–25% of display impressions at peak adoption. The industry spent years building workarounds. Those workarounds worked because ad blockers still ran inside a browser — they filtered scripts, but JavaScript still executed.
AI reader-mode agents don’t execute JavaScript at all. No script runs. No ad tag fires. No impression pixel loads. The rendering layer your entire ad stack depends on doesn’t exist.
“Ad blockers blocked the ad. AI agents never opened the tab.”
If AI agents become how people read the web, the impact on display advertising won’t be like ad-blocking.
It’ll be like the browser itself being removed.
What Actually Works (For Now)
| Tier | What It Is | Visibility | Reality Check |
|---|---|---|---|
| Tier 0 | Plain product mentions in your article text | 50% | Best option. Still missed by half of agents. |
| Tier 1 | Inline structured markup (Microdata, RDFa, Microformats2) | 50% | Only ChatGPT, Claude, Gemini. Not Copilot, Perplexity, AI Mode. |
| Tier 2 | JavaScript-rendered ads (ad networks, programmatic display) | 10% | Only browser extensions. Invisible to all reader-mode agents. |
| Tier 3 | JSON-LD, OpenGraph, Dublin Core, meta tags, data-*, script blocks, link rel, comments | 0% | Completely invisible. Every agent. Every page. Every prompt. |
What This Means If You Run a Website
Your JSON-LD is invisible
Google recommends it for search. AI agents ignore it completely. Zero of twelve agents surfaced JSON-LD product data.
Your display ads don’t exist
JS-rendered ads aren’t in the document AI agents read. This isn’t a bug — it’s architectural. The rendering layer is gone.
Microdata/RDFa is your best bet
The only structured data any agents read. But it only reaches about half of them — the direct-LLM half.
Plain text is still #1
Product mentions written directly in your article copy remain the most reliable signal — and even those are missed by half.
“Google AI” is not one channel
Gemini and AI Mode produce opposite results on identical pages. Optimizing for one may mean nothing to the other.
The user’s prompt matters as much as your page
Visibility can swing from 0% to 100% based on how the user phrases their question. You can’t control the prompt.
Don’t trust AI-generated “verified” product descriptions
When link-following fails, at least one major agent fabricates product descriptions instead of reporting the error. Your landing page may never have been visited — even when the user sees a detailed product summary.
About This Study
Authors: ThePricer Media, LLC — New York, NY, USA
What: Controlled experiment with live web pages, tracked shortlinks, and standardized prompts across 12 AI agents and 15 embedding formats.
Status: Preprint — AAVT Study 1, Round 1 (February 2026). Subsequent rounds will retest with updated agents and additional formats.
Data: Full surfacing matrix, follow-through matrix, and run-level summary available as open CSV files.
License: CC BY 4.0 | Contact: [email protected]
This page is a plain-language summary. For full methodology, statistical breakdowns, prompt text, and complete run logs, read the paper.
