Insights on AI advertising, publisher monetisation, and the future of content discovery.
Query fan-out is the technique behind Google's AI Mode and other AI search engines: a single question is silently broken into dozens - or, in deep research modes, hundreds - of background searches, each pulling from different sources. For publishers it multiplies how often your content is read while collapsing the clicks you receive towards zero.
ChatGPT Search is the live web retrieval mode inside ChatGPT that answers a question with a synthesised response and a short list of cited links, instead of returning a page of results. For publishers it is the single largest source of AI referral traffic and the AI surface most likely to read their content, yet it returns a referral for only a fraction of what it crawls, and OpenAI has confirmed it will not share its new advertising revenue with publishers.
The crawl-to-refer ratio counts how many pages an AI platform crawls for every visitor it sends back. Cloudflare data puts the imbalance in the tens of thousands to one for the heaviest operators, against single digits for traditional search. It is the cleanest single number proving that AI reads publisher content far more than it returns traffic.
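As a rough sketch of the arithmetic, the ratio is crawled pages divided by referred visitors. The function name and the figures below are illustrative assumptions, not Cloudflare data.

```python
def crawl_to_refer_ratio(crawls: int, referrals: int) -> float:
    """Return pages crawled per referred visitor.

    Returns infinity when pages were crawled but no visitor was sent back,
    and 0.0 when nothing was crawled at all.
    """
    if crawls < 0 or referrals < 0:
        raise ValueError("counts must be non-negative")
    if referrals == 0:
        return float("inf") if crawls else 0.0
    return crawls / referrals


# Hypothetical figures for two made-up platforms.
print(crawl_to_refer_ratio(50_000, 2))  # 25000.0
print(crawl_to_refer_ratio(12, 3))      # 4.0
```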
NLWeb is an open Microsoft project that turns any website into a conversational AI app and a Model Context Protocol server, so people and agents can query your content in plain language. It runs on data publishers already publish, Schema.org and RSS, and lets sites join the agentic web on their own terms. What it does not settle is how a publisher gets paid when the agent takes the answer and the reader never lands.
Schema markup helps AI systems read and understand your content, but the strongest 2026 evidence shows it does not directly cause AI citations. Authority, content quality and clean retrievable text are the real drivers, and thin or generic schema can even reduce visibility. Treat structured data as comprehension hygiene, not a citation lever.
Grok is xAI's AI assistant, built natively into X and available at grok.com, and by early 2026 it had become the third most-used chatbot in the United States. For publishers it is one of the most extractive AI surfaces in the market: it reads the open web and live X posts to answer questions, returns almost no referral traffic, and runs a retrieval crawler that routinely ignores robots.txt by presenting itself as an ordinary human visitor.
Meta AI is the most widely used consumer AI assistant in the world, built into WhatsApp, Instagram, Facebook and Messenger and reaching over a billion people a month. For publishers it is close to a pure extraction surface: it trains on web content and answers questions inside Meta's apps, yet returns almost no referral traffic. Its new Facebook AI Mode goes further, answering searches from Meta's own social graph rather than the open web.
Microsoft Copilot is an AI assistant built on the Bing index, now embedded across Windows, Microsoft 365, and Bing search itself. With 420 million monthly active users as of Q1 2026, it is one of the largest AI surfaces consuming publisher content - yet one documented case found 48,000 Copilot citations generated just 14 clicks. Here is what publishers need to understand about how it works, what the traffic data actually shows, and what they can do.
Whether AI systems can access your subscriber content depends almost entirely on how your paywall is built. Publishers using client-side JavaScript overlay paywalls are delivering the full article text to any requesting client before the subscription prompt fires - because the HTML leaves the server before JavaScript runs. Hard, server-side authentication is the only architecture that reliably stops both traditional AI crawlers and the newer generation of AI browsers.
Siri AI is Apple's rebuilt AI assistant, announced at WWDC 2026 on 8 June and running on Google's Gemini models. It can answer virtually any web question before a user opens a browser, adding a third major AI answer surface - alongside Google AI Mode and ChatGPT - that publishers supply content for but cannot yet monetise.
Measuring the return on GEO investment is genuinely harder than measuring SEO, because most AI influence is zero-click and never enters your web analytics. A proper GEO ROI framework works across four layers - citation visibility, branded search lift, AI-attributed traffic, and downstream revenue - and requires tools beyond GA4 to prove the number.
AI deep research tools - from OpenAI's Deep Research to Google Gemini Deep Research and Perplexity Deep Research - consume publisher content at a scale that dwarfs ordinary chatbot searches, while sending almost no referral traffic in return. A single query can draw on dozens to hundreds of source pages; publishers whose content powers those reports receive neither the visit nor an ad impression.
Advertising inside LLM responses now takes three distinct forms: platform-native ads sold by the AI company, product-feed placements served inside shopping queries, and content-layer ads placed in the publisher content an assistant reads. Here is how each works, who gets paid, and what the 2026 landscape actually looks like.
Web Bot Auth is an emerging IETF standard that lets publishers cryptographically verify which company an AI crawler actually belongs to - rather than trusting an easily spoofed user-agent string. Google's AI-browsing agent already signs requests; OpenAI's ChatGPT agent does too. For publishers, it is the first technically enforceable answer to the question: is this really who it claims to be?
Cloudflare's Content Signals Policy adds machine-readable directives to robots.txt that let publishers say whether their content may be used for search, AI answers, or AI training. It is worth adopting as a statement of intent and a legal hook, but it controls preferences, not access, and earns nothing from the AI traffic it permits.
A zero-click search is one that ends without the user clicking through to any website, because the answer is delivered on the results page itself. In early 2026 around two thirds of Google searches end this way, and the rate climbs past 80% when an AI Overview appears. The visit that used to fund the page is increasingly never made.
Google AI Mode is a dedicated, Gemini-powered conversational search experience that answers a query inside a chat thread instead of returning a page of blue links. It passed a billion monthly users within a year of launch, and because people can research, follow up and act without leaving Google, it removes the click publishers depend on. The content still feeds the answer; the visit, the ad impression and the affiliate link increasingly do not happen.
AI browsers such as ChatGPT Atlas and Perplexity Comet put an agent between the reader and the page, summarising content and completing tasks so people click through far less often. For publishers that means fewer page views, attribution that collapses into 'direct', and a fast-growing slice of audience that conventional analytics cannot see. The browsers that actually pay, like Comet Plus, are still the exception.
AdCP, the Ad Context Protocol, is an open standard that lets AI agents from advertisers, publishers and ad tech platforms negotiate and transact media in a shared language. Launched in late 2025 and built on Anthropic's Model Context Protocol, it is being called the OpenRTB of the agentic era. It standardises how inventory is bought - not whether an AI answer that shows no ad ever pays the publisher behind it.
RSL is an open standard that lets publishers attach machine-readable licensing terms - including pay-per-crawl and pay-per-inference fees - to the content AI crawlers take. More than 1,500 organisations now back it. No major AI company has yet agreed to honour it, and that gap is the whole story.
llms.txt is a plain Markdown file that hands AI models a clean, curated summary of your site so they can read it without wading through HTML. It is genuinely useful for documentation and in-product AI retrieval, but most major AI crawlers still ignore it and it does nothing to get you paid. Publishers should treat it as a comprehension aid, not a monetisation or visibility strategy.
robots.txt cannot reliably stop AI crawlers, because it is a voluntary instruction, not an enforcement mechanism. Here is what it can and cannot do, and where real enforcement and monetisation actually happen.
Most publishers should do both: block the crawlers that take value and give nothing back, and monetise the ones that represent a real audience. Here is a decision framework, crawler by crawler and page by page.
Information Agents are one layer of a complete agentic commerce stack Google announced at I/O 2026. Discovery, answer, basket, payment, execution - all Google-owned, all terminating inside Google's environment. Here's where the publisher position sits, and why the contractual layer is the next twelve months of monetisation.
The Google-Agent user agent string can be spoofed. The IP range, reverse DNS, and Web Bot Auth signatures can't. Three verification methods, in increasing order of reliability, and why the cryptographic layer changes the publisher position more than the agent product launch itself.
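The reverse DNS method can be sketched as a forward-confirmed lookup: resolve the IP to a hostname, check the hostname's domain, then resolve that hostname back and confirm it returns the same IP. This is a minimal sketch, not any operator's documented procedure. The `crawler.example` suffix and the function name are hypothetical, the forward lookup covers IPv4 only, and the resolvers are injectable so the logic can be exercised without network access.

```python
import socket
from typing import Callable, List, Sequence


def verify_crawler_ip(
    ip: str,
    allowed_suffixes: Sequence[str],
    reverse: Callable[[str], str] = lambda addr: socket.gethostbyaddr(addr)[0],
    forward: Callable[[str], List[str]] = lambda host: socket.gethostbyname_ex(host)[2],
) -> bool:
    """Forward-confirmed reverse DNS check for a claimed crawler IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    # The hostname must be the allowed domain itself or a subdomain of it.
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False

    try:
        # The hostname must resolve back to the IP that made the request.
        return ip in forward(host)
    except OSError:
        return False


# Offline usage with stubbed resolvers (hypothetical domain and addresses).
ok = verify_crawler_ip(
    "192.0.2.10",
    ["crawler.example"],
    reverse=lambda addr: "bot-1.crawler.example",
    forward=lambda host: ["192.0.2.10"],
)
print(ok)  # True
```

In production the allowed suffixes would come from the operator's own published documentation, and the default resolvers would perform live DNS lookups.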
Publisher training bot policies cover the first AI dependency. They don't cover the second. Here's the four-week plan we're recommending to our publisher partners - and why the Q4 commercial conversations are being framed this summer.
At I/O 2026, Google announced Information Agents - persistent AI processes that scan the live web on a user's behalf. It's Live Search Agent traffic, productised, shipping this summer.
Automated traffic is now more than half of the web, AI crawlers make tens of billions of requests a day, and publishers lose an estimated $2 billion a year to AI bots. Here are the key AI bot traffic statistics for 2026, with sources.
AI assistants cite the sources they retrieve, trust, and can extract cleanly, favouring content that is factual, structured, attributed, and recent. Here is how the decision works, why it differs by assistant, and how to become a cited source.
Advertising in AI answers can be brand safe, but the risks differ from display and depend on the model you use. Here is what changes in a generated answer, and which approach gives advertisers and publishers the most control.
The AI bots visiting your site fall into three groups: training crawlers, retrieval crawlers, and live agents. Here is a 2026 reference to the names you will see, from GPTBot to ChatGPT-User, and what to do about each.
Agentic commerce is AI agents carrying out commercial tasks for people, from research to checkout, often without a website visit. Publishers are still the source agents read. Here is what that means for your position in the loop and your revenue.
SEO optimises to rank a link that a user clicks. GEO optimises to be the source an AI assistant cites in its answer. Here is where the two overlap, where they differ, and what GEO asks for that SEO never did.
AI bot traffic is worth more than most publishers realise, because the value is in the intent behind the read, not the visit count. Most sites earn nothing from it today. Here is what determines its value and how to price your own.
AI bots do not appear in Google Analytics because they never run the JavaScript that records a visit. The read still happened, and you still earned nothing from it. Here is why the blind spot exists and how to actually see your AI traffic.
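One way to see these reads is to count them in server access logs, which record every request whether or not JavaScript runs. The sketch below assumes log lines that include the user-agent string. The token list is illustrative rather than complete, the sample lines are made up, and because user-agent strings can be spoofed the counts are an estimate, not verified traffic.

```python
from collections import Counter
from typing import Iterable

# Illustrative user-agent tokens; extend with the bots seen in your own logs.
AI_BOT_TOKENS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")


def count_ai_bot_hits(log_lines: Iterable[str]) -> Counter:
    """Count requests per AI bot token, matching case-insensitively."""
    counts: Counter = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # attribute each request to one bot only
    return counts


# Simplified, hypothetical log lines.
sample = [
    '203.0.113.5 "GET /a HTTP/1.1" 200 "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 "GET /b HTTP/1.1" 200 "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"',
    '203.0.113.7 "GET /a HTTP/1.1" 200 "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"',
    '203.0.113.5 "GET /c HTTP/1.1" 200 "Mozilla/5.0 (compatible; GPTBot/1.0)"',
]
hits = count_ai_bot_hits(sample)
print(hits["GPTBot"], hits["ChatGPT-User"])  # 2 1
```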
Brand share of voice in AI answers is how often your brand appears in AI responses for a set of questions, measured per assistant. Here is how to calculate it, why ChatGPT and Perplexity give different results, and how to turn the number into action.
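A minimal sketch of the calculation, assuming you have already collected each assistant's answers to the same question set: for each assistant, divide the answers that mention the brand by the total answers. The brand names, assistant labels and answers here are made up, and plain substring matching is a simplification that a real measurement would refine.

```python
from typing import Dict, List


def share_of_voice(answers_by_assistant: Dict[str, List[str]], brand: str) -> Dict[str, float]:
    """Fraction of answers mentioning the brand, computed per assistant."""
    needle = brand.lower()
    result: Dict[str, float] = {}
    for assistant, answers in answers_by_assistant.items():
        if not answers:
            result[assistant] = 0.0
            continue
        mentions = sum(needle in answer.lower() for answer in answers)
        result[assistant] = mentions / len(answers)
    return result


# Hypothetical answers from two assistants to the same questions.
answers = {
    "assistant_a": [
        "Try Acme or Globex.",
        "Globex is popular.",
        "Acme leads here.",
        "No clear winner.",
    ],
    "assistant_b": ["Globex only.", "Globex again."],
}
print(share_of_voice(answers, "Acme"))  # {'assistant_a': 0.5, 'assistant_b': 0.0}
```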
CDN-layer monetisation earns revenue from traffic at the network edge, before any JavaScript runs. It matters because AI agents never run the browser code traditional monetisation depends on. Here is how it works and why the edge is the only layer that can monetise AI traffic.
Generative engine optimisation is the practice of optimising content so AI assistants cite it as a trusted source. GEO competes for citations, not rankings. Here is how it differs from SEO and what makes content more likely to be quoted.
To be mentioned in AI answers, your brand has to be in the content assistants read. There are two routes: organic citation and paid placement at the moment of retrieval. Here is how each works and how to measure whether it is working.
A Live Search Agent is a bot an AI assistant sends to read a specific page in real time to answer a user's question. It is different from a training crawler, it never runs JavaScript, and it is invisible to standard analytics. Here is why it matters to publishers.
AI Overviews answer queries in place, and the referral traffic they remove is not coming back in its old form. Here is what the data shows, why the loss is structural, and the three-part response that protects revenue and opens a new line from AI agent traffic.
TollBit and Cloudflare's pay per crawl both charge AI bots for access. Ad injection earns from the read itself. Here is how the three models differ, which traffic each one captures, and why most publishers will end up using more than one.
AI licensing deals pay a small number of very large publishers well, and almost everyone else little or nothing. Here is why the headline agreements stay out of reach for most publishers, and what the uncompensated majority can do instead.
There are three ways to monetise AI bot traffic: license your content, charge crawlers for access, or inject paid contextual brand mentions into the pages AI agents read. Licensing suits a handful of large publishers. Here is how each model works, and which one captures the Live Search Agent traffic the others miss.
We launched a free domain audit this week. Before you request one, here's exactly what the report shows: what's real data, what's modelled, and why we're being transparent about the difference.
Google quietly added a new bot that bypasses robots.txt by design. And 23 major publishers are blocking the Wayback Machine, while their own journalists depend on it. Two developments every publisher commercial team needs to understand.
Your domain audit includes a brand share of voice table. Here's what the multiplier actually means, how to interpret high and low scores, and how to use the data in your next advertiser conversation.
Publishers are making editorial and commercial decisions on incomplete data. The fastest-growing segment of their audience doesn't appear in Google Analytics.
$15.02 average CPM from AI agent monetisation in live publisher accounts. Publishers with that same traffic on their sites are currently earning $0 from it.
ChatGPT, Claude, and Perplexity are now the first stop for millions of purchasing decisions. Here's what that means for publishers and brands.
AI retrieval events aren't bot visits. They're audience data: high-intent signals connected to real purchase decisions. The publishers who start reading them will understand their readership better than anyone.
A practical guide for content publishers ready to turn AI bot traffic into a new revenue stream - from integration to optimisation.
To understand the cost, you need to understand what publishers have spent two decades building on top of HTML.
Early data from Blankspace campaigns suggests AI-cited brand mentions drive purchase intent at rates significantly above traditional digital channels.