Inside the live search agent identity layer.

Part 3 of 4: the technical breakdown publishers and their WAF teams need.

TL;DR

The Google-Agent user agent string can be spoofed. Don't make policy decisions on the string alone.
Three verification methods exist, in increasing order of reliability: user agent string, IP and reverse DNS, Web Bot Auth.
Web Bot Auth is the substantive shift. Cryptographic, signed at the request level, already in production at Cloudflare, Akamai, and AWS WAF.
For most publishers, the implementation is "your WAF vendor is already doing it; configure it to surface verified agents as a distinct traffic class."
Verification is what makes the week-three policy in the playbook meaningful. Without it, every contractual conversation in week four starts on weaker ground.

In part 2 we said the publisher policy decision in week three has three answers: verify and allow at full content, serve a degraded response, or block at the WAF.

All three answers depend on being able to verify in the first place.

This is the piece on how.

The verification problem is older than Google-Agent.

Anyone can put Googlebot in a user agent string. Bad actors have been doing it for fifteen years. The user agent is a request header. It's set by the client. Trusting it without verification is a dangerous game.

What's changed at I/O 2026 is the solution.

Three methods to verify a live search agent, in increasing order of reliability.

Method 1: The user agent string

The first line of defence is also the weakest.

Google publishes two Google-Agent strings, one mobile and one desktop. Both follow the standard Googlebot structure but identify themselves clearly. The desktop variant:

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; compatible; Google-Agent; +https://developers.google.com/crawling/docs/crawlers-fetchers/google-agent) Chrome/W.X.Y.Z Safari/537.36

The W.X.Y.Z is a Chrome version placeholder.

That's it.

Use the user agent string for one thing only: preliminary filtering. Pulling the rows out of your access logs that claim to be Google-Agent so you can verify them properly downstream.

Never make a policy decision based on the string alone.

Google says so in its own documentation*. So does every honest WAF vendor. The string is a flag, not a credential.
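A minimal sketch of that preliminary filter, assuming your access logs have already been parsed into (ip, user_agent) pairs. The function names are hypothetical; the only thing taken from the string above is the Google-Agent token. The output is a list of claims to verify, not a list of verified agents.

```python
import re

# Token that appears in the Google-Agent user agent string shown above.
AGENT_TOKEN = re.compile(r"\bGoogle-Agent\b")


def claims_google_agent(user_agent: str) -> bool:
    """True if the user agent string *claims* to be Google-Agent (unverified)."""
    return bool(AGENT_TOKEN.search(user_agent))


def candidate_rows(rows):
    """Yield the (ip, user_agent) pairs that need proper verification downstream."""
    for ip, user_agent in rows:
        if claims_google_agent(user_agent):
            yield ip, user_agent
```

Everything this returns is still untrusted. It only narrows the set of rows you run through Method 2 or Method 3.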

Method 2: IP range and reverse DNS

The historical method. Slow, but it works.

Google publishes the IP ranges Google-Agent uses in a dedicated JSON file: user-triggered-agents.json*. This is separate from the existing user-triggered-fetchers.json and user-triggered-fetchers-google.json, which is a deliberate choice. Google has carved out agentic traffic as its own infrastructure category.

Step one: fetch the JSON, build it into your firewall or log pipeline, and check that the source IP of any request claiming to be Google-Agent matches one of the documented ranges.
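A rough sketch of step one, using only the Python standard library. It assumes user-triggered-agents.json follows the same schema as Google's other published range files (a top-level "prefixes" list whose entries carry either "ipv4Prefix" or "ipv6Prefix"); check the live file before relying on that. Fetching and refreshing the file is left out, so the function takes the already-parsed JSON.

```python
import ipaddress


def load_networks(ranges_doc: dict) -> list:
    """Convert the parsed JSON document into ip_network objects.

    Assumes the schema {"prefixes": [{"ipv4Prefix": ...}, {"ipv6Prefix": ...}]}.
    """
    networks = []
    for entry in ranges_doc.get("prefixes", []):
        prefix = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if prefix:
            networks.append(ipaddress.ip_network(prefix))
    return networks


def ip_in_ranges(ip: str, networks) -> bool:
    """True if the source IP falls inside any documented range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)
```

The addresses you would pass in are the source IPs of requests whose user agent claims to be Google-Agent. A miss here is enough to treat the claim as unverified.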

Step two, if you want full assurance: reverse DNS lookup. The Google-Agent reverse DNS pattern is one of two:

***-***-***-***.gae.googleusercontent.com
google-proxy-***-***-***-***.google.com

The first pattern is Google-owned. The second is the user-owned proxy variant. Both are legitimate.

Step three, to defeat reverse DNS spoofing: forward DNS lookup on the result. If the forward lookup resolves back to the original IP, the request is verified.

Steps two and three together are the reverse-then-forward DNS check Google documents for all of its crawlers, going back to Googlebot.
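The reverse and forward lookups can be sketched with the standard library as below. The function names are hypothetical, and the hostname check is deliberately loose (a suffix and prefix match against the two patterns above) rather than a strict match on the masked octets. The resolvers are passed in as parameters so the logic can be exercised without live DNS.

```python
import ipaddress
import socket


def _reverse(ip: str) -> str:
    """PTR lookup: IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname: str) -> set:
    """A/AAAA lookup: hostname to the set of addresses it resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def hostname_matches(hostname: str) -> bool:
    """Loose check against the two documented reverse DNS patterns."""
    if hostname.endswith(".gae.googleusercontent.com"):
        return True
    return hostname.startswith("google-proxy-") and hostname.endswith(".google.com")


def verify_by_dns(ip: str, reverse=_reverse, forward=_forward) -> bool:
    """Reverse lookup, pattern check, then forward lookup back to the same IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    if not hostname_matches(hostname):
        return False
    try:
        resolved = forward(hostname)
    except OSError:
        return False
    target = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(addr) == target for addr in resolved)
```

The forward lookup is the part that defeats spoofing: anyone who controls the reverse zone for their own IP can make it return a Google-looking hostname, but they cannot make Google's forward zone point back at them.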

It works. It's also expensive. Reverse DNS is slow, adds latency to every request, and doesn't scale well at high traffic volumes. Most publishers run the check on a sample of requests rather than on every request, accepting that a small percentage of spoofed requests get through.

Which is fine until it isn't.
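To make the trade-off concrete, here is a small sketch of sample-based checking. The class name, the 10% default sample rate, and the per-IP verdict cache are all assumptions for illustration, not something Google or any WAF vendor prescribes. The expensive check is passed in as a callable, and the random source and clock are injectable so the behaviour is repeatable.

```python
import random
import time


class SampledVerifier:
    """Run an expensive per-IP check on a sample of requests and cache verdicts."""

    def __init__(self, verify, sample_rate=0.1, ttl_seconds=3600,
                 rng=random.random, clock=time.monotonic):
        self._verify = verify            # callable: ip -> bool (the expensive check)
        self._sample_rate = sample_rate  # fraction of uncached requests to check
        self._ttl = ttl_seconds          # how long a verdict stays trusted
        self._rng = rng
        self._clock = clock
        self._cache = {}                 # ip -> (verdict, expiry time)

    def check(self, ip: str):
        """Return True or False if a verdict is known, None if not sampled."""
        now = self._clock()
        cached = self._cache.get(ip)
        if cached is not None and cached[1] > now:
            return cached[0]
        if self._rng() >= self._sample_rate:
            return None  # unsampled: this is where a spoofed request slips through
        verdict = self._verify(ip)
        self._cache[ip] = (verdict, now + self._ttl)
        return verdict
```

The None branch is the gap the paragraph above describes. Raising the sample rate shrinks it at the cost of more lookups.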

Method 3: Web Bot Auth

The substantive shift.

Web Bot Auth is an IETF Internet-Draft (draft-meunier-web-bot-auth-architecture) that brings cryptographic identity to bot traffic. Instead of trusting a user agent string or running expensive DNS lookups, the agent signs every HTTP request with a private key. The server verifies the signature against the agent's published public key, in milliseconds, per request.

Google is "experimenting" with Web Bot Auth, in its own framing, and uses the identity https://agent.bot.goog for Google-Agent.

The architecture, in one paragraph:

The agent publishes its public key to a well-known directory. Each outbound request carries HTTP Message Signature headers (defined in RFC 9421), with a signature covering the request components the signer selects, for example the method, target URL, or specific headers. The server fetches the public key once, caches it, and verifies the signature on every request. If the signature is valid and the key matches the directory, the agent is who it says it is. If not, the identity claim is rejected.
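For teams who want to see what is actually being signed, the sketch below rebuilds the RFC 9421 signature base from a request. It is an illustration under several assumptions that do not come from this article: the Signature-Input and Signature-Agent header names and the tag="web-bot-auth" parameter follow the drafts, the covered components in the example are one plausible choice, and only plain component names without parameters are handled. The final step, checking the signature bytes against the published public key, needs a cryptography library and is left out.

```python
import re


def parse_signature_input(value: str):
    """Split 'label=("a" "b");param=...' into (label, components, raw params)."""
    label, _, params = value.partition("=")
    match = re.match(r"\(([^)]*)\)", params)
    if not match:
        raise ValueError("Signature-Input has no covered component list")
    components = re.findall(r'"([^"]+)"', match.group(1))
    return label.strip(), components, params.strip()


def signature_base(signature_input: str, authority: str, headers: dict) -> str:
    """Rebuild the string the agent signed (RFC 9421 signature base)."""
    _, components, params = parse_signature_input(signature_input)
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    lines = []
    for name in components:
        if name == "@authority":
            value = authority.lower()
        elif name.startswith("@"):
            # Other derived components (@method, @target-uri, ...) are omitted here.
            raise ValueError(f"derived component {name} not handled in this sketch")
        else:
            value = lowered[name]  # KeyError if a covered header is missing
        lines.append(f'"{name}": {value}')
    lines.append(f'"@signature-params": {params}')
    return "\n".join(lines)
```

In production this is the part your WAF does for you. The sketch is useful mainly for understanding why a replayed or altered request fails: change any covered component and the rebuilt base no longer matches what was signed.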

What this means in practice:

Verification happens at the request layer, not the network layer. No DNS round trips. No IP allowlisting.
The verification is cryptographic. The user agent string becomes informational; the signature is the credential.
Identity becomes portable across IPs. Google-Agent traffic from a Google-owned IP and from a user-owned proxy both verify the same way, as long as both sign with the documented key.

Cloudflare, Akamai, and AWS WAF have already implemented Web Bot Auth verification*. Amazon (in AgentCore Browser) and Stytch have shipped support for the protocol as well.

For most publishers, the implementation isn't a build. It's a configuration:

Confirm your WAF supports Web Bot Auth. If you're on Cloudflare, Akamai, AWS, or a similar tier-one provider, it already does.
Enable verification.
Configure your WAF to tag verified Google-Agent requests as a distinct traffic class.
Route those tagged requests according to the policy you set in week three of the playbook.

That's the work.
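In a real deployment this lives in WAF rules, not application code, but the tagging and routing logic is small enough to sketch. The class names, the three traffic-class labels, and the default of blocking unverified claims are illustrative assumptions. The three actions mirror the week-three answers: allow at full content, serve a degraded response, or block.

```python
from enum import Enum


class Action(Enum):
    ALLOW_FULL = "allow_full"
    DEGRADE = "degrade"
    BLOCK = "block"


def classify(claims_agent: bool, signature_verified: bool) -> str:
    """Tag a request with a traffic class based on the WAF's verification result."""
    if signature_verified:
        return "verified_agent"
    if claims_agent:
        return "unverified_claim"  # says Google-Agent but could not be verified
    return "other"


def route(traffic_class: str, verified_policy: Action,
          unverified_policy: Action = Action.BLOCK):
    """Return the action for agent traffic, or None to defer to existing rules."""
    if traffic_class == "verified_agent":
        return verified_policy
    if traffic_class == "unverified_claim":
        return unverified_policy
    return None
```

The point of the distinct class is that the policy becomes a single setting. Changing verified_policy from Action.DEGRADE to Action.ALLOW_FULL after a commercial agreement is a one-line change, and the logs on either side of it are comparable.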

Why verification matters commercially

Verification is what makes the week-four conversation possible.

If you can't verify which agentic traffic is real, you can't measure it. If you can't measure it, you can't price it. If you can't price it, the commercial conversation in Q4 is being had by someone else about your content.

A verified-agent traffic class in your logs is also the basis for the audit trail.

When the contractual conversation happens - and it will - the publisher with verifiable, signed, request-level evidence of how Google-Agent has been interacting with their content for the previous quarter is in a different room from the publisher without it.

This is why Web Bot Auth matters more than the underlying agent product launch.

The product launch is one company shipping one feature.

The cryptographic identity layer is the standard the entire agentic web is going to be built on. Google's is one of many identities that will publish keys at *.bot.* domains over the next twelve months. OpenAI will. Anthropic will. So will every agent product that wants to be allowed through publisher WAFs at scale.

The verification work you do now compounds.

It also stops being a Google story. It becomes a publisher infrastructure story, where the publishers with mature verification capability sit upstream of every agentic surface the open web is being rebuilt around.

Where this leaves the series

Part 4 is the synthesis.

The structural shift to agentic commerce, why Information Agents are the front-end of a much bigger Google stack, and what the publisher position looks like once the verification and instrumentation work in parts 2 and 3 is done.

Part 4 in this series: the structural shift. Why Information Agents are the surface, not the system, and what comes next.

Sources:
* https://developers.google.com/crawling/docs/crawlers-fetchers/google-user-triggered-fetchers
* https://developers.google.com/static/crawling/ipranges/user-triggered-agents.json
* https://developers.cloudflare.com/bots/reference/bot-verification/web-bot-auth/
* https://datatracker.ietf.org/doc/draft-meunier-web-bot-auth-architecture/