Do Zoo Sites Block AI Crawlers? 1 of 10 Do

Jun 19, 2026

Zoos live by being found. A family deciding where to spend a Saturday, a teacher planning a field trip, a traveler searching an unfamiliar city — they all start with a query, and increasingly that query is answered by an AI assistant reading a zoo's own pages. So a zoo's robots.txt, the root-level file that tells automated agents which paths they may fetch, is quietly a marketing decision.

1 of 10 zoo sites blocks at least one AI crawler.

Every zoo domain we checked returned a parseable robots.txt, so the denominator and the count are clean: ten sites had a policy, and a single one of them disallows an AI crawler. That works out to a 10% block rate. Every figure here is read straight from the sealed snapshot; nothing is estimated, modeled, or extrapolated.

The lone blocker is brookfieldzoo.org, the Chicago Zoological Society's Brookfield Zoo. The other nine policied zoos leave the door open. Against the corpus, where 317 of 1203 sites with a policy gate at least one crawler for a 26.4% rate, zoos sit well under half the average — one of the more open visitor-attraction categories in this edition.

The One Zoo That Gates, and the Nine That Do Not

What makes zoos distinctive is not how many block, but which one does — and how comprehensively. brookfieldzoo.org is the only gate in the set, and it is not a half-measure. Its robots.txt names GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Meta-ExternalAgent, Amazonbot, and Applebot-Extended — that is OpenAI, Anthropic, Google, Common Crawl, ByteDance, Meta, Amazon, and Apple, all disallowed by name. When Brookfield closes, it closes the whole leaderboard at once.

The open zoos are a roll call of the country's best-known animal parks: sandiegozoo.org, bronxzoo.com, stlzoo.org, columbuszoo.org, lazoo.org, houstonzoo.org, philadelphiazoo.org, cincinnatizoo.org, and sandiegozoowildlifealliance.org. None of them disallows an AI agent. A major metropolitan zoo runs on attendance and membership, and the pages that drive both — hours, tickets, exhibit guides, animal cams, conservation programs — are meant to be found, cited, and surfaced, including by an AI assistant fielding a question about where to see a panda. It is the same outreach instinct that keeps the aquarium category gating nothing at all.

The only zoo blocker in the set is brookfieldzoo.org, the Brookfield Zoo.

Unlike some categories in this edition, no zoo domain went silent at the seal. Every one returned a file we could parse, which is why the denominator is a clean ten rather than a number trimmed by timeouts. That makes the zoo read unusually tidy: there is no ambiguity about which sites were measured, and no domain sits in a "neither allow nor block" bucket waiting to be re-checked.

What This 10% Block Rate Actually Means

A robots.txt directive is a public request, and the zoo read is almost entirely "request granted." The honest interpretation is that, as a category, zoos behave far more like open publishers than like data fortresses. Their public content is an outreach asset rather than a competitive moat, so keeping it readable by retrieval agents extends each institution's reach rather than threatening it. For a venue that needs a steady flow of first-time visitors, being the source an answer engine quotes is free top-of-funnel reach.

Brookfield is the instructive exception. One comprehensive blocker in a ten-file sample is enough to put a number on the board, and it lands the category at a 10% rate. The small sample makes the number easy to read rather than hard: with ten policied files, each site moves the rate by ten percentage points, so the read is really a story about ten named institutions and one decision at brookfieldzoo.org. That concentration is itself the finding: in zoos, AI-access posture is not set by a broad wave of gating but by whether an individual flagship chooses to wall its pages off.

Zoo sites post a 10% AI-crawler block rate.

This is a very different shape of story than the most-gated categories in the edition. Where news sites overwhelmingly block AI crawlers because their archives are the product, zoos treat their pages as a reason to be visited. The contrast is the point: a 26.4% corpus average hides categories that range from attraction-as-outreach to data-as-asset, and zoos sit firmly on the outreach side.

Where Zoos Sit Among Similar Categories

A 10% block rate places Zoos at the open end of the ranking — gated by a single exception, not by a wave. The focused window below shows Zoos beside its nearest neighbors, verbatim from the sealed snapshot, name first and no rank column.

Category	Sites	With robots.txt	Block at least 1 crawler	Block rate
Billiards	10	9	1	11.1%
Coffee	10	9	1	11.1%
Cybersecurity	10	9	1	11.1%
Zoos	10	10	1	10%
Hunting	10	10	1	10%
Marketing	10	10	1	10%
Productivity	10	10	1	10%

Zoos share their single-blocker reading with a broad, unglamorous band — Hunting, Marketing, and Productivity all land on the same 10% mark, while Billiards, Coffee, and Cybersecurity sit a hair above at 11.1% only because a silent domain shrank their denominator. It is a crowded part of the ranking, which is itself a sign that one-in-ten is a common posture: most sites in these categories want to be readable. The extremes show what the ends look like.

Category	Sites	With robots.txt	Block at least 1 crawler	Block rate
Gaming	9	9	8	88.9%
News	20	17	14	82.4%
FastFood	10	6	0	0%
Hotels	10	3	0	0%

Zoos sit far below Gaming and News, and a notch above the zero-block floor set by fast-food and hotel chains, where none of the sites that publish a robots.txt disallows an AI crawler. The category is open by disposition, gated by exception.
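The rates in the two tables above can be recomputed from their own columns. The sketch below is illustrative only: the function name `block_rate` and the row layout are assumptions, and the numbers are copied from the tables rather than fetched. It shows that the denominator is the count of sites with a robots.txt, not the count of sites checked.

```python
# Rows copied from the tables above: (category, sites, with_robots_txt, blockers).
ROWS = [
    ("Billiards", 10, 9, 1),
    ("Zoos", 10, 10, 1),
    ("Gaming", 9, 9, 8),
    ("News", 20, 17, 14),
    ("FastFood", 10, 6, 0),
    ("Hotels", 10, 3, 0),
]


def block_rate(blockers: int, with_policy: int) -> float:
    """Percent of policied sites that block at least one AI crawler."""
    if with_policy == 0:
        raise ValueError("no policied sites, so the rate is undefined")
    return round(100 * blockers / with_policy, 1)


# Silent domains are excluded, so only the with_robots_txt column is the denominator.
rates = {
    name: block_rate(blockers, with_policy)
    for name, _sites, with_policy, blockers in ROWS
}

for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

Billiards and Zoos each have one blocker, but Billiards divides by nine policied sites and Zoos by ten, which is the whole gap between 11.1% and 10%.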

The Bots Brookfield Reaches For

The single zoo blocker is comprehensive, so the more useful corpus context is which bots get gated most broadly — the tokens an institution names first when it decides to close. The cut below shows the most-disallowed bots across all 1203 sites with a robots.txt, bot name first, count next.

Bot	Sites disallowing (of 1203)	Rate
CCBot	234	19.5%
GPTBot	210	17.5%
ClaudeBot	207	17.2%
Bytespider	203	16.9%
Meta-ExternalAgent	178	14.8%

CCBot, Common Crawl's agent, tops the corpus blocklist at 234 sites, with GPTBot and ClaudeBot close behind. brookfieldzoo.org names all five of these, and more, in its disallow group, so Brookfield is not improvising; it is gating the same crawlers the rest of the corpus gates most often, just all at once.

Corpus-wide, 317 of 1203 sites block at least one AI crawler.

How the Zoo Snapshot Was Sealed

These figures come from one point-in-time crawl of public robots.txt files, sealed June 19, 2026 under snapshot sha 040215878ac7b85a. For each zoo domain we fetched robots.txt at the root, parsed its user-agent and disallow directives, and recorded whether any AI crawler token was disallowed. We report verbatim counts; nothing is estimated, modeled, or extrapolated. Any domain that returns no parseable file is logged as silent — neither allow nor block — and excluded from the rate; in the zoo set, no domain landed there.

The counting rule is deliberately narrow. A block is an explicit Disallow aimed at a named AI agent — GPTBot, ClaudeBot, CCBot, and the other leaderboard tokens. A zoo can disallow administrative, search, or cart paths without naming an AI agent, and that does not count as an AI block here. Only a directive that names one moves a site into the blocker column, which is why the zoo count is a clean one: brookfieldzoo.org names them, the rest do not.
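The counting rule can be sketched in a few lines. This is a minimal illustration, not the pipeline behind the report. The function names are hypothetical. The token list holds only the eight agents named in this article, not the full leaderboard. Treating an empty `Disallow:` as no block, and any non-empty path as a block, is an assumption.

```python
# Partial token list: only the agents this article names (lowercased).
AI_TOKENS = {
    "gptbot", "claudebot", "google-extended", "ccbot",
    "bytespider", "meta-externalagent", "amazonbot", "applebot-extended",
}


def blocked_ai_tokens(robots_txt: str) -> set[str]:
    """Return the AI tokens that some group disallows by name."""
    blocked: set[str] = set()
    agents: list[str] = []  # user-agents of the group being read
    in_rules = False        # True once the current group has a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            # Assumption: an empty Disallow means "nothing disallowed".
            if field == "disallow" and value:
                blocked.update(a for a in agents if a in AI_TOKENS)
    return blocked


def is_ai_blocker(robots_txt: str) -> bool:
    """A site counts as a blocker only if a named AI agent is disallowed."""
    return bool(blocked_ai_tokens(robots_txt))


# Illustrative files, not any zoo's actual robots.txt.
OPEN_SAMPLE = "User-agent: *\nDisallow: /wp-admin/\nDisallow: /cart/\n"
GATED_SAMPLE = (
    "User-agent: GPTBot\nUser-agent: ClaudeBot\nDisallow: /\n\n"
    "User-agent: *\nDisallow: /search\n"
)

print(is_ai_blocker(OPEN_SAMPLE), sorted(blocked_ai_tokens(GATED_SAMPLE)))
```

A file that only disallows admin or cart paths under `User-agent: *` yields an empty set here, which mirrors the rule that path-level housekeeping does not count as an AI block.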

A note on what the snapshot deliberately does not do. It does not retry a slow host until a file appears, does not follow a redirect into a different domain's policy, and does not infer a block from a site that merely looks unfriendly to bots. Each zoo domain is read once, at seal time, exactly as it answered. That single-read rule is what lets the result be content-addressed: the sealed files are fixed, so anyone holding the snapshot identified by sha 040215878ac7b85a can re-derive the same ten policied files and the same one blocker. The method favors reproducibility over a generous reading.
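One way to picture how a sealed set of files can be tied to a digest is sketched below. The report does not say how its own sha was computed, so the serialization, the function name `snapshot_digest`, and the `.example` domains are all assumptions. The 16-character id quoted here is shorter than a full 64-character SHA-256 hex digest, so this sketch is not expected to reproduce it.

```python
import hashlib
import json


def snapshot_digest(files: dict[str, str]) -> str:
    """Hash a {domain: robots.txt body} mapping in a canonical form.

    Assumption: sorted keys and compact JSON stand in for whatever
    canonical form a real snapshot would use.
    """
    canonical = json.dumps(
        files, sort_keys=True, ensure_ascii=False, separators=(",", ":")
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Hypothetical single-read results for two placeholder domains.
sealed = {
    "zoo-a.example": "User-agent: *\nDisallow: /cart/\n",
    "zoo-b.example": "User-agent: GPTBot\nDisallow: /\n",
}
digest = snapshot_digest(sealed)
print(digest)
```

Because each domain is read once and the bytes are frozen, the same files always hash to the same value, and any later edit to a single robots.txt body changes the digest.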

Frequently Asked Questions

Q: Which zoo site blocks AI crawlers?

A: brookfieldzoo.org, the Brookfield Zoo. It is the only one of the 10 zoos with a parseable robots.txt that disallows an AI crawler, and it does so comprehensively — naming GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Meta-ExternalAgent, Amazonbot, and Applebot-Extended. That single gate is the entire 10% block rate.

Q: Why do the big zoos leave AI crawlers in?

A: Reach. sandiegozoo.org, bronxzoo.com, houstonzoo.org, philadelphiazoo.org, and the rest run on attendance, membership, and donations — their hours, ticketing, exhibit, and conservation pages are meant to be found and cited, including by AI assistants. For a venue whose business is getting visitors through the gate, being readable extends that goal rather than threatening it.

Q: Does the 10% rate cover all the zoo sites you found?

A: Yes, in this case. Every zoo domain we checked returned a parseable robots.txt, so all ten count toward the rate. In other categories, a domain that returns no parseable file is excluded as silent — but no zoo domain was silent at the seal.

Q: Does a Disallow in robots.txt actually stop an AI crawler?

A: Not by force. robots.txt is an honor-system standard: a cooperative crawler reads it and complies, but the file enforces nothing technically. brookfieldzoo.org signals that AI agents should stay out of its paths; each crawler decides whether to honor that request.
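What compliance looks like from the crawler's side can be sketched with Python's standard `urllib.robotparser`. The robots.txt text and the `zoo.example` URLs below are illustrative placeholders, not any zoo's actual file, and `may_fetch` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy: one named AI agent is shut out, everyone else
# is only kept away from the cart.
ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())


def may_fetch(agent: str, url: str) -> bool:
    """Ask the parsed policy whether this agent is permitted to fetch the URL."""
    return parser.can_fetch(agent, url)


for agent, url in [
    ("GPTBot", "https://zoo.example/tickets"),
    ("SomeOtherBot", "https://zoo.example/tickets"),
    ("SomeOtherBot", "https://zoo.example/cart/checkout"),
]:
    print(agent, url, may_fetch(agent, url))
```

The check runs entirely inside the crawler. A crawler that never calls it, or ignores a `False`, can still request the page, which is why a Disallow is a request rather than a barrier.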

Put AI-Access Data to Work

For a zoo's marketing or digital director, the person who owns how the park shows up online, this snapshot is a baseline worth watching. Most peers stay open while Brookfield gates comprehensively, and the bigger risk for an attraction is the accidental block: a developer drops a broad Disallow into robots.txt and quietly walls the ticketing and exhibit pages off from the answer engines families now ask first. You want to know the week that happens, not discover it at the next site redesign.

Set a recurring crawl that re-reads robots.txt for your own domain and a watchlist of peer zoos weekly, and alert the moment any AI crawler token appears in — or disappears from — a disallow list. US Tech Automations runs exactly that kind of scheduled robots.txt crawl with change alerts and agentic monitoring, so a policy shift surfaces the week it lands rather than at the next annual audit.
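The alerting step described above reduces to a set comparison between two reads. The sketch below is a generic illustration and not a description of any vendor's product. The function name `diff_blocklists`, the message wording, and the `.example` domains are assumptions, and each read is taken to be a mapping from domain to the set of AI tokens its robots.txt disallows.

```python
def diff_blocklists(
    previous: dict[str, set[str]], current: dict[str, set[str]]
) -> list[str]:
    """List every AI token that appeared in or left a domain's disallow set."""
    alerts: list[str] = []
    for domain in sorted(previous.keys() | current.keys()):
        before = previous.get(domain, set())
        after = current.get(domain, set())
        for token in sorted(after - before):
            alerts.append(f"{domain}: {token} newly disallowed")
        for token in sorted(before - after):
            alerts.append(f"{domain}: {token} no longer disallowed")
    return alerts


# Hypothetical reads from two consecutive weeks.
last_week = {"zoo-a.example": set(), "zoo-b.example": {"GPTBot", "CCBot"}}
this_week = {"zoo-a.example": {"GPTBot"}, "zoo-b.example": {"CCBot"}}

for alert in diff_blocklists(last_week, this_week):
    print(alert)
```

Keeping the comparison at the token level, rather than diffing raw file text, means that cosmetic edits to a robots.txt do not raise an alert while a changed AI-crawler rule does.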

A second fit is an AI-search or GEO analyst tracking which attractions remain eligible to surface in answer engines. Their job is to know, continuously, whether the pages they rely on are still readable, and whether a flagship like brookfieldzoo.org is an outlier or the start of a trend. US Tech Automations monitors that drift across a watchlist of domains and routes the alert when a site flips, so the analyst is not re-checking files by hand. See how the agentic monitoring works, and you have a standing read on zoo AI-access posture instead of a one-time count.

Corpus-wide, 330 of 1203 sites publish an llms.txt file.

Key Takeaways

Of the 10 Zoo sites with a parseable robots.txt, 1 blocks at least one AI crawler — a 10% rate, well below the corpus average.
The only blocker is brookfieldzoo.org, the Brookfield Zoo; it gates comprehensively, naming GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Meta-ExternalAgent, Amazonbot, and Applebot-Extended.
The open zoos (sandiegozoo.org, bronxzoo.com, stlzoo.org, columbuszoo.org, lazoo.org, houstonzoo.org, philadelphiazoo.org, cincinnatizoo.org, and sandiegozoowildlifealliance.org) name no AI crawler in a Disallow.
No zoo domain went silent at the seal, so all ten sites count toward the rate.
Corpus-wide, 317 of 1203 sites (26.4%) gate at least one crawler, so zoos sit well under half the average.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 19, 2026 (snapshot sha 040215878ac7b85a).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Zoo Sites Block AI Crawlers? 1 of 10 Do.” https://ustechautomations.com/resources/blog/do-zoo-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 040215878ac7b85a
