Do Aquarium Sites Block AI Crawlers? Zero of 7 Do

Jun 19, 2026

Public aquariums live on attention. They exist to draw families through the doors, to put a shark or a sea otter in front of a child who has never seen one, and to make their conservation work legible to anyone who searches for it. An institution built on outreach has little reason to hide its pages from the machines that now answer questions on its behalf — and the robots.txt files of the aquarium category say exactly that.

Zero of 7 Aquarium sites block at least one AI crawler.

Of the 10 aquarium domains we checked, 7 returned a parseable robots.txt, the root-level file that tells automated agents which paths they may fetch, and not one of them disallows an AI crawler. That works out to a 0% block rate. Every figure here is read straight from the sealed snapshot; nothing is estimated, modeled, or extrapolated.

There is no lone blocker to name and no flagship holdout. Every policied aquarium leaves the door open to every named bot. Across the corpus, 317 of the 1203 sites with a policy gate at least one crawler, a 26.4% rate; aquariums sit at the very floor, one of the categories that blocks nothing at all.
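The two rates quoted above are plain ratios over sites with a parseable robots.txt. A minimal sketch of that arithmetic follows; the function name `block_rate` is an illustrative assumption, not part of the snapshot tooling, and only the counts come from this report.

```python
def block_rate(blockers: int, policied: int) -> float:
    """Percent of policied sites that disallow at least one AI crawler."""
    if policied == 0:
        # With no parseable files there is nothing to divide by.
        raise ValueError("no policied sites, rate is undefined")
    return round(100 * blockers / policied, 1)


# Counts as stated in this report.
aquarium_rate = block_rate(0, 7)
corpus_rate = block_rate(317, 1203)

print(f"aquariums: {aquarium_rate}%  corpus: {corpus_rate}%")
```

Rounding to one decimal place matches how the report prints its percentages; the unrounded corpus ratio is slightly below 26.4.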

Why the Whole Aquarium Category Stays Open

When a category posts a zero, the interesting question is not which site blocks but why none of them do. The seven aquariums with a readable policy are a roster of the field's best-known institutions: montereybayaquarium.org, georgiaaquarium.org, aqua.org, neaq.org, nyaquarium.com, aquariumofpacific.org, and mysticaquarium.org. That is Monterey Bay, Georgia, the National Aquarium, New England, the New York Aquarium, the Aquarium of the Pacific, and Mystic, and every one of them allows GPTBot, ClaudeBot, CCBot, Bytespider, and the rest of the leaderboard to read its pages.

That uniformity is not an accident of the sample. Aquariums share a business model with other public-facing nonprofits: a paid visit funds a conservation mission, and the website's job is to convert curiosity into a ticket or a membership.

Their exhibit pages, species guides, and conservation explainers are outreach assets, not competitive moats: a site that wants a marine-biology question routed to its own animal facts has every reason to stay readable by the retrieval agents that field those questions. That consensus is what sets aquariums apart from a category like trading-card marketplaces, where a couple of platforms wall off their pricing data.

Every aquarium in the set with a parseable robots.txt allows every AI crawler.

Three more aquarium domains — sheddaquarium.org, ripleyaquariums.com, and tennesseeaquarium.org — returned no parseable robots.txt at the seal. They are therefore silent: neither an allow nor a block, and excluded from the rate entirely. That is why the denominator is 7 rather than the 10 sites we checked. It would be wrong to read silence as a stance; it is an artifact of how a host answered at one moment in time, not a policy decision to gate anything.
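The denominator rule can be made concrete with a small tally. This is a sketch under stated assumptions: the status labels `allow`, `block`, and `silent` are illustrative names chosen here, while the domains and their outcomes are the ones listed in this report.

```python
from collections import Counter

# Outcome per domain as described in this report.
# "silent" means no parseable robots.txt at seal time.
STATUS = {
    "montereybayaquarium.org": "allow",
    "georgiaaquarium.org": "allow",
    "aqua.org": "allow",
    "neaq.org": "allow",
    "nyaquarium.com": "allow",
    "aquariumofpacific.org": "allow",
    "mysticaquarium.org": "allow",
    "sheddaquarium.org": "silent",
    "ripleyaquariums.com": "silent",
    "tennesseeaquarium.org": "silent",
}

counts = Counter(STATUS.values())
checked = len(STATUS)
policied = checked - counts["silent"]   # silent sites leave the denominator
blockers = counts["block"]              # Counter returns 0 for a missing key
rate = 100 * blockers / policied

print(f"checked={checked} policied={policied} blockers={blockers} rate={rate:.1f}%")
```

Dropping silent domains from the denominator, instead of counting them as allows, is what keeps the rate a statement about published policies only.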

What This 0% Block Rate Actually Means

A robots.txt directive is a public request, and the aquarium read is "request granted" across the board. The honest interpretation is that, as a category, aquariums behave like open publishers rather than data fortresses. There is no digitized holding here that an institution would guard against bulk harvesting the way a news archive might — the value is the physical visit, and the web pages exist to drive it.

A zero-block category is also a clean signal in a way a one-blocker category never is. When a single site gates, the category's number hinges on one decision that could flip next quarter. When no site gates, the posture is a shared norm rather than a holdout's choice. Aquariums are open by disposition, and the absence of even one exception is the finding.

The small sample sharpens this rather than weakening it. With seven policied files, the read is really a story about seven named institutions, none of which chose to wall off an AI agent. That concentration is itself the point: in aquariums, AI-access posture is set by a category-wide consensus toward openness, not by whether a flagship steward decides to close.

Aquarium sites post a 0% AI-crawler block rate.

This is the opposite story from the most-gated categories in the edition. Where gaming and news sites overwhelmingly block AI crawlers because their content is the product, aquariums treat their content as a reason to be visited. The contrast is the point: a 26.4% corpus average hides categories that range from content-as-asset to content-as-outreach, and aquariums sit at the far outreach end alongside the zoo sites that also gate nothing.

Where Aquariums Sit Among Similar Categories

A 0% block rate places Aquariums at the zero-block floor of the ranking: wide open, with company. The focused window below lists the nearest neighbors of Aquariums, verbatim from the sealed snapshot, name first and no rank column.

Category	Sites	With robots.txt	Block rate
Astronomy	8	6	0%
Banking	7	7	0%
Boating	10	8	0%

Aquariums share their zero reading with a varied band — Astronomy outreach sites, Banking marketing pages, and Boating retailers all land on the same nothing-blocked mark. It is a telling mix: a science-outreach category, a regulated-finance category, and a consumer-retail category all agree that being readable beats being walled off. The extremes show what the ends of the ranking look like:

Category	Sites	With robots.txt	Block at least 1 crawler	Block rate
Gaming	9	9	8	88.9%
News	20	17	14	82.4%
Hotels	10	3	0	0%
FastFood	10	6	0	0%

Aquariums sit as far from Gaming and News as a category can: those categories gate most of their files, while aquariums gate none. On the floor beside aquariums are hotel chains and fast-food brands, both running on the same open-by-default posture.

The Bots Other Categories Reach For First

No aquarium gates a single bot, so the useful context here is corpus-wide: which bots get disallowed most broadly when a site does decide to close. The cut below shows the most-disallowed bots across all 1203 sites with a robots.txt, bot name first, count next.

Bot	Sites disallowing (of 1203)	Rate
CCBot	234	19.5%
GPTBot	210	17.5%
ClaudeBot	207	17.2%
Bytespider	203	16.9%
Meta-ExternalAgent	178	14.8%

CCBot, Common Crawl's agent, tops the corpus blocklist at 234 sites, with GPTBot and ClaudeBot close behind. Aquariums name none of these — every one of these tokens is allowed across the category. The bots that the broader web gates first are precisely the bots aquariums leave in, which is the whole story of a zero-block category in one table.
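The rate column in the bot table is each count divided by the 1203 sites with a robots.txt. A short sketch that re-derives it from the counts above; the variable names are illustrative assumptions.

```python
POLICIED_SITES = 1203  # sites with a robots.txt, per this report

# Disallow counts copied from the table above.
DISALLOW_COUNTS = {
    "CCBot": 234,
    "GPTBot": 210,
    "ClaudeBot": 207,
    "Bytespider": 203,
    "Meta-ExternalAgent": 178,
}

# Percent of policied sites that disallow each bot, to one decimal place.
rates = {
    bot: round(100 * count / POLICIED_SITES, 1)
    for bot, count in DISALLOW_COUNTS.items()
}

for bot, rate in rates.items():
    print(f"{bot}\t{DISALLOW_COUNTS[bot]}\t{rate}%")
```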

Corpus-wide, 317 of 1203 sites block at least one AI crawler.

How the Aquarium Snapshot Was Sealed

These figures come from one point-in-time crawl of public robots.txt files, sealed June 19, 2026 under snapshot sha 040215878ac7b85a. For each aquarium domain we fetched robots.txt at the root, parsed its user-agent and disallow directives, and recorded whether any AI crawler token was disallowed. We report verbatim counts; nothing is estimated, modeled, or extrapolated. The three domains with no parseable file — sheddaquarium.org, ripleyaquariums.com, and tennesseeaquarium.org — are logged as silent, neither allow nor block.

The counting rule is deliberately narrow. A block is an explicit Disallow aimed at a named AI agent — GPTBot, ClaudeBot, CCBot, and the other leaderboard tokens. An aquarium can disallow administrative, search, or cart paths without naming an AI agent, and that does not count as an AI block here. Only a directive that names one would move a site into the blocker column, which is why the aquarium count is a clean zero: none of the seven policied files names an AI agent in a disallow group.
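The counting rule can be sketched as a small parser. This is an illustration, not the snapshot's actual code: the function name `blocked_ai_agents`, the five-token list (the report's full leaderboard is longer), and the choice to treat an empty `Disallow:` as no block are all assumptions made here.

```python
# Lowercased AI crawler tokens; an illustrative subset of the leaderboard.
AI_TOKENS = {"gptbot", "claudebot", "ccbot", "bytespider", "meta-externalagent"}


def blocked_ai_agents(robots_txt: str, ai_tokens=AI_TOKENS) -> set[str]:
    """Return the AI tokens named in a group that has a non-empty Disallow."""
    blocked: set[str] = set()
    agents: list[str] = []  # user-agents of the group currently being read
    in_rules = False        # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value:
                # Only groups that name an AI token count; "*" never does.
                blocked.update(a for a in agents if a in ai_tokens)
    return blocked


path_only = "User-agent: *\nDisallow: /admin/\nDisallow: /cart/\n"
names_bots = "User-agent: GPTBot\nUser-agent: CCBot\nDisallow: /\n\nUser-agent: *\nDisallow: /admin/\n"

print(blocked_ai_agents(path_only))
print(sorted(blocked_ai_agents(names_bots)))
```

A file like `path_only` disallows paths for every agent but names no AI token, so it stays out of the blocker column, which is the narrow reading described above.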

A note on what the snapshot deliberately does not do. It does not retry a slow host until a file appears, does not follow a redirect into a different domain's policy, and does not infer a block from a site that merely looks unfriendly to bots.

Each aquarium domain is read once, at seal time, exactly as it answered. That single-read rule is what makes the result content-addressable: anyone holding sha 040215878ac7b85a can re-derive the same seven policied files and the same zero blockers. The cost is that a host briefly rate-limiting at seal lands in the silent bucket rather than the allow column — reproducibility over a generous reading, which is why three aquariums sit silent rather than counted.
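The single-read rule amounts to classifying one fetch outcome with no retry path. The sketch below is hypothetical throughout: `FetchResult`, `classify`, the `www.` normalisation, and the crude "contains a user-agent line" parseability check are stand-ins invented here, since the report does not publish its fetch logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FetchResult:
    requested_host: str
    final_host: str         # host that answered after any redirects
    status: Optional[int]   # None when the request timed out or failed
    body: str


def _site(host: str) -> str:
    # Assumption: a bare host and its www. variant are the same site.
    return host.lower().removeprefix("www.")


def classify(result: FetchResult) -> str:
    """Return 'policied' or 'silent' for a single read."""
    if result.status != 200:
        return "silent"  # timeouts, rate limits such as 429, and errors
    if _site(result.final_host) != _site(result.requested_host):
        return "silent"  # never adopt a different domain's policy
    if "user-agent" not in result.body.lower():
        return "silent"  # nothing that parses as a robots.txt group
    return "policied"


ok = FetchResult("example.org", "www.example.org", 200, "User-agent: *\nDisallow: /admin/")
limited = FetchResult("example.org", "example.org", 429, "")

print(classify(ok), classify(limited))
```

Because the function sees exactly one result per domain, a host that rate-limits at seal time is recorded as silent, which is the trade-off the paragraph above describes.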

Frequently Asked Questions

Q: Which aquarium site blocks AI crawlers?

A: None of them. All 7 aquariums with a parseable robots.txt — montereybayaquarium.org, georgiaaquarium.org, aqua.org, neaq.org, nyaquarium.com, aquariumofpacific.org, and mysticaquarium.org — allow every named AI crawler. There is no blocker in the set, which is why the category rate is 0%.

Q: Why do aquariums leave AI crawlers in?

A: Reach. Aquariums run on public discovery — their exhibit pages, species guides, and conservation explainers are meant to be found and cited, including by AI assistants answering a marine-biology question. For an institution whose mission is public engagement and whose revenue is the in-person visit, being readable extends that mission rather than threatening it.

Q: Does the 0% rate cover all the aquarium sites you found?

A: No. It covers the 7 sites that returned a parseable robots.txt. Three more — sheddaquarium.org, ripleyaquariums.com, and tennesseeaquarium.org — produced no parseable file at the seal, so they are excluded from the rate rather than counted as an allow or a block.

Q: Does a Disallow in robots.txt actually stop an AI crawler?

A: Not by force. robots.txt is an honor-system standard: a cooperative crawler reads it and complies, but the file enforces nothing technically. Since no aquarium publishes a disallow against an AI agent, the question is moot here — every policied aquarium signals that AI agents are welcome to read its paths.
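What a cooperative crawler does with the file can be shown with Python's standard `urllib.robotparser`. This is a sketch: the helper `may_fetch`, the two sample policies, and the example.org URLs are made up for illustration and are not taken from any aquarium's file.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Ask the parsed policy whether this agent is permitted to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical policies: one names an AI agent, one only hides an admin path.
gated = "User-agent: GPTBot\nDisallow: /\n"
open_policy = "User-agent: *\nDisallow: /admin/\n"

page = "https://example.org/animals/sea-otter"

print(may_fetch(gated, "GPTBot", page))
print(may_fetch(gated, "ClaudeBot", page))
print(may_fetch(open_policy, "GPTBot", page))
```

The check runs on the crawler's side, which is the honor-system point: the server publishes the file, and compliance depends on the agent consulting it before fetching.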

Put AI-Access Data to Work

For an aquarium's digital director or marketing lead — the person who owns how the institution appears online — this snapshot is a baseline worth watching. The category gates nothing today, which means an answer engine fielding a question about your animals or exhibits can reach your pages. But a zero is only true at seal time: a new CMS, a rights policy, or a vendor default can quietly add a disallow that walls off the very answer engines your visitors now ask. Knowing the week that happens is worth more than discovering it at the next annual audit.

Set a recurring crawl that re-reads robots.txt for montereybayaquarium.org, georgiaaquarium.org, aqua.org, and your own domain, and alert the moment any peer adds an AI crawler token to its disallow list — in a zero-block category, the first site to gate is the news. US Tech Automations runs exactly that kind of scheduled robots.txt crawl with change alerts and agentic monitoring, so a policy shift surfaces the week it lands rather than at the next review.
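The change alert described above reduces to a set difference between two reads. A minimal sketch, assuming each read has already been reduced to a mapping from domain to the set of AI tokens it disallows; the function name `new_ai_blocks` and the `.example` domains and data are hypothetical and do not describe any real site or any vendor's product.

```python
def new_ai_blocks(
    previous: dict[str, set[str]],
    current: dict[str, set[str]],
) -> dict[str, set[str]]:
    """Return, per domain, the AI tokens disallowed now but not in the prior read."""
    alerts: dict[str, set[str]] = {}
    for domain, blocked_now in current.items():
        added = blocked_now - previous.get(domain, set())
        if added:
            alerts[domain] = added
    return alerts


# Hypothetical watchlist reads from two consecutive crawls.
last_week = {"peer-a.example": set(), "peer-b.example": set()}
this_week = {"peer-a.example": set(), "peer-b.example": {"gptbot"}}

for domain, tokens in new_ai_blocks(last_week, this_week).items():
    print(f"ALERT {domain} now disallows: {', '.join(sorted(tokens))}")
```

Comparing only newly added tokens keeps the alert quiet for a category that stays open and fires once when the first peer gates.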

A second fit is an AI-search or GEO analyst tracking which public institutions remain eligible to surface in answer engines. Their job is to know, continuously, whether the pages they rely on are still readable, and whether a silent domain is a timeout or a hardening stance. US Tech Automations monitors that drift across a watchlist of domains and routes the alert when an institution flips, so the analyst is not re-checking files by hand. See how the agentic monitoring works, and you have a standing read on aquarium AI-access posture instead of a one-time count.

Corpus-wide, 330 of 1203 sites publish an llms.txt file.

Key Takeaways

Of the 7 Aquarium sites with a parseable robots.txt, zero block any AI crawler — a 0% rate, at the very floor of the ranking.
There is no blocker to name: montereybayaquarium.org, georgiaaquarium.org, aqua.org, neaq.org, nyaquarium.com, aquariumofpacific.org, and mysticaquarium.org all allow every crawler.
Three domains — sheddaquarium.org, ripleyaquariums.com, and tennesseeaquarium.org — returned no parseable file at the seal and are excluded from the rate.
Aquariums share the zero-block floor with Astronomy, Banking, and Boating, and sit as far as possible from Gaming (88.9%) and News (82.4%).
Corpus-wide, 317 of 1203 sites (26.4%) gate at least one crawler, so aquariums sit far below the average at the open end.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 19, 2026 (snapshot sha 040215878ac7b85a).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Aquarium Sites Block AI Crawlers? Zero of 7 Do.” https://ustechautomations.com/resources/blog/do-aquarium-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 040215878ac7b85a