Do Reef Keeping Sites Block AI Crawlers? 1 of 6 Do

Jun 14, 2026

Reef-keeping forums and livestock shops are among the more permissive publishers we track. We read the published robots.txt file of every reef site in the corpus, and only one of them tells an AI crawler to stay out. The other five readable policies leave access open to every model-training and answer-engine bot we look for, and three sites publish no policy at all.

1 of 6 Reef Keeping sites with a readable robots.txt blocks at least one AI crawler.

That headline sits well under the corpus line. Across the 934 sites in this edition with a readable robots.txt, 277 block at least one AI crawler, a 29.7% rate. Reef keeping, at 16.7%, comes in at a little over half of that. This report names the one site that gates crawlers, the ones that stay open, and where the slice lands among its nearest neighbors. Every figure here is a verbatim count from public robots.txt files sealed on June 14, 2026.

Which Reef Sites Gate the Crawlers Here

Of the 9 reef-keeping sites we checked, 6 returned a parseable robots.txt file. Exactly one of those, reefcentral.com, disallows at least one AI crawler token. The other five sites that publish a policy lean toward access, not restriction.

The open sites are easy to name. reef2reef.com, bulkreefsupply.com, marinedepot.com, reefs.com, and melevsreef.com each return a robots.txt that does not block the AI tokens we scan for. A separate set (nano-reef.com, reefbuilders.com, and reeftank.com) returned no parseable robots.txt at all. That is not a block; it simply means there is no published rule to read.

| Category | Sites Checked | With robots.txt | Block ≥1 AI Crawler | Block Rate |
| --- | --- | --- | --- | --- |
| Reef Keeping | 9 | 6 | 1 | 16.7% |
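The denominator in that table is the 6 readable policies, not the 9 sites checked. A minimal Python sketch of that bookkeeping follows, using the site names reported above. The `Stance` enum and `block_rate` helper are illustrative names chosen for this example, not part of the report's tooling.

```python
from collections import Counter
from enum import Enum


class Stance(Enum):
    BLOCKS = "readable policy, blocks at least one AI crawler"
    OPEN = "readable policy, no scanned AI crawler blocked"
    NO_POLICY = "no parseable robots.txt"


# Per-site stances exactly as named in this report.
REEF_SITES = {
    "reefcentral.com": Stance.BLOCKS,
    "reef2reef.com": Stance.OPEN,
    "bulkreefsupply.com": Stance.OPEN,
    "marinedepot.com": Stance.OPEN,
    "reefs.com": Stance.OPEN,
    "melevsreef.com": Stance.OPEN,
    "nano-reef.com": Stance.NO_POLICY,
    "reefbuilders.com": Stance.NO_POLICY,
    "reeftank.com": Stance.NO_POLICY,
}


def block_rate(stances) -> float:
    """Percent of readable policies that block; no-policy sites are excluded."""
    tally = Counter(stances)
    readable = tally[Stance.BLOCKS] + tally[Stance.OPEN]
    if readable == 0:
        raise ValueError("no readable policies, so the rate is undefined")
    return round(100 * tally[Stance.BLOCKS] / readable, 1)


tally = Counter(REEF_SITES.values())
print(len(REEF_SITES), "checked,", tally[Stance.NO_POLICY], "with no policy")
print(block_rate(REEF_SITES.values()), "% block rate")
```

Keeping the no-policy sites as their own state, instead of folding them into the open group, is what stops silence from being counted as an explicit allow.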

Only reefcentral.com publishes a rule that turns an AI crawler away; every other reef site with a policy leaves access open.

The pattern is worth pausing on. A hobby built around forums, build threads, and gear catalogs has little of the licensing pressure that pushes a newsroom or a streaming catalog to lock things down. For a sense of where a comparably open enthusiast vertical lands, the tabletop RPG AI-access report tells a parallel story with a different mix of publishers.

It is also worth noting what kind of site does the one block. reefcentral.com is a long-running discussion-and-classifieds hub, exactly the sort of property that accumulates years of user-generated threads. When a forum gates AI crawlers, the motive is usually less about commercial licensing and more about protecting member-contributed content from being absorbed wholesale. That a single forum reached that decision while the retail and brand sites did not is the most concrete read this slice offers.

The allowers, by contrast, span the commercial heart of the hobby. bulkreefsupply.com and marinedepot.com are equipment retailers whose product and care-guide pages benefit directly from being surfaced in AI answers; melevsreef.com is a personal build-and-guide site whose author has every reason to want the content found. None of them publishes a rule against the AI crawlers we scan for, which is the quiet majority position across the readable reef policies.

What This 16.7% Block Rate Actually Means

A single blocker out of six readable policies is the lowest nonzero rate a six-policy category can post. It signals a community where most operators either have not weighed AI access yet or have decided that broad discoverability (recipes for reef chemistry, coral-care threads, equipment specs) is worth more than gating it.

Reef Keeping sites post a 16.7% AI-crawler block rate.

Read against the wider edition, that is the story: reef keeping sits on the permissive side of the corpus average. It shares the same 16.7% reading as Retail in this snapshot, and sits just under Finance (18.2%) and just above Education (14.3%). Hobby verticals such as Mycology and Sewing, both at 20%, land in the same low band.

There is a second way to read the slice that matters for anyone tracking it over time. With only 6 readable policies and a single blocker, the rate is sensitive: if one more of the open reef sites were to publish a disallow rule next quarter, the category's standing in the ranking would move noticeably even though the underlying hobby had not changed. A small, permissive vertical like this is one where drift shows up early and clearly, which is precisely what makes it worth watching as a leading indicator rather than dismissing as low-traffic.
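The sensitivity is easy to check with arithmetic. The short Python sketch below recomputes the reef rate under a few hypothetical next-quarter scenarios; the scenarios are illustrations, not forecasts, and `rate` is a helper name introduced here.

```python
READABLE = 6          # reef sites with a parseable robots.txt in this snapshot
CORPUS_RATE = 29.7    # corpus-wide block rate reported in this edition


def rate(blockers: int, readable: int = READABLE) -> float:
    """Block rate in percent, rounded to one decimal place."""
    return round(100 * blockers / readable, 1)


# Hypothetical: more of the six readable policies add a disallow rule.
for blockers in range(1, 4):
    r = rate(blockers)
    side = "above" if r > CORPUS_RATE else "below"
    print(f"{blockers} of {READABLE} block: {r}% ({side} the corpus rate)")

# Hypothetical: a no-policy site publishes an open robots.txt instead,
# which grows the denominator and lowers the rate.
print(f"1 of 7 block: {rate(1, 7)}%")
```

A second blocker alone would lift the category to 33.3%, past the 29.7% corpus rate, while one newly published open policy would pull it down to 14.3%.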

The signal is not the current number alone — it is how stable that number stays.

Where Reef Keeping Sits Among Similar Categories

The table below is a focused window centered on reef keeping plus its nearest neighbors in the block-rate ranking. It is not the full 112-category list — just the band this category lives in, so you can see how it compares to the verticals immediately above and below it.

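| Category | Sites | With robots.txt | Block ≥1 | Block Rate |
| --- | --- | --- | --- | --- |
| Mycology | 10 | 10 | 2 | 20% |
| Sewing | 10 | 10 | 2 | 20% |
| Finance | 12 | 11 | 2 | 18.2% |
| Retail | 15 | 12 | 2 | 16.7% |
| Reef Keeping | 9 | 6 | 1 | 16.7% |
| Education | 9 | 7 | 1 | 14.3% |
| Government | 9 | 8 | 1 | 12.5% |
| Crypto | 9 | 8 | 1 | 12.5% |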

For contrast, the extremes of the edition sit far from this band in both directions. The categories that gate AI most aggressively look nothing like a reef forum, and a few categories record no blockers at all.

| Category | Block Rate |
| --- | --- |
| Gaming | 88.9% |
| News | 82.4% |
| Pickleball | 0% |
| Prepping | 0% |

The gap is the entire point. Reef keeping lives near the quiet floor of the ranking, well below the corpus average, while News and Gaming sit near the top. A reader looking at the fully-open end of the spectrum can compare reef keeping to the pickleball AI-access report, which records no blockers at all.

The Operator-Level Picture Across All 934 Sites

Inside the one reef site that does block, the disallowed tokens are not unusual: they belong to the same operators that lead the corpus-wide leaderboard. Across all 934 sites with a readable policy, five operators draw the most disallow rules.

| Operator | Sites Disallowing (all 934 sites) |
| --- | --- |
| Common Crawl | 204 |
| Anthropic | 194 |
| OpenAI | 187 |
| Meta | 177 |
| ByteDance | 175 |

Across all 934 sites, Common Crawl is the single most-disallowed operator, named in 204 published policies.
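To put the leaderboard counts on the same scale as the block rates, the Python sketch below converts each count into a share of the 934 readable policies and a share of the 277 blocking sites. The second share assumes every site counted against an operator is also one of the 277 blockers, which holds as long as each listed operator counts as an AI crawler. The `share` helper is a name introduced for this example.

```python
CORPUS_READABLE = 934   # sites with a readable robots.txt
CORPUS_BLOCKERS = 277   # sites blocking at least one AI crawler

# Counts copied from the operator table above.
OPERATORS = {
    "Common Crawl": 204,
    "Anthropic": 194,
    "OpenAI": 187,
    "Meta": 177,
    "ByteDance": 175,
}


def share(count: int, total: int) -> float:
    """Percentage of `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


for name, count in OPERATORS.items():
    of_readable = share(count, CORPUS_READABLE)
    of_blockers = share(count, CORPUS_BLOCKERS)
    print(f"{name:<13} {of_readable:>5}% of readable, {of_blockers:>5}% of blockers")
```

Read this way, Common Crawl is disallowed in roughly one in five readable policies, and in close to three quarters of the policies that block anything.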

These corpus-wide counts give the reef numbers context. When a reef site does decide to gate, it reaches for the same handful of operator tokens every other category reaches for first. The reef community is not inventing a new blocking pattern; it is simply opting into the common one far less often than the typical publisher does. For an enthusiast vertical with a slightly higher blocking habit, the rockhounding AI-access report shows where a similar hobby lands a step up the ranking.

Corpus-wide, 277 of 934 sites block at least one AI crawler.

Methodology

The figures here come from a single sealed snapshot. We fetched the public robots.txt file of each reef-keeping site, parsed the user-agent and disallow directives, and recorded whether any AI crawler token was disallowed at the site root. A site that returns no robots.txt is recorded as having no published policy — not as a blocker and not as an explicit allower. In this research, nothing is estimated, modeled, or extrapolated; every count is read directly from the files as they existed at seal time.
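As a rough illustration of the root-level check described above, here is a Python sketch built on the standard library's `urllib.robotparser`. The token list is an assumption: it pairs one commonly published user-agent token with each operator on the leaderboard and is not the report's actual scan list. The standard-library parser also applies its own matching rules, which may differ from the parser used for the sealed snapshot.

```python
from urllib.robotparser import RobotFileParser

# Illustrative tokens only (assumed, not the report's scan list).
AI_TOKENS = {
    "CCBot": "Common Crawl",
    "ClaudeBot": "Anthropic",
    "GPTBot": "OpenAI",
    "meta-externalagent": "Meta",
    "Bytespider": "ByteDance",
}


def blocked_tokens(robots_txt: str, site_root: str = "https://example.com/") -> list:
    """Return the AI tokens that may not fetch the site root under this policy."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [token for token in AI_TOKENS if not parser.can_fetch(token, site_root)]


# Hypothetical policy text, not copied from any site in the report.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_tokens(SAMPLE))   # tokens disallowed at the root
print(blocked_tokens(""))       # an empty policy blocks nothing
```

A site counts as a blocker when the returned list is non-empty. A missing file should be handled before this step and recorded as no published policy, never passed in as an empty string.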

The snapshot was content-addressed and sealed so the numbers are reproducible. Here is how the pass runs:

  1. Collect. Request each site's /robots.txt and store the raw response verbatim.

  2. Parse. Extract user-agent blocks and disallow rules, matching against the known AI-crawler token list.

  3. Seal. Hash the full snapshot and freeze it under sha 760275d49a628cc3 so the counts cannot drift.

  4. Aggregate. Tally per-category and corpus-wide totals from the sealed records only.
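The seal step can be sketched in Python as a content hash over a canonical encoding of the raw files. The report does not say how its snapshot is serialized before hashing, so the JSON canonicalization, the `seal` and `verify` helpers, and the sample data below are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Dict, Optional


def seal(raw_files: Dict[str, Optional[str]]) -> str:
    """SHA-256 hex digest over a canonical JSON encoding of the snapshot.

    Keys are sorted so the digest does not depend on collection order.
    None marks a site that returned no parseable robots.txt.
    """
    canonical = json.dumps(
        raw_files, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(raw_files: Dict[str, Optional[str]], expected: str) -> bool:
    """True when the records still hash to the sealed digest."""
    return seal(raw_files) == expected


# Hypothetical snapshot using placeholder domains.
snapshot = {
    "forum.example": "User-agent: GPTBot\nDisallow: /\n",
    "shop.example": "User-agent: *\nDisallow:\n",
    "nopolicy.example": None,
}
digest = seal(snapshot)

reordered = dict(reversed(list(snapshot.items())))
tampered = {**snapshot, "shop.example": "User-agent: *\nDisallow: /\n"}

print(verify(reordered, digest))  # same content, different order
print(verify(tampered, digest))   # one rule changed after sealing
```

Aggregating only from records that pass `verify` is what keeps the per-category and corpus-wide totals tied to a single frozen state of the files.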

robots.txt is an honor-system file, so these counts measure stated intent, not enforcement. A disallow line is a request a crawler may or may not respect. We also draw a careful line between a site that publishes an allow-everything policy and one that publishes nothing at all: the first is an explicit choice, the second is silence, and conflating them would overstate how settled the reef community's stance really is. The three no-policy reef sites are reported as exactly what they are.

Frequently Asked Questions

Q: Which reef-keeping site actually blocks an AI crawler?

A: reefcentral.com is the only one of the 6 reef sites with a parseable robots.txt that disallows at least one AI crawler token. The other readable policies — including reef2reef.com and bulkreefsupply.com — do not block the crawlers we scan for.

Q: Why is the reef-keeping block rate so much lower than the corpus average?

A: Reef keeping posts a 16.7% block rate against the corpus-wide 29.7%. As a low-commercial hobby built on forums and gear catalogs, it carries little of the content-licensing pressure that drives newsrooms and streaming sites to gate crawlers, so most operators leave access open.

Q: Do the sites without a robots.txt count as allowing crawlers?

A: No. nano-reef.com, reefbuilders.com, and reeftank.com returned no parseable robots.txt in this snapshot. That means they publish no rule for a crawler to read — it is neither a recorded block nor an explicit allow, and we report it as exactly that.

Q: Does a disallow line in robots.txt actually stop a crawler?

A: Not on its own. robots.txt is a voluntary standard, so a disallow rule states a preference that a well-behaved crawler honors and a non-compliant one can ignore. The figures here capture published intent, not enforced access.

Key Takeaways

Reef keeping is one of the open ends of the AI-access spectrum. One readable policy out of six gates a crawler, the rest stay open, and the slice lands far below the corpus norm — a community choosing discoverability over restriction.

  • 1 of 6 reef sites with a robots.txt blocks at least one AI crawler — a 16.7% rate.

  • reefcentral.com is the sole named blocker; reef2reef.com, bulkreefsupply.com, and the other named allowers stay open.

  • The 16.7% reading sits well below the 29.7% corpus-wide block rate.

Put AI-Access Data to Work

The most realistic buyer of this data is a horizontal monitoring customer, not a niche reef shop. An AI-search and GEO agency tracking which client-eligible domains stay crawlable can fold reef keeping into a weekly sweep: re-crawl reefcentral.com and the five open reef policies every Monday, and alert the account team the moment a previously-open site like reef2reef.com adds a new AI-crawler token to its disallow list — a signal that a client's content may be dropping out of answer-engine reach.
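The weekly alert reduces to a set difference between two snapshots. The Python sketch below assumes each snapshot maps a domain to the set of AI tokens it disallows; the domains, tokens, and the `new_blocks` helper are hypothetical and say nothing about any real site's current policy.

```python
from typing import Dict, Set


def new_blocks(
    previous: Dict[str, Set[str]], current: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Per site, the AI tokens disallowed now that were not disallowed before.

    A site missing from `previous` is treated as having blocked nothing,
    so every token it disallows today is reported as new.
    """
    alerts = {}
    for site, tokens in current.items():
        added = tokens - previous.get(site, set())
        if added:
            alerts[site] = added
    return alerts


# Hypothetical snapshots from two consecutive Mondays.
last_week = {
    "open-site.example": set(),
    "forum.example": {"GPTBot"},
}
this_week = {
    "open-site.example": {"GPTBot", "CCBot"},
    "forum.example": {"GPTBot"},
}

for site, tokens in new_blocks(last_week, this_week).items():
    print(f"ALERT {site}: newly disallows {sorted(tokens)}")
```

Only additions raise an alert here. Removals, meaning a site reopening access, could be tracked the same way by swapping the two arguments.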

A competitive-intelligence analyst watching AI-access drift across many hobby categories can run the same job to spot the first sites in a permissive vertical that begin to gate.

The category-native second buyer is a reef-livestock ecommerce lead who wants their coral and gear pages eligible for AI shopping answers; they would monitor their own robots.txt against this baseline so a misconfigured rule never silently removes them. US Tech Automations runs exactly this kind of scheduled robots.txt and llms.txt monitoring with change alerts and an AI-access dashboard. See how the agentic monitoring workflows handle recurring drift detection.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 14, 2026 (snapshot sha 760275d49a628cc3).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Reef Keeping Sites Block AI Crawlers? 1 of 6 Do.” https://ustechautomations.com/resources/blog/do-reef-keeping-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 760275d49a628cc3

About the Author

Garrett Mullins
Workflow Specialist

Helping businesses leverage automation for operational efficiency.