Do Magic Sites Block AI Crawlers? 4 of 8 Do

Jun 14, 2026

Magic is the rare hobby where half the readable sites slam the door on AI crawlers. Of the 8 Magic sites that published a robots.txt, 4 block at least one AI bot — a 50% rate that towers over almost every other enthusiast vertical in this snapshot. In a corpus where craft and hobby categories mostly leave the gates open, magicians are a striking exception.

The instinct makes sense once you remember what the craft sells. A robots.txt is the plain-text file at a site's root naming which automated visitors may fetch which paths, and an AI crawler pulls pages to feed a model. For a community whose entire economy runs on guarded secrets — method explanations, tutorials, paywalled effects — handing that text to a model is a direct threat. The block rate reads like a profession protecting its inventory.

This report is a single sealed reading, not a trend. We fetched each site's public robots.txt once, on one day, and counted only what the files literally say — which AI user-agents each names in a disallow rule, and which it leaves unmentioned. There is no projection and no comparison to a past month, because the snapshot holds one point in time.

So when the report says eight Magic sites publish a policy and four of them block a crawler, those are verbatim counts. The read that follows — that Magic gates where its content is the secret — rests entirely on those sealed figures, not on inference about intent.
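The counting rule described above can be sketched in a few lines. This is an illustrative reading of "names an AI user-agent in a disallow rule", not the report's actual pipeline: the function name `blocked_ai_bots` is hypothetical, and the bot list is borrowed from the corpus-wide leaderboard later in this report, since the full list the research team checked is not published here.

```python
# Assumed bot list: the five names from this report's corpus-wide leaderboard.
AI_BOTS = {"ccbot", "claudebot", "gptbot", "bytespider", "meta-externalagent"}

def blocked_ai_bots(robots_txt: str) -> set[str]:
    """Return AI user-agents named in a group that carries a non-empty Disallow."""
    blocked: set[str] = set()
    group_agents: list[str] = []   # user-agents of the group being read
    in_rules = False               # True once the current group has seen a rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                       # a new group starts here
                group_agents, in_rules = [], False
            group_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value:  # an empty Disallow blocks nothing
                blocked.update(a for a in group_agents if a in AI_BOTS)
    return blocked

example = """\
User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /cart
"""
print(sorted(blocked_ai_bots(example)))  # ['ccbot', 'gptbot']
```

Note the deliberate choice to ignore the wildcard group: a `User-agent: *` rule affects AI crawlers too, but it does not name one, and the report counts only named user-agents.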

4 of 8 Magic sites block at least one AI crawler.

Why Magic Lands Where It Does

Of the 10 Magic sites we checked, 8 returned a parseable robots.txt, and 4 of those carry a disallow rule aimed at an AI user-agent: themagiccafe.com, penguinmagic.com, vanishingincmagic.com, and geniimagazine.com. That set is telling — a major forum, two retailer-tutorial hubs, and a trade magazine. Each holds the kind of method-rich, hard-won content that a magician would least like to see paraphrased by an answer engine.

The four that allow every crawler are ellusionist.com, theory11.com, conjuringarchive.com, and magicshop.co.uk. Their open posture suggests a different calculation — discoverability and brand reach outweighing secrecy, or a catalog of products rather than exposed methods. The category splits down the middle, and the split tracks how protective each operator feels about its core text.

Half of the eight readable Magic sites block an AI crawler, one of the highest gating postures of any hobby in this batch.

This 50% rate is dramatic against the hobby field. The permissive orchid-growing slice sits near the floor at a fraction of Magic's rate, and a clean-zero category gates nothing at all — the contrast is the point.

Who Gates the Crawlers — and Who Does Not

The named breakdown is an even four-and-four among the readable sites, with two more expressing no preference.

| Magic Site | Published Policy | AI Crawler Stance |
| --- | --- | --- |
| themagiccafe.com | robots.txt present | Blocks at least one AI crawler |
| penguinmagic.com | robots.txt present | Blocks at least one AI crawler |
| vanishingincmagic.com | robots.txt present | Blocks at least one AI crawler |
| geniimagazine.com | robots.txt present | Blocks at least one AI crawler |
| ellusionist.com | robots.txt present | Allows all crawlers |
| theory11.com | robots.txt present | Allows all crawlers |
| conjuringarchive.com | robots.txt present | Allows all crawlers |
| magicshop.co.uk | robots.txt present | Allows all crawlers |

Two sites returned no parseable robots.txt: murphysmagic.com and hocus-pocus.com. Under the Robots Exclusion Protocol (RFC 9309), a missing file means crawlers may fetch anything, so these two read as open, but that is silence rather than a stance. Given how protective the blocking half of this category is, they are plausible candidates to publish a gating file later.
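The difference between silence and a stance can be illustrated with Python's standard-library `urllib.robotparser`. This is a minimal sketch under stated assumptions: the helper `can_ai_bot_fetch` and the example URL are hypothetical, and the report does not say how the two sites failed (a missing file, an error page, or malformed content), so the empty rule set below stands in for "no usable file".

```python
from urllib.robotparser import RobotFileParser

def can_ai_bot_fetch(robots_lines, bot="GPTBot", url="https://example.com/tricks/"):
    """Parse robots.txt lines and report whether `bot` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(bot, url)

# No rules at all: the crawler is allowed by default, but nothing was decided.
silent = can_ai_bot_fetch([])

# An explicit group naming the bot: this is a stance.
gated = can_ai_bot_fetch(["User-agent: GPTBot", "Disallow: /"])

print(silent, gated)  # True False
```

Both sites in the first case would be crawlable today, which is why the report reads them as open while keeping them out of the "allows all crawlers" count.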

The four blockers reward a closer look because they are not random. themagiccafe.com is one of the craft's largest forums, an archive of decades of method discussion. geniimagazine.com is a trade publication whose content is the trade. penguinmagic.com and vanishingincmagic.com are retailer-instructor hybrids that sell the very explanations a model would otherwise summarize for free. In each case the disallow rule protects an asset that is text rather than product. That is the through-line: Magic blocks where the content is the secret, and allows where the page is a catalog.

The split also makes Magic unusually volatile as a measurement. With eight readable sites evenly divided, a single operator changing posture would swing the rate by an eighth, or 12.5 percentage points. The four open sites, and the two with no file, are all candidates to tighten if the community's anxiety about AI paraphrasing deepens, which would push an already-high hobby rate higher still.
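The volatility is plain arithmetic on a small denominator. The sketch below uses the report's counts (4 blockers among 8 readable sites); the last scenario, in which both sites without a file later publish a blocking robots.txt, is a hypothetical what-if rather than a forecast.

```python
from fractions import Fraction

def block_rate(blocking: int, readable: int) -> Fraction:
    """Exact share of readable sites that block at least one AI crawler."""
    return Fraction(blocking, readable)

current = block_rate(4, 8)     # the sealed reading
one_more = block_rate(5, 8)    # one open site starts blocking
one_fewer = block_rate(3, 8)   # one blocker drops its rule
swing = one_more - current     # effect of a single operator

# Hypothetical: both no-file sites publish a gating file, so 6 of 10 block.
both_silent_sites_block = block_rate(6, 10)

print(float(current), float(one_more), float(one_fewer))  # 0.5 0.625 0.375
print(swing, float(both_silent_sites_block))              # 1/8 0.6
```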

Magic sites post a 50% AI-crawler block rate.

Where Magic Sits Among the Categories

To place 50% in context, here is Magic among its nearest neighbors in the block-rate ranking — the verticals filing right around the same line, with a couple just above and below.

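| Category | Sites With robots.txt | Block At Least One | Block Rate |
| --- | --- | --- | --- |
| Cycling | 9 | 5 | 55.6% |
| Climbing | 9 | 5 | 55.6% |
| Reference | 11 | 6 | 54.5% |
| Science | 10 | 5 | 50% |
| Woodworking | 10 | 5 | 50% |
| Magic | 8 | 4 | 50% |
| Quilting | 8 | 4 | 50% |
| RadioControl | 8 | 4 | 50% |
| Automotive | 9 | 4 | 44.4% |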

Reading the window vertically sharpens the point. Just above Magic sit Cycling, Climbing, and Reference, edging past the halfway line; just below, the rate slides toward the 44.4% cluster of Automotive and its peers. Magic lands squarely on the 50% line, sharing it with categories whose pages are reference material rather than pastimes. The neighbors a hobby keeps usually tell you what kind of content it holds, and Magic's neighbors are the information-heavy verticals — confirmation that its operators treat their text as an asset worth gating, not a pastime to broadcast.
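The block rates in the neighbor table follow directly from the two count columns, and recomputing them is a quick consistency check. The sketch below is illustrative only: the tuples copy the table's counts, and the helper name `pct` is an assumption, not part of any published tooling.

```python
# (category, sites with robots.txt, sites blocking at least one AI crawler)
rows = [
    ("Cycling", 9, 5), ("Climbing", 9, 5), ("Reference", 11, 6),
    ("Science", 10, 5), ("Woodworking", 10, 5), ("Magic", 8, 4),
    ("Quilting", 8, 4), ("RadioControl", 8, 4), ("Automotive", 9, 4),
]

def pct(blocking: int, readable: int) -> float:
    """Block rate as a percentage, rounded to one decimal place."""
    return round(100 * blocking / readable, 1)

rates = {name: pct(blocking, readable) for name, readable, blocking in rows}

# Categories sharing Magic's line, in table order.
tied_with_magic = [name for name in rates if rates[name] == rates["Magic"]]

print(rates["Cycling"], rates["Reference"], rates["Automotive"])  # 55.6 54.5 44.4
print(tied_with_magic)
```

The tie list makes the small-sample effect visible: five categories land on exactly 50% because their denominators are 8 or 10.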

That makes Magic the clearest counterexample in this batch to the idea that hobby sites leave their gates open. Most do. Magic does not, and the reason is legible in the named blockers: a forum, a magazine, and two instructor-retailers, each protecting the explanations that are its stock in trade.

Magic ties its 50% rate with Science, Woodworking, Quilting, and RadioControl — but those are mostly information-dense or maker categories, not hobbies. That a craft community gates as hard as a science vertical is the unusual read here. The low-blocking soapmaking slice shows where most crafts actually land, far below Magic. The extremes frame the whole snapshot.

| Category | Block Rate |
| --- | --- |
| Gaming | 88.9% |
| News | 81.3% |
| Geocaching | 0% |
| Tea | 0% |

Which Bots Are Blocked Most, Corpus-Wide

When sites across the corpus do block, the bot leaderboard names which crawlers they target most — counted across all 993 sites, not within Magic.

| Bot | Sites Blocking (all 993 sites) |
| --- | --- |
| CCBot | 211 |
| ClaudeBot | 188 |
| GPTBot | 187 |
| Bytespider | 183 |
| Meta-ExternalAgent | 162 |

CCBot — Common Crawl's fetcher — tops the list because its archive feeds many downstream models, so disallowing it is the single broadest block a site can apply. That ordering is corpus-wide; the four Magic blockers are part of why categories like this one push the overall number up.
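To put the leaderboard on the same scale as the category rates, each count can be expressed as a share of the 993-site corpus. This is a small sketch using only figures stated in this report (the five bot counts and the corpus-wide total of 285 sites blocking at least one AI crawler); the variable names are illustrative.

```python
CORPUS_SITES = 993
SITES_BLOCKING_ANY = 285   # corpus-wide figure stated in this report

bot_counts = {
    "CCBot": 211,
    "ClaudeBot": 188,
    "GPTBot": 187,
    "Bytespider": 183,
    "Meta-ExternalAgent": 162,
}

# Share of the whole corpus that names each bot in a disallow rule.
bot_share = {bot: round(100 * n / CORPUS_SITES, 1) for bot, n in bot_counts.items()}
overall_rate = round(100 * SITES_BLOCKING_ANY / CORPUS_SITES, 1)

for bot, share in bot_share.items():
    print(f"{bot}: {share}%")
print(f"Any AI crawler: {overall_rate}%")  # Any AI crawler: 28.7%
```

The per-bot shares sum to more than the overall rate because a single site can name several bots.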

It is worth situating Magic against the floor of the snapshot to see how far it travels. A findability-first hobby like the clean-zero geocaching slice writes no disallow rules at all, because its entire value is being found. Magic occupies the opposite hobby pole: its value is being hidden, and its robots.txt files reflect that.

The two categories sit in the same corpus and the same edition, sealed on the same day, yet they answer the AI-crawler question in completely opposite ways — which is precisely the kind of category-level divergence the snapshot is built to surface. A single number for "do hobby sites block crawlers" would erase it; the per-category cut preserves it.

Corpus-wide, 285 of 993 sites block at least one AI crawler, and Magic sits well above that line.

Key Takeaways

  • Of 10 Magic sites checked, 8 returned a parseable robots.txt, and 4 of those — including themagiccafe.com and geniimagazine.com — block an AI crawler.

  • The 50% block rate is among the highest of any hobby in this batch, driven by a craft built on guarded secrets.

  • Four sites block, four allow every crawler, and two returned nothing parseable, so the readable sites split evenly on posture.

  • Corpus-wide, 285 of 993 sites — 28.7% — block at least one AI crawler, a line Magic clears comfortably.

Frequently Asked Questions

Q: Why do so many Magic sites block AI crawlers?

A: The craft runs on guarded secrets — methods, tutorials, and paywalled effects. Sites like themagiccafe.com and vanishingincmagic.com hold exactly the kind of method-rich text a magician would not want a model to paraphrase, so four of the eight readable sites gate at least one AI bot.

Q: Which Magic sites allow every crawler?

A: ellusionist.com, theory11.com, conjuringarchive.com, and magicshop.co.uk returned a robots.txt that names no AI user-agent in a disallow rule. Their open posture suggests discoverability and brand reach matter more to them than secrecy, or that their public pages are catalog rather than exposed method.

Q: Is a 50% block rate high for a hobby?

A: Very. Most crafts in this snapshot sit far lower — soapmaking and orchids land in the teens. Magic's 50% ties it with information-dense categories like Science and Woodworking rather than its fellow hobbies, which is the unusual part of this slice.

Q: What about murphysmagic.com and hocus-pocus.com?

A: Both returned no parseable robots.txt, so under the standard they default to allowed and read open. That is an absence of a decision, not an open policy. Given how protective the blocking half of this category is, either could publish a gating file later. These figures are verbatim counts from sealed public robots.txt files; nothing is estimated, modeled, or extrapolated.

Put AI-Access Data to Work

The buyer this slice fits first is a horizontal one: a brand-intelligence analyst watching AI-access drift across many categories, for whom Magic is a leading indicator because it gates harder than its peers. The recurring job is a weekly re-crawl of a Magic watchlist that includes themagiccafe.com and penguinmagic.com, with an alert the moment a currently-open site names GPTBot or CCBot in a disallow rule. theory11.com and conjuringarchive.com are the ones to watch.

In a 50% category the next block can tip what an answer engine cites, so the value is detecting the change, not the static count. An AI-search agency can run the same cadence across a client corpus to keep its pages eligible for AI answers.
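Detecting the change rather than the static count comes down to diffing two snapshots. The sketch below is one hedged way to do it, not a description of any product's internals: the function name `newly_blocked`, the snapshot shape (site mapped to the set of AI bots it blocks), and the `.example` domains are all hypothetical placeholders, chosen so that no real site's future behavior is implied.

```python
def newly_blocked(previous: dict[str, set[str]],
                  current: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per site, the AI bots blocked now that were not blocked before."""
    alerts: dict[str, set[str]] = {}
    for site, bots in current.items():
        added = bots - previous.get(site, set())
        if added:
            alerts[site] = added
    return alerts

# Hypothetical snapshots from two weekly crawls.
last_week = {
    "open-shop.example": set(),
    "forum.example": {"gptbot"},
}
this_week = {
    "open-shop.example": {"gptbot", "ccbot"},  # previously open, now gating
    "forum.example": {"gptbot"},               # unchanged, so no alert
}

alerts = newly_blocked(last_week, this_week)
print({site: sorted(bots) for site, bots in alerts.items()})
# {'open-shop.example': ['ccbot', 'gptbot']}
```

Using set difference keeps the alert quiet for unchanged sites and for sites that loosen their rules; a monitor that also cares about removals would diff in the other direction as well.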

The category-native second ICP is a magic-shop retail ecommerce lead selling tricks and props online, who can monitor whether the forums and the trade magazine driving referral traffic stay crawlable, since their AI-answer visibility shapes top-of-funnel demand. US Tech Automations automates that monitoring with scheduled robots.txt crawls, change alerts, and an AI-access dashboard. See how the platform runs it on our agentic workflows.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 14, 2026 (snapshot sha 5d5458529dab2773).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Magic Sites Block AI Crawlers? 4 of 8 Do.” https://ustechautomations.com/resources/blog/do-magic-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 5d5458529dab2773

About the Author

Garrett Mullins
Workflow Specialist

Helping businesses leverage automation for operational efficiency.