Do Cosplay Sites Block AI Crawlers? 4 of 9 Do

Jun 14, 2026

Cosplay is a visual, image-heavy hobby, and that may be exactly why nearly half of its sites with a published policy turn AI crawlers away. Of the 10 cosplay sites in this snapshot, 9 returned a parseable robots.txt, and 4 of those block at least one AI crawler — a 44.4% block rate.

That rate sits above the 30% corpus-wide rate and ties cosplay with bonsai among the enthusiast categories. The story here is photo-centric content: galleries of original costume work are the kind of material a creator may not want absorbed silently. This is a sealed, point-in-time read of public robots.txt files, with no estimates and no forecasts.

A robots.txt file is the plain-text instruction sheet a site publishes for automated crawlers. To gate an AI crawler, the site lists a token such as User-agent: GPTBot and a Disallow: / rule beneath it. In cosplay, four sites carry such a directive against at least one AI user-agent.
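
The check described above can be sketched with Python's standard urllib.robotparser module. The sample file is hypothetical and not taken from any site in the snapshot, and the helper name blocks_agent and the example.com URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: gates GPTBot entirely, leaves every other agent open.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def blocks_agent(robots_txt: str, user_agent: str,
                 url: str = "https://example.com/") -> bool:
    """Return True if the robots.txt text disallows user_agent from url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(blocks_agent(SAMPLE_ROBOTS_TXT, "GPTBot"))     # True
print(blocks_agent(SAMPLE_ROBOTS_TXT, "Googlebot"))  # False
```

A site in this report counts as a blocker when a check like this returns True for at least one AI user-agent.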

Who Gates the Crawlers Here

The four blockers are cosplay.com, worldcosplay.net, acparadise.com, and simcosplay.com — a set heavy on community galleries and costume marketplaces. Each published a robots.txt that disallows at least one named AI crawler, which is consistent with platforms whose value lives in user-uploaded photography.

The sites that returned a robots.txt and allow every AI crawler are kamuicosplay.com, cosplaytutorial.com, cosplayisland.co.uk, cosplayhouse.com, and cosplaysky.com. Several of those are tutorial or retail destinations, where discoverability and instructional reach outweigh any wish to withhold content.

One site, herostime.com, returned no parseable robots.txt. That is neither a block nor an allow; it means no crawl preference is published, which a compliant crawler generally treats as open by default. It is counted among the 10 cosplay sites checked but sits outside the 9 that returned a parseable policy, so it does not enter the block or allow tally, which keeps the 4 of 9 denominator clean.

Cosplay Site	robots.txt Status
cosplay.com	Blocks an AI crawler
worldcosplay.net	Blocks an AI crawler
acparadise.com	Blocks an AI crawler
simcosplay.com	Blocks an AI crawler
kamuicosplay.com	Allows all AI crawlers
cosplaytutorial.com	Allows all AI crawlers
cosplayisland.co.uk	Allows all AI crawlers
cosplayhouse.com	Allows all AI crawlers
cosplaysky.com	Allows all AI crawlers
herostime.com	No parseable robots.txt

Why Cosplay Lands Where It Does

At 44.4%, cosplay sits well above the corpus baseline. Across all 867 sites with a parseable robots.txt, 260 block at least one AI crawler — a 30% rate — so cosplay gates noticeably more than the typical site.

The clearest driver is image ownership. Cosplay galleries are portfolios of costume craft and photography, and the platforms hosting them have a tangible reason to keep that imagery out of model training. The allowers skew toward tutorials and stores, where being found is the whole point. That tension — protect the gallery, expose the catalog — explains why the category splits almost evenly. For the opposite end of that spectrum, where no operator is named by any site, see the prepping report on AI-crawler access.

The 44.4% figure should be read carefully. It counts sites with a parseable file that disallow at least one AI user-agent — not the volume of images protected, and not the strictness of each rule. A site that blocks one bot and a site that blocks every AI crawler both register once. What the figure captures is the decision to gate, and in a category this image-dependent, that decision tracks closely with whether a site hosts original user uploads or simply sells and teaches.
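
The counting rule can be sketched in a few lines. The statuses below are copied from the cosplay table above; the function name block_rate and the "blocks" / "allows" / None encoding are illustrative assumptions, not the report's own pipeline.

```python
from typing import Dict, Optional, Tuple

# Status per site, copied from the cosplay table. None = no parseable robots.txt.
COSPLAY_STATUS: Dict[str, Optional[str]] = {
    "cosplay.com": "blocks",
    "worldcosplay.net": "blocks",
    "acparadise.com": "blocks",
    "simcosplay.com": "blocks",
    "kamuicosplay.com": "allows",
    "cosplaytutorial.com": "allows",
    "cosplayisland.co.uk": "allows",
    "cosplayhouse.com": "allows",
    "cosplaysky.com": "allows",
    "herostime.com": None,
}


def block_rate(status_by_site: Dict[str, Optional[str]]) -> Tuple[int, int, float]:
    """Return (blockers, sites_with_policy, rate).

    Each site registers at most once, however many AI agents it disallows,
    and sites without a parseable file are left out of the denominator.
    """
    with_policy = [s for s in status_by_site.values() if s is not None]
    blockers = sum(1 for s in with_policy if s == "blocks")
    rate = blockers / len(with_policy) if with_policy else 0.0
    return blockers, len(with_policy), rate


blockers, with_policy, rate = block_rate(COSPLAY_STATUS)
print(f"{blockers} of {with_policy} = {rate:.1%}")  # 4 of 9 = 44.4%
```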

How Cosplay Compares to Its Neighbors

Cosplay shares its 44.4% block rate with bonsai and with several broader categories: Automotive, HomeGarden, Genealogy, Watches, and Birding all land at the same mark. The window below centers on cosplay and shows a selection of the categories immediately above and below it, so not every tied category appears.

Category	Sites	With robots.txt	Block ≥1 AI Crawler	Block Rate
Woodworking	10	10	5	50%
Quilting	10	8	4	50%
Automotive	10	9	4	44.4%
Watches	10	9	4	44.4%
Birding	10	9	4	44.4%
Cosplay	10	9	4	44.4%
Bonsai	10	9	4	44.4%
Fashion	9	7	3	42.9%
Running	9	7	3	42.9%
Surfing	10	7	3	42.9%

For scale at the edges, Gaming tops the entire ranking at 88.9% and News follows at 81.3%, while categories like Prepping and Pottery sit at 0%. Cosplay is in the upper-middle band, gating more than most but far from the heaviest categories.

What is notable is the company cosplay keeps at 44.4%: Automotive, HomeGarden, Genealogy, Watches, and Birding are all broader, more commercial categories, yet a niche costume hobby gates at the same rate. The reason is that each had 9 sites return a file and 4 of them block, so the arithmetic lines up even though the underlying content could hardly be more different. One step down, the 42.9% group of Fashion, Running, Surfing, and Metal Detecting has just three gating sites out of seven. With denominators this small, a single site moves a category a long way: a fourth blocker among those seven would put the rate at 57.1%, past cosplay's tier rather than level with it.
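
The tier arithmetic can be checked directly. This is a small sketch using counts from the comparison table above; the helper names rate and as_percent are illustrative, and the 4-of-7 case is a hypothetical, not a row in the data.

```python
from fractions import Fraction

# (sites with a parseable robots.txt, sites blocking >= 1 AI crawler),
# copied from the comparison table.
CATEGORIES = {
    "Woodworking": (10, 5),
    "Quilting": (8, 4),
    "Cosplay": (9, 4),
    "Bonsai": (9, 4),
    "Fashion": (7, 3),
}


def rate(with_policy: int, blockers: int) -> Fraction:
    """Exact block rate as a fraction, so ties are compared without rounding."""
    return Fraction(blockers, with_policy)


def as_percent(value: Fraction) -> str:
    """Format a fraction as a percentage with one decimal place."""
    return f"{float(value):.1%}"


for name, (with_policy, blockers) in CATEGORIES.items():
    print(name, as_percent(rate(with_policy, blockers)))

# Woodworking (5 of 10) and Quilting (4 of 8) tie exactly despite different counts.
assert rate(10, 5) == rate(8, 4)

# Hypothetical: one more blocker in a category with seven parseable files.
print(as_percent(rate(7, 4)))  # 57.1%
```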

There is a creator-economy angle underneath the split, too. The cosplayers whose work fills these galleries often sell prints, patterns, and commissions, so the platforms hosting their photography carry an implicit duty to protect that imagery from uncompensated training use. A gallery that disallows AI crawlers is, in effect, acting on behalf of its contributors.

The tutorial and retail allowers face the opposite incentive: their material is meant to be discovered, quoted, and acted on, so being readable to an answer engine is a feature rather than a leak. The near-even split is what you would expect when a category contains both kinds of business.

The Operator-Level Picture Across the Corpus

When a cosplay site blocks "an AI crawler," it is naming a specific operator. The focused cut below shows the most-disallowed operators across all 867 sites; Common Crawl leads, with Anthropic and OpenAI close behind.

Operator	Sites Disallowing (all 867 sites)
Common Crawl	194
Anthropic	184
OpenAI	175
Meta	166
ByteDance	163

A gallery platform protecting its imagery typically names these same operators. For the same dynamic in a category that gates at the identical 44.4% rate, see the bonsai AI-crawler report; for a hobby a notch lower, see the metal detecting crawler-blocking breakdown.

The tight clustering at the top — Common Crawl at 194, Anthropic at 184, OpenAI at 175 — reflects that publishers who gate one major operator usually gate the others in the same file. Operators further down, such as Apple at 131 and Cohere at 99, are named more selectively. A cosplay gallery weighing whether to disallow rarely picks one bot and stops; it tends to treat the major operators as a group.
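
A per-operator tally of this kind could be sketched as follows. The token-to-operator map is an illustrative assumption (the report does not publish its own mapping here), and the sample domains are hypothetical. A site counts once per operator, even if it names several of that operator's tokens.

```python
from collections import Counter
from typing import Dict, List

# Illustrative map from robots.txt user-agent tokens to operators (an assumption).
OPERATOR_BY_TOKEN = {
    "gptbot": "OpenAI",
    "chatgpt-user": "OpenAI",
    "claudebot": "Anthropic",
    "ccbot": "Common Crawl",
    "bytespider": "ByteDance",
    "meta-externalagent": "Meta",
}


def sites_per_operator(disallowed_by_site: Dict[str, List[str]]) -> Counter:
    """Count how many sites disallow at least one token of each operator."""
    counts: Counter = Counter()
    for tokens in disallowed_by_site.values():
        # Collapse tokens to a set of operators so a site registers once each.
        operators = {
            OPERATOR_BY_TOKEN[t.lower()]
            for t in tokens
            if t.lower() in OPERATOR_BY_TOKEN
        }
        counts.update(operators)
    return counts


# Hypothetical sample, not snapshot data.
sample = {
    "gallery.example": ["GPTBot", "ChatGPT-User", "ClaudeBot", "CCBot"],
    "forum.example": ["CCBot"],
    "shop.example": [],
}
counts = sites_per_operator(sample)
print(counts["Common Crawl"], counts["OpenAI"], counts["Anthropic"])  # 2 1 1
```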

How the Snapshot Was Sealed

Every figure here is a verbatim count from a sealed snapshot of public robots.txt files captured 14 June 2026 and content-addressed with the sha 4247236167461a45. We fetch each site's published file, parse the AI user-agent tokens it disallows, and seal the result; nothing is estimated, modeled, or extrapolated. A site with no parseable file is recorded as exactly that.
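
Content-addressing of this kind can be illustrated with a short sketch. How the published identifier is actually derived is not stated in this report; the canonical-JSON serialization, the function name seal_snapshot, the sample records, and the 16-character prefix are all assumptions made for illustration.

```python
import hashlib
import json


def seal_snapshot(records: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON encoding of records."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical snapshot content, not the report's data.
snapshot = {
    "captured": "2026-06-14",
    "sites": {"gallery.example": ["GPTBot"], "shop.example": []},
}
digest = seal_snapshot(snapshot)
print(digest[:16])  # a short prefix, assumed here to resemble the published sha

# Key order does not change the seal ...
reordered = {
    "sites": {"shop.example": [], "gallery.example": ["GPTBot"]},
    "captured": "2026-06-14",
}
assert seal_snapshot(reordered) == digest

# ... but any change to the content does.
changed = {
    "captured": "2026-06-14",
    "sites": {"gallery.example": [], "shop.example": []},
}
assert seal_snapshot(changed) != digest
```

Sorting keys before hashing is the design choice that matters: two captures with identical content produce the same digest regardless of how the records were assembled.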

The edition behind this category spans 1038 sites overall, 867 with a parseable robots.txt, across 104 content categories. The llms.txt signal appeared on 216 of the 867 sites with a parseable robots.txt, or 24.9%.

Sealing the snapshot rather than checking live is what makes the 4 of 9 reproducible. Sites edit their robots.txt over time, so a live re-query would drift; a content-addressed snapshot pins the result to a single moment, 14 June 2026. For cosplay, that means the next snapshot can show precisely which of the four blockers eased up or which of the five allowers tightened — change measured against a fixed baseline rather than a moving one.
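
Measuring change against a fixed baseline amounts to a set difference per site. The sketch below assumes each snapshot maps a domain to the set of AI user-agent tokens it disallows; that shape, the function name diff_snapshots, and the sample domains are hypothetical.

```python
from typing import Dict, FrozenSet, List

Snapshot = Dict[str, FrozenSet[str]]


def diff_snapshots(baseline: Snapshot,
                   current: Snapshot) -> Dict[str, Dict[str, List[str]]]:
    """Report, per domain present in both snapshots, which tokens changed."""
    changes: Dict[str, Dict[str, List[str]]] = {}
    for domain in sorted(baseline.keys() & current.keys()):
        added = sorted(current[domain] - baseline[domain])
        removed = sorted(baseline[domain] - current[domain])
        if added or removed:
            changes[domain] = {
                "newly_disallowed": added,        # site tightened
                "no_longer_disallowed": removed,  # site eased up
            }
    return changes


# Hypothetical snapshots, not the report's data.
baseline = {
    "gallery.example": frozenset({"GPTBot", "CCBot"}),
    "shop.example": frozenset(),
    "tutorial.example": frozenset(),
}
current = {
    "gallery.example": frozenset({"GPTBot"}),
    "shop.example": frozenset({"ClaudeBot"}),
    "tutorial.example": frozenset(),
}
changes = diff_snapshots(baseline, current)
for domain, change in changes.items():
    print(domain, change)
```

In this sample, gallery.example eased up by dropping CCBot, shop.example tightened by adding ClaudeBot, and tutorial.example is unchanged and so does not appear.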

Frequently Asked Questions

Q: Which cosplay sites block AI crawlers?

A: Four sites — cosplay.com, worldcosplay.net, acparadise.com, and simcosplay.com — disallow at least one AI user-agent. The five with a parseable file that allow all crawlers are kamuicosplay.com, cosplaytutorial.com, cosplayisland.co.uk, cosplayhouse.com, and cosplaysky.com.

Q: Why might cosplay gate more than the average category?

A: Its content is heavily image-based. Cosplay galleries are original costume photography, and platforms hosting that work have a concrete reason to keep it out of model training — which is why the gallery and marketplace sites block while the tutorial and retail sites stay open.

Q: What does herostime.com showing no result mean?

A: herostime.com returned no parseable robots.txt. That is not a block and not an allow: the site publishes no crawl preference, so a compliant crawler generally reads it as open by default. It is counted among the 10 sites but sits outside the 9 with a parseable policy.

Q: Does a disallow line in robots.txt actually keep a crawler out?

A: Not by force. robots.txt is an honor-system convention — a compliant crawler reads the file and respects the disallow, but the line is a stated request, not a technical wall. The 4 of 9 figure measures published intent, not enforced exclusion.

Q: Why do the tutorial and retail cosplay sites stay open?

A: Their value is in being found. Tutorial destinations like cosplaytutorial.com and retailers like cosplaysky.com gain from AI assistants surfacing their guides and products, so a permissive policy serves them — unlike the gallery platforms, whose original photography is the asset they have reason to withhold.

Put AI-Access Data to Work

The lead buyer for this data is an AI-search or GEO agency tracking which client-eligible corpora remain crawlable across many categories. For cosplay, the recurring job is to re-crawl the set weekly and alert the moment a permissive site such as cosplaysky.com or kamuicosplay.com adds a new AI user-agent to its disallow list, since a flip can pull a client's pages out of an answer engine.

A category-native buyer fits second: a cosplay-costume marketplace growth lead watching whether rival galleries like worldcosplay.net loosen or tighten access to AI shopping and discovery assistants. US Tech Automations runs that monitoring as scheduled robots.txt and llms.txt crawls with change alerts on the domains you name. See the build on the agentic workflows platform.

Key Takeaways

Of 10 cosplay sites, 9 returned a parseable robots.txt and 4 of those block at least one AI crawler — a 44.4% rate.
The blockers are cosplay.com, worldcosplay.net, acparadise.com, and simcosplay.com; the allowers include kamuicosplay.com, cosplaytutorial.com, and cosplaysky.com.
Cosplay sits above the 30% corpus baseline and ties bonsai, Automotive, Watches, and Birding at 44.4%.
Common Crawl draws the most disallow tokens corpus-wide at 194, with Anthropic at 184 and OpenAI at 175.

This snapshot of cosplay sites is one slice of a wider dataset; read how many top websites block AI crawlers for the cross-industry view.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 14, 2026 (snapshot sha 4247236167461a45).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Cosplay Sites Block AI Crawlers? 4 of 9 Do.” https://ustechautomations.com/resources/blog/do-cosplay-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 4247236167461a45
