Do Geocaching Sites Block AI Crawlers? Zero of 4 Do

Jun 14, 2026

Not one Geocaching site we could read has put up a rule against AI crawlers. Of the 10 sites we checked, only 4 returned a parseable robots.txt — and every one of those four allows all crawlers. There is no gatekeeper in this slice to point to.

The smaller-than-usual readable set is the headline here. A robots.txt is the plain-text file at a site's root that tells automated visitors which paths they may fetch; an AI crawler is the bot that pulls pages to train or feed a model. Six of the ten Geocaching sites returned nothing parseable at all, so this is a category where most operators have not written a published policy in either direction — and the four who did leave the door open.

This report is one sealed reading, not a trend. We fetched each site's public robots.txt once, on a single day, and counted only what the files literally contain. A site with no parseable file is reported as having no published policy — we do not invent a stance for it, and we do not project what it might do next.

There is no comparison to a prior month because the snapshot holds a single point in time. So when the report says four sites allow everything and six published nothing readable, those are verbatim counts, and the analysis that follows rests entirely on them.

0 of 4 Geocaching sites block any AI crawler.

Reading a Clean-Zero Category

Of the 10 Geocaching sites we checked, exactly 4 returned a parseable robots.txt: geocaching.com, opencaching.us, geocachingtoolbox.com, and project-gc.com. Each of those four publishes a policy that names no AI user-agent in a disallow rule, so each allows every crawler we looked for. That is the whole readable picture — four open files, zero blocks.
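The check described above can be pictured with a short Python sketch built on the standard library's urllib.robotparser. It is not the pipeline behind this report. The bot list is simply the five tokens shown in the corpus-wide leaderboard, both file bodies are hypothetical, and testing only the root path is a simplifying assumption.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset of AI crawler tokens (assumption: the report may track more).
AI_BOTS = ["CCBot", "ClaudeBot", "GPTBot", "Bytespider", "Meta-ExternalAgent"]


def blocked_ai_bots(robots_txt: str, path: str = "/") -> list[str]:
    """Return the AI bot tokens that may not fetch `path` under this robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, path)]


# Hypothetical file bodies, not copied from any site named in this report.
open_file = "User-agent: *\nDisallow: /admin/\n"
gated_file = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nDisallow:\n"

print(blocked_ai_bots(open_file))   # []
print(blocked_ai_bots(gated_file))  # ['GPTBot']
```

An empty result for a file is what this report calls "allows all crawlers"; a file that blocks a bot only on deeper paths would need a broader check than this root-only sketch.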

The other six sites we checked returned no parseable robots.txt: cachly.com, groundspeak.com, thegeocachingjunkie.com, geocachingblog.com, opencaching.com, and geowoodstock.com. We do not infer a stance from a missing file. Under the Robots Exclusion Protocol (RFC 9309), a crawler may treat an absent file, such as a 404 response, as permission to fetch, so functionally these read open. But the absence is silence, not a decision, and it is why the readable base for Geocaching is just four sites, not ten.
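The gap between what a report records and what a crawler may do can be sketched as a small classifier. This is a hedged illustration only: the function and field names are hypothetical, the test for "parseable" (at least one recognized directive) is an assumption rather than this report's definition, and the crawler defaults follow RFC 9309's handling of missing and unreachable files.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    reported_as: str      # what a report like this one would record
    crawler_default: str  # how an RFC 9309 crawler may behave


DIRECTIVES = {"user-agent", "allow", "disallow"}


def has_directive(body: str) -> bool:
    """True if any line looks like a robots.txt rule (assumed parseability test)."""
    for line in body.splitlines():
        line = line.split("#", 1)[0]          # drop trailing comments
        key, sep, _ = line.partition(":")
        if sep and key.strip().lower() in DIRECTIVES:
            return True
    return False


def classify(status: int, body: str = "") -> Reading:
    """`status` is the final HTTP status after any redirects were followed."""
    if 200 <= status < 300 and has_directive(body):
        return Reading("published policy", "follow the file's rules")
    if 200 <= status < 300 or 400 <= status < 500:
        # No usable rules, or no file at all: silence, not a decision.
        return Reading("no published policy", "may fetch any path")
    if status >= 500:
        # Server errors are treated as unreachable, not as permission.
        return Reading("no published policy", "should assume full disallow")
    raise ValueError(f"unexpected final status: {status}")


print(classify(404))
print(classify(200, "User-agent: *\nDisallow:\n"))
```

In every branch except the first, the report-side label stays "no published policy", which mirrors the rule above of never inventing a stance for a site.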

Only 4 of the 10 Geocaching sites we checked returned a parseable robots.txt, and all four allow every crawler.

A zero block rate on such a thin readable base is a different signal from a zero on a fully covered category. For contrast, the permissive posture of candlemaking sites rests on a larger readable set of eight files, while a vertical that gates more shows what a defensive hobby looks like.

Why a Geocaching Hobby Leaves robots.txt Open

Geocaching runs on shared coordinates, logs, and trackable codes — the entire pastime depends on data being findable. A community whose value is "go here, find this" has little instinct to wall content off from automated readers, and that shows in a clean-zero result across every file we could parse.

The four readable sites span the hobby's core surfaces: a primary listing platform, an open-data alternative, a toolbox, and a stats project. None of them treats its public pages as proprietary in a way that would justify a crawler block, so the open policies are consistent with how the hobby works rather than a coordinated choice.

A future block would be a real signal. If geocaching.com or project-gc.com ever added a GPTBot or CCBot disallow rule, it would mark a shift from "findable is the point" toward treating member logs and trail data as something to protect — the same instinct that drives blocking in higher-stakes categories.

There is also a practical reason the readable set is so thin. Several of the no-policy sites are small community blogs and apps where a robots.txt was simply never configured — geocachingblog.com and thegeocachingjunkie.com read like personal or hobbyist projects, and groundspeak.com is a corporate parent whose public surface may route through other domains. None of that implies a deliberate stance; it implies a category that has not yet treated crawler policy as a setting worth touching.

By contrast, the harder-gating magic-shop slice shows what happens when a community decides its text is worth protecting — its operators write robots.txt files and use them to keep crawlers out. The difference between the two hobbies is not technical sophistication but motivation: one has a reason to gate and the other does not.

Geocaching sites post a 0% AI-crawler block rate.

Who Allows the Crawlers Here

The named list is short by design — four open files and six that returned nothing parseable.

Geocaching Site	Published Policy	AI Crawler Stance
geocaching.com	robots.txt present	Allows all crawlers
opencaching.us	robots.txt present	Allows all crawlers
geocachingtoolbox.com	robots.txt present	Allows all crawlers
project-gc.com	robots.txt present	Allows all crawlers
groundspeak.com	none parseable	No published policy
cachly.com	none parseable	No published policy

The table shows two of the six no-policy sites as a representative cut; all six (cachly.com, groundspeak.com, thegeocachingjunkie.com, geocachingblog.com, opencaching.com, and geowoodstock.com) returned nothing parseable. The reader should take the four "allows all" rows as the full set of sites that actually published a decision.

The four that did publish span the hobby's working surfaces, which is part of why the open result feels representative. geocaching.com is the primary listing platform, opencaching.us is the open-data alternative, geocachingtoolbox.com supplies the utilities cachers rely on, and project-gc.com runs the stats and analytics layer. None of these treats its public pages as something to hide. A category whose four most policy-conscious sites all converge on "allow everything" is telling you something durable about how the hobby sees its own data.

Where Geocaching Sits Among the Quiet Categories

Geocaching shares the floor of the ranking with the other categories that block nothing at all. Here is the cluster at the bottom, where the published block rate is zero.

Category	Sites With robots.txt	Block Rate
Prepping	8	0%
Pickleball	10	0%
Embroidery	8	0%
Candlemaking	8	0%
Geocaching	4	0%
Pottery	9	0%
Boating	8	0%

Geocaching's distinction within this group is its small readable base: most clean-zero categories returned eight or more parseable files, while Geocaching could only be read on four. A craft like the one-blocker leathercraft slice sits just above the zero line. For the full sweep of the corpus, the extremes show how far the heaviest blockers run from this floor.

Category	Block Rate
Gaming	88.9%
News	81.3%
Tea	0%
Geocaching	0%

The Bots Sites Disallow Most, Corpus-Wide

When sites elsewhere in the corpus do block, the bot leaderboard names which crawlers they target most — counted across all 993 sites, not within Geocaching.

Bot	Sites Blocking (all 993 sites)
CCBot	211
ClaudeBot	188
GPTBot	187
Bytespider	183
Meta-ExternalAgent	162

CCBot — Common Crawl's fetcher — tops the list because its archive feeds many downstream models, making it the single broadest block a site can apply. None of that touches Geocaching, where the four readable files name no bot at all.

What the leaderboard captures is the menu of choices Geocaching sites have declined to use. Each bot token on it is a line an operator could add to a robots.txt to turn a specific crawler away, and across the corpus 285 sites have written at least one such line. The four readable Geocaching files contain none of them.
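The per-bot counts and the "at least one" count answer different questions, because one site can block several bots. A minimal Python sketch with made-up data shows the two tallies side by side; the site names and their blocks are hypothetical and do not come from the corpus.

```python
from collections import Counter

# Hypothetical per-site results; names and blocks are invented for illustration.
blocked_by_site = {
    "site-a.example": {"CCBot", "GPTBot"},
    "site-b.example": {"CCBot"},
    "site-c.example": set(),
    "site-d.example": {"GPTBot", "ClaudeBot", "CCBot"},
}

# Leaderboard view: how many sites block each bot.
per_bot = Counter(bot for bots in blocked_by_site.values() for bot in bots)

# Headline view: how many sites block at least one bot.
sites_blocking_any = sum(1 for bots in blocked_by_site.values() if bots)

print(per_bot.most_common())  # CCBot leads with 3 in this toy data
print(sites_blocking_any)     # 3
```

Because the per-bot rows overlap, they should not be summed to reach the headline figure, which is why 285 is smaller than the leaderboard's column total.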

That is the cleanest way to read a 0% rate: not as a site being unreachable, but as a category that has looked at the same set of available blocks as everyone else and chosen, so far, to apply zero. The interesting question is which site moves first, and that is exactly what a monitoring cadence is built to catch.
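Catching the first mover comes down to comparing two readings of the same file. The sketch below is one hedged way to do that with Python's urllib.robotparser. The watched tokens, the function names, and both file bodies are assumptions for illustration, and it checks only the root path.

```python
from urllib.robotparser import RobotFileParser

WATCHED_BOTS = ("GPTBot", "CCBot")  # illustrative watchlist


def blocked(robots_txt: str) -> set[str]:
    """Watched bots that may not fetch the site root under this file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot for bot in WATCHED_BOTS if not parser.can_fetch(bot, "/")}


def newly_blocked(previous: str, current: str) -> set[str]:
    """Bots blocked in the current reading that were allowed in the previous one."""
    return blocked(current) - blocked(previous)


# Hypothetical snapshots of one site's robots.txt, a week apart.
last_week = "User-agent: *\nDisallow:\n"
this_week = "User-agent: CCBot\nDisallow: /\n\nUser-agent: *\nDisallow:\n"

alerts = newly_blocked(last_week, this_week)
print(alerts)  # {'CCBot'}
```

An empty set means nothing changed for the watched bots; any non-empty set is the first edit a clean-zero category would produce.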

Corpus-wide, 285 of 993 sites block at least one AI crawler, while every readable Geocaching file allows them.

Key Takeaways

Of 10 Geocaching sites checked, only 4 returned a parseable robots.txt, and all 4 allow every AI crawler.
The 0% block rate places Geocaching at the floor of the ranking alongside Embroidery, Candlemaking, and Pickleball.
Six sites returned nothing parseable, so the readable base is just four — a thinner foundation than most clean-zero categories.
Corpus-wide, 285 of 993 sites — 28.7% — block at least one AI crawler, a posture absent from this category.
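The rates above use different denominators, which a few lines of Python make explicit. The helper name pct is illustrative; the counts are the ones stated in this report.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


readable_share = pct(4, 10)           # sites with a parseable robots.txt
geocaching_block_rate = pct(0, 4)     # denominator is readable files, not sites checked
corpus_block_rate = pct(285, 993)     # sites blocking at least one AI crawler

print(readable_share, geocaching_block_rate, corpus_block_rate)  # 40.0 0.0 28.7
```

The category rate is computed over the four readable files only, so the six silent sites neither raise nor lower it.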

Frequently Asked Questions

Q: Why does Geocaching show only 4 sites with a robots.txt?

A: Of the 10 sites we checked, six — including groundspeak.com and cachly.com — returned nothing parseable, so only geocaching.com, opencaching.us, geocachingtoolbox.com, and project-gc.com had a readable policy. We report only what the files actually contain, so the readable base for Geocaching is four, not ten.

Q: Does a 0% block rate mean Geocaching sites cannot be crawled?

A: The opposite. A 0% block rate means every readable file allows AI crawlers; none of the four published a disallow rule against a model bot. The six sites with no parseable file also default to allowed under the standard, so functionally the whole category reads open.

Q: Would it matter if a Geocaching site started blocking AI crawlers?

A: Yes. A block on geocaching.com or project-gc.com would mark a shift from "findable is the point" toward protecting member logs and trail data. Detecting that first edit is the whole reason to monitor a clean-zero category rather than assume it stays open.

Q: How do you know none of these sites block a crawler?

A: We read the public robots.txt of each of the four sites that published one and found no AI user-agent in any disallow rule. The figures are verbatim counts from sealed public files — nothing is estimated, modeled, or extrapolated, and a missing file is reported as missing, not guessed.

Put AI-Access Data to Work

The buyer this clean-zero slice fits first is a horizontal one: an AI-search and generative engine optimization (GEO) agency tracking which client-eligible sites stay crawlable. For an agency, the recurring job is to re-crawl a watchlist that includes geocaching.com and project-gc.com weekly and alert the moment any of the four currently-open sites adds a GPTBot or CCBot token to its disallow list. In a 0% category, the first block is the entire story, and catching it early protects whether a client's pages can surface in answer engines. A market-research lead can run the same cadence to map where AI access is drifting.

The second ideal customer profile (ICP), this one native to the category, is a geocaching-gear and GPS retailer selling trackables and handhelds online, who can watch whether the listing platforms its referral traffic depends on stay open to model crawlers. US Tech Automations automates that watch with scheduled robots.txt crawls, change alerts, and an AI-access policy dashboard. See how the platform runs it on our agentic workflows.

This snapshot of Geocaching sites is one slice of a wider dataset; read how many top websites block AI crawlers for the cross-industry view.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 14, 2026 (snapshot sha 5d5458529dab2773).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Geocaching Sites Block AI Crawlers? Zero of 4 Do.” https://ustechautomations.com/resources/blog/do-geocaching-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 5d5458529dab2773