Do Ham Radio Sites Block AI Crawlers? 2 of 6 Do

Jun 14, 2026

The most distinctive thing about ham radio is not its block rate — it is how many of its sites publish no policy at all. Of the 10 amateur-radio sites we checked, only 6 returned a parseable robots.txt, the lowest coverage among the hobbies in this batch. Of those 6, 2 block at least one AI crawler — a 33.3% rate that sits just above the corpus average.

That low coverage is the headline. Four ham radio sites have no robots.txt to read at all, which is a lot of silence for one small vertical. Among the sites that did publish, qsl.net and dxengineering.com are the two that gate; the rest open the door.

2 of 6 Ham Radio sites block at least one AI crawler.

This report comes from one sealed snapshot of public robots.txt files. A robots.txt file tells automated agents which parts of a site they may crawl; an AI crawler is a bot that collects pages for a language model. We read what each site published, recorded it, and counted. Nothing here is inferred.

Who Gates the Crawlers Here

The two blockers are qsl.net and dxengineering.com. The first is a long-running hub of operator pages and call-sign hosting; the second is a major equipment retailer. Both have a reason to fence: one holds a deep archive of user-generated pages, the other a large product catalog, and each is the kind of asset a model would happily ingest.

The allowers are arrl.org, qrz.com, eham.net, and hamradio.com — the national association, a call-sign lookup, a reviews community, and a retailer. All four returned a robots.txt that permits every crawler we tested. The category's biggest institutional site, in other words, leaves the gate open.

Of the 10 Ham Radio sites checked, 6 returned a parseable robots.txt, and 2 block at least one AI crawler.

Then there is the silence. dxsummit.fi, universal-radio.com, hamuniverse.com, and ac6v.com returned no parseable robots.txt at all. These are older, hand-built corners of the amateur web, and the absence of a policy fits that character — many predate the era when AI crawling was a concern anyone configured for.

Four Ham Radio sites returned no parseable robots.txt at all.

That silence reshapes how the rest of the numbers should be read. When four of ten sites publish no rule, the category's effective openness is higher than its block rate alone suggests — those four are crawlable by default, on top of the four that explicitly allow. Only two sites in the entire vertical actively restrict anything.

For a hobby whose web grew out of personal pages, BBS-era utilities, and volunteer-run reference archives, that mix of permission and silence is exactly what you would expect, and it is the honest shape of the data. The pottery report shows a craft vertical with far higher policy coverage and no blocks at all.

Only 2 of 6 Ham Radio sites with a policy gate a crawler.

What This Block Rate Actually Means

Across the corpus, 260 of 867 sites block at least one AI crawler — a 30% rate. Ham radio's 33.3% sits just above that line, but the figure rests on a small base of only 6 sites with a policy. Read it carefully: the rate is ordinary, even unremarkable, and the real signal is the coverage gap beneath it.

Corpus-wide, 260 of 867 sites block at least one AI crawler, a 30% rate.
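The arithmetic behind these rates is simple enough to check by hand. The short sketch below, with a hypothetical helper named block_rate, reproduces the two published figures and then shows how far the ham radio rate would move if one of its six policy-publishing sites changed its stance. The one-blocker and three-blocker rows are what-if values, not counts from the snapshot.

```python
def block_rate(blockers: int, sites_with_policy: int) -> float:
    """Percentage of policy-publishing sites that block at least one AI bot."""
    if sites_with_policy <= 0:
        raise ValueError("need at least one site with a parseable robots.txt")
    return 100 * blockers / sites_with_policy


# Published counts from this report.
corpus_rate = block_rate(260, 867)
ham_rate = block_rate(2, 6)
print(f"corpus: {corpus_rate:.1f}%   ham radio: {ham_rate:.1f}%")

# What-if: one site out of six flips in either direction (illustrative only).
for blockers in (1, 2, 3):
    print(f"{blockers} of 6 blocking -> {block_rate(blockers, 6):.1f}%")
```

On a base of six, a single site changing its file shifts the rate by roughly 17 percentage points, which is why the rate should be read alongside the coverage count rather than on its own.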

A category where four of ten sites publish no robots.txt is one where a large share of the web is governed by no written rule. That is neither a block nor an invitation; it is a vacuum a compliant crawler treats as open by default. For amateur radio, a hobby with a long do-it-yourself web tradition, that vacuum is the most honest thing the data shows.

Guarding against over-reading matters here. It would be easy to dramatize 33.3% as "ham radio gates more than average," but the base is six sites, and two of them are a retailer and an archive with obvious data to protect. Strip those, and the institutional anchors of the hobby — the national association at arrl.org, the call-sign lookup at qrz.com — are wide open. The category is not defensive; it is uneven, with a few resourced sites making deliberate choices and a long tail making none at all.

The focused window below places ham radio among the categories ranked nearest it. Every value is a verbatim sealed count.

Ham Radio and Its Nearest Neighbors

Category	Sites	With robots.txt	Block ≥1 AI bot	Block rate
Golf	10	8	3	37.5%
Antiques	10	8	3	37.5%
Travel	9	9	3	33.3%
Weather	10	6	2	33.3%
Beauty	10	6	2	33.3%
Agriculture	10	9	3	33.3%
Ham Radio	10	6	2	33.3%
Yoga	10	10	3	30%
Beekeeping	10	10	3	30%

Ham radio sits in a band of everyday categories — travel, weather, agriculture — none gating far from the corpus norm. Notably, weather and beauty share its exact shape: 6 sites with a policy, 2 of them blocking. For contrast, the corpus extremes look nothing like this middle ground.

Category	Sites	With robots.txt	Block ≥1 AI bot	Block rate
Gaming	9	9	8	88.9%
News	20	16	13	81.3%
Prepping	10	8	0	0%

Gaming and news lock down; prepping sits at zero. Ham radio lands squarely in between, just above average. The prepping report covers a sibling preparedness hobby that gates nothing at all.

Which Bots Are Blocked Most Across the Corpus

A category rate does not reveal which crawlers get named. The bot leaderboard below counts, across all 867 sites in the snapshot, how many disallow each named AI bot. This is corpus-wide context; the two ham radio blockers contribute to these totals but are a tiny share of them.

Bot	Sites disallowing (all 867 sites)	Share
CCBot	194	22.4%
ClaudeBot	171	19.7%
GPTBot	170	19.6%
Bytespider	163	18.8%
Meta-ExternalAgent	145	16.7%

CCBot leads at 194 sites, with ClaudeBot and GPTBot just behind. A site like dxengineering.com that decides to fence its catalog usually names several of these at once. The stamp collecting report covers a collecting hobby where more sites name these same bots.
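A leaderboard of this kind can be built by tallying, for each bot token, how many sites disallow it. The sketch below is one way to do that, not the report's own tooling: the function name bot_leaderboard is an assumption, and the four example domains and their blocked-bot lists are hypothetical placeholders rather than sites from the snapshot.

```python
from collections import Counter


def bot_leaderboard(blocked_by_site: dict[str, list[str]]) -> list[tuple[str, int, float]]:
    """Return (bot, sites disallowing, share %) rows, most-blocked first."""
    total = len(blocked_by_site)
    if total == 0:
        return []
    # set() so a bot listed twice in one file still counts once per site.
    tally = Counter(bot for bots in blocked_by_site.values() for bot in set(bots))
    # Sort by count descending, then by name so ties are deterministic.
    ranked = sorted(tally.items(), key=lambda item: (-item[1], item[0]))
    return [(bot, n, round(100 * n / total, 1)) for bot, n in ranked]


# Hypothetical input: site -> bot tokens its robots.txt disallows.
EXAMPLE = {
    "a.example": ["CCBot", "GPTBot"],
    "b.example": ["CCBot"],
    "c.example": [],
    "d.example": ["GPTBot", "CCBot", "ClaudeBot"],
}

for bot, sites, share in bot_leaderboard(EXAMPLE):
    print(f"{bot:<12}{sites:>3}  {share}%")
```

The share is measured against every site with a readable policy, including those that block nothing, which matches how the table above divides each count by all 867 sites.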

The leaderboard also explains why a low-coverage category like ham radio barely registers in the corpus totals. With only two blockers, ham radio can add at most a handful of entries to a list where CCBot alone is named by 194 sites. The disallow energy in the snapshot lives in heavily commercial and editorial verticals, not in volunteer-run hobby webs. When you read a corpus-wide figure, you are mostly reading the choices of news sites, gaming sites, and large retailers — categories ham radio sits nowhere near.

How the Snapshot Was Sealed

We fetched each site's public robots.txt, parsed it for AI user-agent rules, recorded the outcome, and sealed the file set. The figures here are verbatim counts; nothing is estimated, modeled, or extrapolated. The 2 of 6 result and the four no-policy sites are direct reads of what each domain published — no sampling, no projection.
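The parse step described above can be sketched with Python's standard-library robots.txt parser. This is a minimal illustration, not the pipeline used for this report. The function name blocked_ai_bots, the sample file, and the rule that a bot counts as blocked when it may not fetch the site root are all assumptions.

```python
from urllib.robotparser import RobotFileParser

# Bot tokens taken from the leaderboard in this report.
AI_BOTS = ["CCBot", "ClaudeBot", "GPTBot", "Bytespider", "Meta-ExternalAgent"]


def blocked_ai_bots(robots_txt: str, bots: list[str] = AI_BOTS) -> list[str]:
    """Return the bots that this robots.txt text bars from the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Assumption: "blocked" means the bot may not fetch "/".
    return [bot for bot in bots if not parser.can_fetch(bot, "/")]


# Hypothetical robots.txt content, not copied from any site in the snapshot.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_ai_bots(SAMPLE))
```

Under this definition a blanket "User-agent: *" rule with "Disallow: /" would also count as blocking every listed bot, and a file that only fences off subdirectories would count as open. A production check would need to decide both cases explicitly.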

A missing robots.txt is not a block — it is the absence of any published rule.

The snapshot is content-addressed under sha 4247236167461a45 and dated 14 June 2026. It covers 1038 sites overall across 104 categories, 867 of them with a parseable robots.txt; 216 sites, 24.9% when measured against those 867, also publish an llms.txt file. The ham radio slice is one small, low-coverage window into that whole.
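Content addressing means the identifier is derived from the bytes of the data itself, so any change to a stored file yields a different identifier. The report does not say how its sha is computed, so the sketch below is only one plausible scheme: the function name snapshot_digest, the sorted-by-domain ordering, and the length-prefixed framing are all assumptions, and the two example files are hypothetical.

```python
import hashlib


def snapshot_digest(files: dict[str, bytes]) -> str:
    """SHA-256 over a file set, independent of the order files were added."""
    digest = hashlib.sha256()
    for domain in sorted(files):
        body = files[domain]
        # Frame each record so ("ab", "c") and ("a", "bc") cannot collide.
        digest.update(domain.encode("utf-8") + b"\0")
        digest.update(len(body).to_bytes(8, "big"))
        digest.update(body)
    return digest.hexdigest()


# Hypothetical file set for illustration.
files = {
    "a.example": b"User-agent: *\nDisallow:\n",
    "b.example": b"User-agent: GPTBot\nDisallow: /\n",
}
print(snapshot_digest(files)[:16])  # a short prefix, like the one quoted above
```

Sorting the domains first makes the digest reproducible regardless of fetch order, which is the property a sealed snapshot needs if others are to verify it.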

The coverage caveat is the one to carry forward. In a ten-site category we can read at most ten policies; for ham radio we could read only six, because four sites published nothing parseable. That changes how to read every downstream figure: the 33.3% rate describes the six sites with a rule, while the four silent sites sit outside it entirely.

A careful analyst treats those four as a separate bucket, neither allowing nor blocking on the record, rather than folding them into either side. This is a deliberate methodological choice, not an oversight: counting silence as permission, or as a block, would manufacture certainty the published files do not support. The honest reading is that four of ten ham radio sites are ungoverned by any written rule, four explicitly allow, and two clear blockers sit at the category's commercial and archival edges.
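The three-bucket reading can be made explicit in a few lines. The sketch below uses the site outcomes named in this report; the Policy enum and variable names are illustrative choices, not part of the published dataset.

```python
from collections import Counter
from enum import Enum


class Policy(Enum):
    BLOCKS = "blocks at least one AI crawler"
    ALLOWS = "permits every crawler tested"
    SILENT = "no parseable robots.txt"


# Outcomes as reported for the ten ham radio sites.
HAM_RADIO = {
    "qsl.net": Policy.BLOCKS,
    "dxengineering.com": Policy.BLOCKS,
    "arrl.org": Policy.ALLOWS,
    "qrz.com": Policy.ALLOWS,
    "eham.net": Policy.ALLOWS,
    "hamradio.com": Policy.ALLOWS,
    "dxsummit.fi": Policy.SILENT,
    "universal-radio.com": Policy.SILENT,
    "hamuniverse.com": Policy.SILENT,
    "ac6v.com": Policy.SILENT,
}

counts = Counter(HAM_RADIO.values())
# Silent sites stay out of the denominator instead of being counted as open.
with_policy = counts[Policy.BLOCKS] + counts[Policy.ALLOWS]
rate = 100 * counts[Policy.BLOCKS] / with_policy

for policy in Policy:
    print(f"{policy.name:<7}{counts[policy]}")
print(f"block rate among sites with a policy: {rate:.1f}%")
```

Keeping SILENT as its own value means the denominator is always the set of sites with a readable file, so the rate never silently absorbs sites that published nothing.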

Put AI-Access Data to Work

An AI-search and GEO agency tracking which client corpora remain eligible for AI ingestion is the buyer this data serves first. Such a team can re-crawl the ham radio set weekly and alert the moment a silent site like hamuniverse.com adds a robots.txt, or a permissive one like qrz.com adds an AI bot token to its disallow list — either change shifts whether that content can surface in an answer.

The recurring job is watching both the coverage gap and the 2 of 6 baseline for drift, not reading them once. A brand-intelligence analyst monitoring AI-access across many hobby categories is the natural second buyer.
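Watching for drift comes down to comparing two snapshots site by site. The following is a minimal sketch of that comparison, assuming each snapshot is reduced to a site-to-status mapping. The function name policy_changes and the status labels are illustrative, the baseline mirrors this report, and the second snapshot is entirely hypothetical.

```python
def policy_changes(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """List every site whose status differs between two snapshots."""
    alerts = []
    for site in sorted(previous.keys() | current.keys()):
        # A site missing from a snapshot is treated as having no policy.
        before = previous.get(site, "silent")
        after = current.get(site, "silent")
        if before != after:
            alerts.append(f"{site}: {before} -> {after}")
    return alerts


# Baseline statuses as reported here.
BASELINE = {
    "qsl.net": "blocks", "dxengineering.com": "blocks",
    "arrl.org": "allows", "qrz.com": "allows",
    "eham.net": "allows", "hamradio.com": "allows",
    "dxsummit.fi": "silent", "universal-radio.com": "silent",
    "hamuniverse.com": "silent", "ac6v.com": "silent",
}

# Hypothetical later crawl: one silent site publishes a file, one allower adds a block.
HYPOTHETICAL_NEXT = {**BASELINE, "hamuniverse.com": "allows", "qrz.com": "blocks"}

for alert in policy_changes(BASELINE, HYPOTHETICAL_NEXT):
    print(alert)
```

Both kinds of change surface in the same list: a move out of the silent bucket alters coverage, while a move between allows and blocks alters the rate.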

The category-native role is a ham-radio-gear retailer's marketing lead who wants to know whether competitors like dxengineering.com or hamradio.com stay open to AI discovery, since AI-answer visibility can route buyers to a rival storefront. US Tech Automations automates this monitoring with scheduled robots.txt and llms.txt crawls, change alerts, and an AI-access dashboard. See how agentic monitoring workflows run.

Frequently Asked Questions

Q: Does a site without a robots.txt block AI crawlers by default?

A: No — the opposite. With no robots.txt, there is no rule to obey, and compliant crawlers treat the absence as permission. The four silent ham radio sites are effectively open to bots, even though they published no policy saying so.

Q: Which Ham Radio sites block AI crawlers?

A: qsl.net and dxengineering.com are the 2 of 6 that disallow at least one AI crawler. One hosts a large archive of operator pages; the other is an equipment retailer with a sizable catalog — both have data worth fencing.

Q: Why does Ham Radio have so few sites with a robots.txt?

A: Only 6 of 10 ham radio sites returned a parseable robots.txt, the lowest coverage in this batch. Several of the silent sites are older do-it-yourself sites that predate AI-crawl concerns, which likely explains why they publish no policy file.

Q: Is the 33.3% block rate meaningful on so small a base?

A: It is real but thin. The rate rests on just 6 sites with a policy, so the more durable signal is the four sites with no policy. A single change among the six would move the rate sharply, which is why monitoring the baseline matters.

Key Takeaways

Ham radio's block rate sits just above the corpus average, but its real story is low coverage — four of ten sites publish no robots.txt at all.

2 of 6 Ham Radio sites block at least one AI crawler — a 33.3% rate.
Only 6 of 10 ham radio sites returned a parseable robots.txt.
qsl.net and dxengineering.com are the blockers.
Corpus-wide, 260 of 867 sites block, a 30% rate.
CCBot leads the bot list at 194 sites across all 867.

See where Ham Radio sites fit in the broader trend in our study of how many top websites block AI crawlers.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 14, 2026 (snapshot sha 4247236167461a45).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Ham Radio Sites Block AI Crawlers? 2 of 6 Do.” https://ustechautomations.com/resources/blog/do-ham-radio-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 4247236167461a45
