Do Dental Sites Block AI Crawlers? 1 of 9 Do

Jun 21, 2026

A patient choosing a dentist often starts where every local-service search starts now — an AI assistant fielding "dentist near me accepting new patients" or "best Invisalign provider in [city]." That assistant reads the open web before it answers, and the robots.txt files of dental chains, associations, and consumer platforms shape whether those pages are readable at all. One organization in the dental set has decided the answer is no.

1 of 9 dental sites with a parseable robots.txt blocks at least one AI crawler: an 11.1% block rate.

Of the 12 dental domains we checked, 9 returned a parseable robots.txt — the root-level file that tells an automated agent which paths it may fetch — and 1 of those 9 disallows at least one named AI crawler. Every figure here is read straight from the sealed snapshot; nothing is estimated, modeled, or extrapolated.

Across the corpus, 318 of the 1247 sites with a parseable robots.txt block at least one crawler, a 25.5% rate. Dental sits well below that average: one holdout, eight open doors. The holdout tells a specific story about which content a dental organization decides is proprietary enough to wall off.

The One Blocker: SmileBrands

The single dental domain that blocks an AI crawler at the seal is smilebrands.com, the platform behind Bright Now! Dental and several other DSO (dental service organization) brands. SmileBrands blocks GPTBot — OpenAI's training crawler — while leaving other named agents including ClaudeBot, CCBot, and Bytespider to read its pages freely.

A GPTBot-specific block is one of the most common selective patterns in the corpus: an organization that has decided its content should not feed OpenAI's training pipeline can block that harvesting agent while leaving retrieval agents (the bots that answer live queries rather than train models) unrestricted. SmileBrands' policy is narrower than a full AI close; it is a training exclusion, not a search exclusion.
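A selective block of this kind can be sketched with Python's standard `urllib.robotparser`. The robots.txt text below is a hypothetical illustration of a GPTBot-only block, not the actual smilebrands.com file, and the URL and `/patient-portal/` path are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt showing a GPTBot-only block.
# This is NOT the real smilebrands.com file; paths are placeholders.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /patient-portal/
"""


def allowed_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Return, per user-agent token, whether the policy lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    SAMPLE_ROBOTS_TXT,
    ["GPTBot", "ClaudeBot", "CCBot", "Bytespider"],
    "https://example.com/services/invisalign",
)
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Under this sample policy only GPTBot is refused; the other three tokens fall through to the wildcard group, which restricts a portal path but not the public page.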

That distinction matters for how to read the 11.1% rate. Eight of 9 dental sites allow every named agent, including the retrieval agents that power AI-assisted local-service search. The one exception gates a training crawler, not a query-time retrieval agent.

8 of 9 dental sites with a parseable robots.txt allow every AI crawler.

What an Open Policy Means for Dental

The 8 domains that allow every named agent are a cross-section of the sector: asda.org (American Student Dental Association), ada.org (American Dental Association), dentalplans.com, dentalcare.com, colgate.com, aspen.dental, pacific-dental.com, and dentalworks.com. Professional associations, consumer insurance platforms, consumer-health brands, and large DSOs — and every one of them allows GPTBot, ClaudeBot, CCBot, Bytespider, and the rest of the named agents to read its pages.

For a service business whose growth depends on new-patient acquisition, the logic of staying open is straightforward. A patient who asks an AI assistant about a procedure, a plan, or a provider network will receive an answer drawn from whatever pages that agent can read. A dental platform that walls itself off disappears from those answers. The 8 open sites have implicitly decided that the benefit of being cited in an AI answer, and potentially converting that referral, exceeds the benefit of withholding content.

Three domains — heartlanddentalcare.com, greatlakesdental.com, and koolsmiles.com — returned no parseable robots.txt at the seal. They are therefore silent: neither an allow nor a block, and excluded from the rate entirely. That is why the denominator is 9 rather than the 12 sites we checked.
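The denominator rule can be made concrete with a short calculation. This is a minimal sketch: the domain statuses are taken from the lists in this report, while the `DENTAL` mapping and the `block_rate` helper are illustrative names, not part of the published dataset.

```python
from typing import Optional

# True  = blocks at least one named AI crawler
# False = parseable robots.txt that allows every named crawler
# None  = no parseable robots.txt at the seal (excluded from the rate)
DENTAL: dict[str, Optional[bool]] = {
    "smilebrands.com": True,
    "asda.org": False,
    "ada.org": False,
    "dentalplans.com": False,
    "dentalcare.com": False,
    "colgate.com": False,
    "aspen.dental": False,
    "pacific-dental.com": False,
    "dentalworks.com": False,
    "heartlanddentalcare.com": None,
    "greatlakesdental.com": None,
    "koolsmiles.com": None,
}


def block_rate(policies: dict[str, Optional[bool]]) -> tuple[int, int, float]:
    """Return (blockers, sites with a parseable file, block rate in percent)."""
    policied = [status for status in policies.values() if status is not None]
    blockers = sum(1 for status in policied if status)
    rate = round(100 * blockers / len(policied), 1) if policied else 0.0
    return blockers, len(policied), rate


blockers, policied, rate = block_rate(DENTAL)
print(f"{blockers} of {policied} ({rate}%)")  # prints "1 of 9 (11.1%)"
```

Counting the three silent domains as open would instead give 1 of 12, which is why the exclusion is stated explicitly.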

The contrast with a category like accounting firms, where 4 of the 8 sites with a parseable robots.txt (50%) block at least one AI crawler, is instructive. Accounting organizations treat research and guidance content as proprietary; dental chains treat the same web presence as an acquisition channel and leave it open.

Where Dental Lands Against Other Categories

An 11.1% block rate places dental well below the corpus average and in the range of other service-business categories that block infrequently. The focused window below shows comparable categories beside dental, verbatim from the sealed snapshot.

Category	Sites	With robots.txt	Block at least 1 crawler	Block rate
Insurance	10	9	1	11.1%
Marketing	10	10	1	10%
Veterinary	12	8	0	0%

Dental sits alongside Insurance and Marketing — categories where a single holdout sets the block rate in the low double digits while the rest of the set stays open. The contrast with the higher-block categories shows where the divide runs.

Category	Sites	With robots.txt	Block at least 1 crawler	Block rate
News	20	17	14	82.4%
Healthcare	10	9	6	66.7%
Accounting	10	8	4	50%

Dental posts an 11.1% AI-crawler block rate.

Healthcare, the broader health-information category, blocks at 66.7%, and Accounting blocks at 50%. Dental's 11.1% is notably lower than either adjacent professional category, which suggests a meaningful difference in how dental service organizations and associations classify their web content: more like a lead-generation storefront than a content archive.

Which Crawlers the Dental Holdout Blocks

The single dental blocker — smilebrands.com — targets GPTBot specifically. In the corpus context, GPTBot is the second-most-blocked crawler across all 1247 sites with a robots.txt.

Bot	Sites disallowing (of 1247)	Rate
CCBot	234	18.8%
GPTBot	211	16.9%
ClaudeBot	207	16.6%
Bytespider	203	16.3%
Meta-ExternalAgent	178	14.3%

CCBot, Common Crawl's harvesting agent, tops the blocklist at 234 sites. GPTBot is next at 211, and it is precisely the bot SmileBrands targets. ClaudeBot and Bytespider, neither blocked by SmileBrands, follow immediately behind. The choice to block GPTBot while allowing Anthropic's and ByteDance's crawlers is a selective posture, not a blanket AI exclusion.

Corpus-wide, 318 of 1247 sites block at least one AI crawler.

Corpus-wide, 343 of 1247 sites publish an llms.txt file.

Among the open dental sites, dentalcare.com and colgate.com publish llms.txt — the newer file that provides AI agents with a curated content map. Those two sites are not just passively open; they are actively steering AI retrieval toward the content they want agents to read.

How We Sealed the Dental Snapshot

These figures come from one point-in-time crawl of public robots.txt files, sealed June 21, 2026 under snapshot sha 1900f057e385d393. For each dental domain we fetched robots.txt at the root, parsed its user-agent and disallow directives, and recorded whether any AI crawler token was disallowed. We report verbatim counts; nothing is estimated, modeled, or extrapolated. The crawl spanned 1542 sites across 154 categories, of which 1247 returned a parseable file.

The counting rule is deliberately narrow. A block is an explicit Disallow aimed at a named AI agent — GPTBot, ClaudeBot, CCBot, and the other leaderboard tokens. A dental site can disallow patient portals, scheduling tools, or account pages without naming an AI agent, and that does not count as an AI block here. Only a directive that names one would move a site into the blocker column, which is why only smilebrands.com appears in the dental block count.
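One way to express that counting rule in code is sketched below. It is an approximation under stated assumptions, not the parser used for the snapshot: `AI_AGENTS` holds only the five tokens shown in the leaderboard table (the full token list is not given here), and the sketch treats any non-empty Disallow under a named agent as a block, a detail the rule above does not spell out.

```python
# Assumed token list: the five agents from the leaderboard table, lowercased.
AI_AGENTS = {"gptbot", "claudebot", "ccbot", "bytespider", "meta-externalagent"}


def blocked_ai_agents(robots_txt: str) -> set[str]:
    """Return the named AI agents that have a non-empty Disallow in their own group."""
    blocked: set[str] = set()
    group: list[str] = []   # user-agent tokens of the group being read
    in_rules = False        # True once the current group has a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a rule was seen, so a new group starts
                group, in_rules = [], False
            group.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            # An empty Disallow permits everything, so it is not a block.
            if field == "disallow" and value:
                blocked.update(agent for agent in group if agent in AI_AGENTS)
    return blocked


portal_only = "User-agent: *\nDisallow: /account/\n"
gptbot_block = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nDisallow: /portal/\n"
print(blocked_ai_agents(portal_only))    # set(): a wildcard rule names no AI agent
print(blocked_ai_agents(gptbot_block))   # {'gptbot'}
```

A wildcard group that hides a patient portal leaves the result empty, which matches the rule that only a directive naming an AI agent moves a site into the blocker column.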

Each domain is read once, at seal time, exactly as it answered. That single-read rule is what makes the result content-addressable: anyone holding the snapshot sealed under sha 1900f057e385d393 can re-derive the same 9 parseable files and the same 1 blocker.
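Content addressing in general can be illustrated as follows. The report does not describe how its identifier is computed, so this sketch assumes SHA-256 over a canonical JSON serialization of domain-to-file text; the `seal` function and the `.test` domains are hypothetical, and the digest it produces is not expected to match the published value.

```python
import hashlib
import json


def seal(snapshot: dict[str, str]) -> str:
    """Hash a {domain: robots.txt text} mapping in a key-order-independent way."""
    # Sorting keys and fixing separators makes the serialization canonical,
    # so identical content always yields an identical digest.
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = seal({"a.test": "User-agent: *\nDisallow:\n", "b.test": "User-agent: GPTBot\nDisallow: /\n"})
same = seal({"b.test": "User-agent: GPTBot\nDisallow: /\n", "a.test": "User-agent: *\nDisallow:\n"})
changed = seal({"a.test": "User-agent: *\nDisallow: /\n", "b.test": "User-agent: GPTBot\nDisallow: /\n"})

print(first == same)     # True: same content, same identifier
print(first == changed)  # False: any edit to a file changes the identifier
```

A second read of the same domains at a later time could return different files and therefore a different digest, which is the reason for reading each domain only once per seal.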

Frequently Asked Questions

Q: Which dental site blocks AI crawlers?

A: One: smilebrands.com, which blocks GPTBot specifically. The remaining 8 dental sites with a parseable robots.txt — asda.org, ada.org, dentalplans.com, dentalcare.com, colgate.com, aspen.dental, pacific-dental.com, and dentalworks.com — allow every named AI crawler.

Q: Why does SmileBrands block GPTBot but not other AI crawlers?

A: The robots.txt standard lets operators target specific user-agent strings. Blocking GPTBot bars OpenAI's web-crawling agent, which collects training data, while leaving the named agents from Anthropic, ByteDance, and others free to read the site's content. It is a training exclusion aimed at one operator, not a search exclusion.

Q: Does the 11.1% rate cover all the dental sites you checked?

A: No. It covers the 9 sites that returned a parseable robots.txt. Three more — heartlanddentalcare.com, greatlakesdental.com, and koolsmiles.com — produced no parseable file at the seal, so they are excluded from the rate.

Q: Does a Disallow in robots.txt actually stop an AI crawler?

A: Not by force. robots.txt is an honor-system standard: a cooperative crawler reads it and complies, but the file enforces nothing technically. Compliant crawlers from major operators like OpenAI respect the directive; non-compliant crawlers ignore it regardless of what the file says.

Put AI-Access Data to Work

For a dental DSO's digital or marketing lead, the person who owns how the organization appears in local and AI-assisted search, this snapshot is a baseline worth watching. Eight of 9 dental sites block no named AI crawler, which means an answer engine fielding a question about a procedure, a provider, or a dental plan can reach those pages.

The one exception is narrow: a training-data block on OpenAI's crawler does not remove a site from AI-assisted local search, because the retrieval agents that answer live queries operate separately from the training pipeline. But the policy space is evolving — a site that blocks GPTBot today may extend that block to ClaudeBot or CCBot next quarter.
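A shift of that kind can be detected by comparing the blocked-agent sets from two reads of the same domains. The sketch below is illustrative only: `policy_changes` is a hypothetical helper, and the domain and the "later" state are invented sample data, not observations from the snapshot.

```python
def policy_changes(
    previous: dict[str, set[str]],
    current: dict[str, set[str]],
) -> dict[str, dict[str, set[str]]]:
    """Report, per domain, which AI agents were newly blocked or unblocked."""
    changes: dict[str, dict[str, set[str]]] = {}
    for domain in sorted(previous.keys() | current.keys()):
        before = previous.get(domain, set())
        after = current.get(domain, set())
        if before != after:
            changes[domain] = {"added": after - before, "removed": before - after}
    return changes


# Invented example: a site that blocks GPTBot later extends the block to ClaudeBot.
earlier = {"example-dental.test": {"GPTBot"}, "open-dental.test": set()}
later = {"example-dental.test": {"GPTBot", "ClaudeBot"}, "open-dental.test": set()}

print(policy_changes(earlier, later))
```

Only domains whose blocked set actually changed appear in the result, so an unchanged open site produces no alert.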

US Tech Automations runs exactly that kind of scheduled robots.txt crawl with change alerts and agentic monitoring, so a policy shift surfaces the week it lands rather than at the next annual site audit. That is the difference between reacting to a lost traffic channel and watching the channel before it closes.

With agentic monitoring in place, you have a standing read on dental AI-access posture instead of a one-time count, the same way a watcher tracks adjacent categories like the insurance sites that post a comparable 11.1% block rate or the marketing sites that land at 10%.

Key Takeaways

Of the 9 dental sites with a parseable robots.txt, 1 blocks at least one AI crawler: an 11.1% block rate, well below the corpus average of 25.5%.
The single blocker is smilebrands.com, which blocks GPTBot specifically while leaving other AI crawlers unrestricted.
Eight sites — asda.org, ada.org, dentalplans.com, dentalcare.com, colgate.com, aspen.dental, pacific-dental.com, and dentalworks.com — allow every named crawler.
Three domains — heartlanddentalcare.com, greatlakesdental.com, and koolsmiles.com — returned no parseable file and are excluded from the rate.
Dental at 11.1% sits alongside Insurance and Marketing, and far below Healthcare (66.7%) and Accounting (50%).

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 21, 2026 (snapshot sha 1900f057e385d393).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Dental Sites Block AI Crawlers? 1 of 9 Do.” https://ustechautomations.com/resources/blog/do-dental-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 1900f057e385d393

Machine-readable data: CSV · JSON · All research & methodology