Do Whiskey Sites Block AI Crawlers? 3 of 8 Do

Jun 14, 2026

Whiskey media is split on artificial intelligence. Of the 10 Whiskey sites we checked, 8 returned a parseable robots.txt, and 3 of those block at least one AI crawler — a 37.5% block rate. The rest leave their tasting notes and buying guides open to every bot we track.

That puts Whiskey above the corpus-wide block rate of 31.9%. Where roughly a third of all sites push back on AI, the Whiskey vertical pushes back a little harder, and the divide runs cleanly between editorial publishers and commerce-driven catalogs.

3 of 8 Whiskey sites block at least one AI crawler.

This is a point-in-time reading from a sha256-sealed snapshot of public robots.txt files, captured 14 June 2026. A robots.txt file is the plain-text policy a site publishes at its root to tell crawlers which paths they may fetch. We report only what each site declared. We did not infer intent or estimate behavior.

Who Gates the Crawlers Here

Three Whiskey sites disallow at least one AI crawler: whiskyadvocate.com, vinepair.com, and diffordsguide.com. All three are editorial properties — review hubs and cocktail references whose archives are their product. Gating AI is consistent with publishers who treat written content as the asset to protect.

The allowers run the other way. thewhiskyexchange.com, distiller.com, whiskeyraiders.com, caskers.com, and flaviar.com all keep the door open — a mix of marketplaces and discovery tools that benefit when an assistant can read and surface their inventory. Two sites, masterofmalt.com and breakingbourbon.com, returned no robots.txt at all.

The clean way to read this is by business model. The blockers are publications: their value is the words. The allowers are largely stores and apps: their value is the transaction the words lead to. distiller.com, a tasting-and-recommendation tool, wants to be the source an assistant pulls a pairing from; caskers.com and flaviar.com want a shopper's query to surface their shelves. Openness is not indifference for them — it is distribution.

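| Whiskey Site | Returned robots.txt | Blocks any AI crawler |
| --- | --- | --- |
| whiskyadvocate.com | Yes | Yes |
| vinepair.com | Yes | Yes |
| diffordsguide.com | Yes | Yes |
| thewhiskyexchange.com | Yes | No |
| distiller.com | Yes | No |
| whiskeyraiders.com | Yes | No |
| caskers.com | Yes | No |
| flaviar.com | Yes | No |
| masterofmalt.com | No | n/a |
| breakingbourbon.com | No | n/a |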

Of 8 Whiskey sites with a robots.txt, 3 block at least one AI crawler and the rest stay open.

What This 37.5% Block Rate Actually Means

The pattern is editorial-versus-commerce, not pro-AI versus anti-AI. The three blockers are publications; the open sites lean toward retail and product discovery. A marketplace gains when an AI shopper can read its catalog; a review magazine loses when a model ingests its archive and answers the question the magazine wanted the reader to land on.

That tension is sharper in spirits than in many verticals because the editorial brand IS the buying guide. When tasting notes and rankings are the draw, protecting them in robots.txt is a defensible call.

Look at the same dynamic across drinks and it gets sharper still. Wine, the nearest sibling vertical, sits just below Whiskey at 33.3% — 3 of 9 blocking — and Tea sits at the very floor with 0 blockers. The gradient tracks how editorial each category is: the more a vertical's identity rests on original written criticism, the more of its sites gate. Spirits, with a dense layer of review publications, gates more than a retail-heavy beverage category does.

The 3-of-8 reading also hides a brand-size pattern worth naming. The blockers here are recognizable editorial names, not the marketplaces. In Whiskey, the sites with the most to protect are precisely the ones doing the protecting — the opposite of what happens in some sport verticals, where the biggest brands stay open and smaller outlets gate.

Whiskey sites post a 37.5% AI-crawler block rate.

Note the two sites with no robots.txt at all. A missing file is not a block — those sites simply published no policy, so a compliant crawler treats their paths as open by default. Absence of a rule reads as permission.
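The default can be checked with Python's standard-library robots.txt parser. This is a minimal sketch, not the pipeline behind the snapshot: the helper name `may_fetch` and the sample policy text are assumptions made for illustration, and an empty string stands in for a site that published no rules.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt: str, agent: str, path: str = "/") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# Stand-in for a site that published no policy: there are no rules to match.
no_policy = ""

# Hypothetical policy that names one AI crawler and closes every path to it.
gated = "User-agent: GPTBot\nDisallow: /\n"

print(may_fetch(no_policy, "GPTBot"))  # True: no rule, so the path is open
print(may_fetch(gated, "GPTBot"))      # False: the named bot is disallowed
print(may_fetch(gated, "CCBot"))       # True: a bot the file never names stays open
```

The same parser also treats a 404 on robots.txt as allow-all when it fetches the file itself through `read()`, which matches the reading used here.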

This is why the denominator matters. We report 3 blockers against the 8 sites that returned a parseable robots.txt, not against all 10 checked. The two sites without a file are not "allowers" in the sense of having declared anything; they are sites that took no position. Reading the rate as 3 of 8 keeps the comparison apples-to-apples with categories where coverage is fuller, like Golf, which also returned 8 parseable files.
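The arithmetic is small enough to spell out. The sketch below uses a hypothetical helper, `block_rate`, to show how the choice of denominator moves the headline number; the counts are the ones reported in this article.

```python
def block_rate(blockers: int, denominator: int) -> float:
    """Blockers as a percentage of the chosen denominator."""
    if denominator == 0:
        raise ValueError("denominator must be greater than zero")
    return 100 * blockers / denominator


checked, parseable, blockers = 10, 8, 3

# Reported rate: blockers over sites that returned a parseable robots.txt.
print(f"{block_rate(blockers, parseable):.1f}%")  # 37.5%

# What the rate would read if the two no-file sites were counted as open.
print(f"{block_rate(blockers, checked):.1f}%")    # 30.0%
```

Counting the two no-file sites would pull Whiskey from 37.5% down to 30.0%, below the 31.9% corpus line, without any site having changed its policy.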

How Whiskey Compares to Its Nearest Neighbors

Whiskey sits in a crowded band at 37.5%, sharing the exact rate with several categories directly around it. Below is a focused window of Whiskey and its closest neighbors in the ranking. For a vertical well below this line, see the Skiing report; for one at the floor, the Tea report.

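| Category | Sites With robots.txt | Sites Blocking | Block Rate |
| --- | --- | --- | --- |
| HomeGarden | 9 | 4 | 44.4% |
| Fashion | 7 | 3 | 42.9% |
| Jobs | 8 | 3 | 37.5% |
| Aviation | 8 | 3 | 37.5% |
| Comics | 8 | 3 | 37.5% |
| Whiskey | 8 | 3 | 37.5% |
| Golf | 8 | 3 | 37.5% |
| Travel | 9 | 3 | 33.3% |
| Wine | 9 | 3 | 33.3% |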

Whiskey and Golf land on identical numbers — 3 of 8 blocking. Tellingly, the adjacent beverage vertical, Wine, sits just below at 33.3%, while Tea is far down at the floor. Even within drinks, the willingness to gate AI varies with how editorial the category is.
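The ties in that window can be reproduced from the raw counts. The sketch below groups categories by exact fraction rather than by rounded percentage, so two categories tie only when their counts give the same ratio. The function name `group_by_rate` is an assumption; the rows are copied from the neighbor table.

```python
from fractions import Fraction

# (category, sites with robots.txt, sites blocking), from the neighbor table.
ROWS = [
    ("HomeGarden", 9, 4),
    ("Fashion", 7, 3),
    ("Jobs", 8, 3),
    ("Aviation", 8, 3),
    ("Comics", 8, 3),
    ("Whiskey", 8, 3),
    ("Golf", 8, 3),
    ("Travel", 9, 3),
    ("Wine", 9, 3),
]


def group_by_rate(rows):
    """Group categories by exact block rate, highest rate first."""
    groups: dict[Fraction, list[str]] = {}
    for category, parseable, blocking in rows:
        groups.setdefault(Fraction(blocking, parseable), []).append(category)
    return [
        (f"{100 * float(rate):.1f}%", names)
        for rate, names in sorted(groups.items(), reverse=True)
    ]


for label, names in group_by_rate(ROWS):
    print(label, ", ".join(names))
# 44.4% HomeGarden
# 42.9% Fashion
# 37.5% Jobs, Aviation, Comics, Whiskey, Golf
# 33.3% Travel, Wine
```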

Whiskey and Golf post the same reading: 3 of 8 sites blocking, a 37.5% rate.

Which Bots Get Disallowed Most Across the Corpus

Zooming out, the whole snapshot covers 725 sites. Of those, 614 returned a parseable robots.txt, and 141 of the 614 (23%) also published an llms.txt. Across that corpus, 196 of 614 sites block at least one AI crawler, which is the 31.9% line Whiskey clears. The disallows cluster on a familiar set of bots.

| AI Bot | Sites Disallowing (all 614 sites) |
| --- | --- |
| CCBot | 145 |
| ClaudeBot | 124 |
| GPTBot | 121 |
| Bytespider | 118 |
| Meta-ExternalAgent | 105 |

When a publisher like whiskyadvocate.com decides to gate, these are the bots it most likely names first. CCBot, Common Crawl's crawler, tops the list at 145 sites, plausibly because so many models train on its archive. ClaudeBot (Anthropic) and GPTBot (OpenAI) follow closely at 124 and 121 sites. They are the two best-known crawlers run by assistant makers, and a content owner protecting its archive would name them early.
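To make "names a bot" concrete, the sketch below checks one robots.txt against the five bots in the table. It is an illustration under stated assumptions: the policy text is invented rather than copied from any site in the sample, `blocked_bots` is a hypothetical helper, and a bot counts as blocked when it may not fetch the root path, which may be narrower or broader than the rule the snapshot applied.

```python
from urllib.robotparser import RobotFileParser

# The five most-disallowed bots across the corpus, per the table.
TRACKED_BOTS = ["CCBot", "ClaudeBot", "GPTBot", "Bytespider", "Meta-ExternalAgent"]


def blocked_bots(robots_txt: str, bots=TRACKED_BOTS) -> list[str]:
    """Return the tracked bots that may not fetch "/" under this policy."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, "/")]


# Invented example policy: two AI crawlers are shut out entirely, while
# every other agent is only kept away from an admin path.
example = """\
User-agent: CCBot
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_bots(example))  # ['CCBot', 'GPTBot']
print(blocked_bots(""))       # []
```

A site whose list comes back non-empty would count as "blocks at least one AI crawler" under this assumed definition.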

Corpus-wide, 196 of 614 sites block at least one AI crawler.

The 23% llms.txt rate adds nuance. Under the llms.txt proposal, a site publishes a Markdown file at its root that gives language models a curated overview of its content and links to the pages worth reading. That is a different job from robots.txt's blunt allow-or-block. Of the 614 sites with a robots.txt, 141 published one. A Whiskey publisher could plausibly use both files: robots.txt to block the harvesters, llms.txt to steer assistants toward the pages it does want read.

How the Snapshot Was Sealed

The method is designed to be checkable. We fetched each site's public robots.txt from its root, parsed the disallow directives, and recorded which AI user-agents each site gates. The full capture is content-hashed into one sha256-sealed snapshot, sha 77d0521dc8809a6c, dated 14 June 2026, so the figures are fixed and citable rather than a moving live read.
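The sealing step can be illustrated with a short sketch. The exact serialization behind the published hash is not described here, so everything below is an assumption: a capture is modeled as a mapping from site to raw robots.txt text, encoded as canonical JSON with sorted keys, and hashed with sha256. The site names and file bodies are placeholders.

```python
import hashlib
import json


def seal(capture: dict[str, str]) -> str:
    """Return a sha256 hex digest over a canonical JSON encoding of the capture."""
    canonical = json.dumps(
        capture, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Placeholder capture: site -> raw robots.txt text (invented for illustration).
capture = {
    "example-publisher.test": "User-agent: GPTBot\nDisallow: /\n",
    "example-shop.test": "User-agent: *\nDisallow: /cart/\n",
}

digest = seal(capture)
print(len(digest))  # 64 hex characters

# Key order does not change the seal, because keys are sorted before hashing.
reordered = dict(reversed(list(capture.items())))
print(seal(reordered) == digest)  # True

# Any change to a captured file produces a different seal.
capture["example-shop.test"] += "User-agent: CCBot\nDisallow: /\n"
print(seal(capture) == digest)  # False
```

Sorting keys before hashing is the design choice that matters: it makes the digest depend on what was captured, not on the order in which sites were fetched.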

For Whiskey, nothing is estimated, modeled, or extrapolated: the 3 blockers, the 8 parseable files, and the 37.5% rate are verbatim counts. Two honest caveats apply. robots.txt is a stated policy, not enforcement — a non-compliant bot can ignore it. And this is a 10-site sample of prominent Whiskey properties, not a census; the value of re-sealing is to catch the day an open site closes or a blocker reopens.

Key Takeaways

  • Of 10 Whiskey sites checked, 8 returned a robots.txt and 3 block at least one AI crawler — a 37.5% rate.

  • The three blockers — whiskyadvocate.com, vinepair.com, diffordsguide.com — are all editorial properties.

  • Marketplaces and discovery tools like flaviar.com and distiller.com stay open.

  • Whiskey clears the 31.9% corpus-wide block rate and ties Golf at 3 of 8.

  • Across all 614 sites, CCBot is the most-disallowed bot at 145 sites.

Frequently Asked Questions

Q: Does blocking a crawler in robots.txt actually stop it?

A: No. robots.txt is an honor-system standard; compliant crawlers respect it, but it is a request, not enforcement. We report what each site declares, not whether every bot obeys.

Q: Why do the Whiskey blockers skew editorial?

A: All 3 blockers — whiskyadvocate.com, vinepair.com, diffordsguide.com — are publications whose archives are their product. Gating AI protects the written content that drives their traffic, while commerce sites gain from staying discoverable.

Q: What does it mean that 2 Whiskey sites had no robots.txt?

A: masterofmalt.com and breakingbourbon.com published no robots.txt at all. A missing file is not a block — compliant crawlers treat the absence of a rule as permission to fetch.

Q: How would an analyst act on this snapshot?

A: Treat the 37.5% rate as a baseline and watch for drift. If an open site like caskers.com later adds a disallow rule for an AI crawler, that change, not the static count, is the actionable signal.

Q: Why does Whiskey gate more than Wine or Tea?

A: Whiskey blocks at 37.5%, Wine at 33.3%, and Tea at 0%. The gradient tracks how editorial each category is. Spirits carry a dense layer of review publications whose archives are their product, and those are the sites that gate.

Q: Which AI crawlers do the Whiskey blockers most likely name?

A: The bot counts in this report are corpus-wide, not per site. Across all 614 sites, the most-disallowed bots are CCBot at 145 sites, then ClaudeBot at 124 and GPTBot at 121. A publisher like whiskyadvocate.com protecting its archive would name the harvesters and major assistant crawlers first.

Put AI-Access Data to Work

A whiskey-marketplace catalog manager can use this reading to protect discovery: re-crawl the 8 Whiskey sites weekly and alert the moment a peer marketplace like thewhiskyexchange.com or flaviar.com adds a disallow rule for an AI crawler on a previously open path. Losing assistant visibility quietly costs consideration-stage traffic. A competitive-intelligence lead at a spirits publisher can track whether rivals such as vinepair.com tighten or relax their AI gating over time, treating each new disallow as a read on where the category thinks the value of its archive is heading.

An AI retrieval product manager building a spirits recommender has the inverse job: confirm that the open sources it depends on — distiller.com, caskers.com — stay fetchable before wiring them into a pipeline, and get alerted the day one closes so the recommender does not silently lose a source. In all three cases the static 37.5% is just the anchor; the value is catching drift from it the day it happens.
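Catching drift comes down to comparing two readings. The sketch below assumes each reading is a mapping from site to the set of AI bots that site blocks; the function name `diff_policies`, the site names, and the bot sets are all invented for illustration and do not describe any site in this report.

```python
def diff_policies(
    before: dict[str, set[str]], after: dict[str, set[str]]
) -> list[str]:
    """Describe every bot that became blocked or unblocked between two readings."""
    changes = []
    for site in sorted(before.keys() | after.keys()):
        old = before.get(site, set())
        new = after.get(site, set())
        for bot in sorted(new - old):
            changes.append(f"{site}: now blocks {bot}")
        for bot in sorted(old - new):
            changes.append(f"{site}: no longer blocks {bot}")
    return changes


# Invented readings: an open shop closes to one bot, a publisher reopens to one.
last_week = {"open-shop.test": set(), "review-mag.test": {"GPTBot", "CCBot"}}
this_week = {"open-shop.test": {"GPTBot"}, "review-mag.test": {"CCBot"}}

for line in diff_policies(last_week, this_week):
    print(line)
# open-shop.test: now blocks GPTBot
# review-mag.test: no longer blocks GPTBot
```

An empty result means the baseline still holds; any line in the output is the kind of change worth an alert.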

US Tech Automations automates that watch as a scheduled job: recurring robots.txt and llms.txt crawls, change alerts, and an AI-access policy dashboard that flags every new disallow token. See how the agentic workflow platform runs it.

Every figure here comes from the same sealed-snapshot discipline US Tech Automations applies across this edition — nothing is estimated, modeled, or extrapolated; the counts are verbatim from public robots.txt files. Compare the open end of the spectrum in the Tea report.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 14, 2026 (snapshot sha 77d0521dc8809a6c).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Whiskey Sites Block AI Crawlers? 3 of 8 Do.” https://ustechautomations.com/resources/blog/do-whiskey-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 77d0521dc8809a6c

About the Author

Garrett Mullins
Workflow Specialist

Helping businesses leverage automation for operational efficiency.