Do Furniture Sites Block AI Crawlers? 1 of 7 Do

Jun 19, 2026

Furniture brands have spent the last decade moving the showroom online — high-resolution photography, room planners, fabric swatches, delivery calculators. That catalog is the whole pitch, and the AI assistants now answering "best sofa for a small apartment" read it the same way a shopper does. So the question of whether furniture sites wall off those answer engines is really a question about how much of the catalog the category wants found.

1 of 7 furniture sites blocks at least one AI crawler.

Of the furniture domains we checked, 7 returned a parseable robots.txt — the root-level file that tells automated agents which paths they may fetch — and a single one of those disallows an AI crawler. That works out to a 14.3% block rate. Every figure here is read straight from the sealed snapshot; nothing is estimated, modeled, or extrapolated.

The lone blocker is ashleyfurniture.com, and even it is a narrow gate rather than a wall. Against the corpus, where 317 of the 1203 sites with a robots.txt gate at least one crawler (a 26.4% rate), furniture sits well under the average and is one of the more open retail categories in this edition.

The One Furniture Site That Gates, and the Six That Do Not

What makes furniture distinctive is not how many block, but how little the single blocker actually closes. ashleyfurniture.com is the only gate in the set, and its robots.txt names exactly one agent: FacebookBot. That is Meta's link-preview and social crawler, not one of the high-volume training crawlers like GPTBot or CCBot. Ashley's file leaves OpenAI, Anthropic, Common Crawl, and the rest of the answer-engine leaderboard untouched. In practice, the category's busiest national retailer is still readable by the AI assistants its customers are most likely to ask.

The open furniture sites are a cross-section of the modern direct-to-consumer and showroom world: westelm.com, article.com, roomandboard.com, potterybarn.com, joybird.com, and interiordefine.com. None of them disallows an AI agent — the same wide-open stance as the aquarium category, where not one site gates a crawler. A furniture brand runs on discovery — the product page, the room inspiration, the dimensions and materials are meant to be found, cited, and surfaced, including by an AI assistant fielding a question about a sectional or a dining table.

The only furniture blocker in the set is ashleyfurniture.com, and it names just FacebookBot.

Three furniture domains — cb2.com, crateandbarrel.com, and floyddetroit.com — returned no parseable robots.txt at the seal. They are therefore silent: neither an allow nor a block, and excluded from the rate entirely. That is why the denominator is 7 rather than the 10 sites we checked. It would be wrong to read silence as a stance; it is an artifact of how a host answered at one moment in time.
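The denominator rule above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the pipeline that produced the snapshot: the `block_rate` helper and the True/False/None encoding are assumptions made for the example, while the domains and outcomes are the ones named in the text.

```python
from typing import Optional

# Outcome per domain at seal time, as reported in the text:
#   True  = parseable robots.txt that disallows a named AI crawler
#   False = parseable robots.txt that names no AI crawler
#   None  = no parseable robots.txt (silent, excluded from the rate)
furniture: dict[str, Optional[bool]] = {
    "ashleyfurniture.com": True,
    "westelm.com": False,
    "article.com": False,
    "roomandboard.com": False,
    "potterybarn.com": False,
    "joybird.com": False,
    "interiordefine.com": False,
    "cb2.com": None,
    "crateandbarrel.com": None,
    "floyddetroit.com": None,
}


def block_rate(results: dict[str, Optional[bool]]) -> tuple[int, int, float]:
    """Return (blockers, policied sites, block rate in percent)."""
    # Silent hosts are dropped before counting, so they shrink the denominator.
    policied = [outcome for outcome in results.values() if outcome is not None]
    blockers = sum(1 for outcome in policied if outcome)
    rate = 100 * blockers / len(policied) if policied else 0.0
    return blockers, len(policied), round(rate, 1)


blockers, policied, rate = block_rate(furniture)
print(f"{blockers} of {policied} policied sites = {rate}%")  # 1 of 7 policied sites = 14.3%
```

Counting the three silent hosts as allows would instead give 1 of 10, or 10.0%, which is the reading the method deliberately avoids.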

What This 14.3% Block Rate Actually Means

A building permit is a public record; a robots.txt directive is a public request — and the furniture read is almost entirely "request granted." The honest interpretation is that, as a category, furniture brands behave far more like open publishers than like data fortresses. The catalogs they have spent years photographing and describing are an outreach asset rather than a competitive moat, so keeping them readable by retrieval agents extends reach rather than threatening it.

Ashley is the instructive exception, and even it is a soft one. Naming FacebookBot is a social-and-preview decision, not an anti-AI-training stance — the kind of directive a large retailer adds for reasons that have little to do with answer engines. That single narrow gate is the entire furniture block rate. In a seven-file sample, one partial blocker is enough to put a number on the board, and it lands the category at 14.3%.

The small sample sharpens this rather than weakening it. With seven policied files out of ten named brands, the read is really a story about one limited decision at ashleyfurniture.com. The finding: in furniture, AI-access posture is not set by a broad wave of gating but by whether the largest retailers choose to close, and here the largest one barely did.

Furniture sites post a 14.3% AI-crawler block rate.

This is a different shape of story than the most-gated categories in the edition. Where gaming sites overwhelmingly block AI crawlers because their content is the product, furniture treats its catalog as a reason to be visited. The contrast is the point: a 26.4% corpus average hides categories that range from catalog-as-outreach to data-as-asset, and furniture sits firmly on the outreach side.

Where Furniture Sits Among Similar Categories

A 14.3% block rate places Furniture in the lower-middle of the ranking — open, but not at the zero-block floor. The focused window below shows Furniture beside its nearest neighbors, verbatim from the sealed snapshot, name first and no rank column.

Category | Sites | With robots.txt | Block at least 1 crawler | Block rate
Pharma | 9 | 8 | 1 | 12.5%
Cigars | 10 | 7 | 1 | 14.3%
Education | 9 | 7 | 1 | 14.3%
Grocery | 10 | 7 | 1 | 14.3%
Luggage | 10 | 7 | 1 | 14.3%
Sailing | 7 | 7 | 1 | 14.3%

Furniture shares its 14.3% reading with a broad, unglamorous band — Cigars, Education, Grocery, Luggage, and Sailing all land on the same single-blocker mark. It is a crowded part of the ranking, which is itself a sign that one-in-seven is a common posture: most sites in these categories want to be readable. The extremes show what the ends look like:

Category | Sites | With robots.txt | Block at least 1 crawler | Block rate
Gaming | 9 | 9 | 8 | 88.9%
News | 20 | 17 | 14 | 82.4%
Hotels | 10 | 3 | 0 | 0%
FastFood | 10 | 6 | 0 | 0%

Furniture sits far below Gaming and News, and a notch above the zero-block floor that hotel chains define with their open policies. The category is open by disposition, barely gated by exception.

The Bots the Corpus Reaches For First

The single furniture blocker names only FacebookBot, so the more useful context is which bots get gated most broadly across the whole edition — the tokens a site names first when it does decide to close. The cut below shows the most-disallowed bots across all 1203 sites with a robots.txt, bot name first, count next.

Bot | Sites disallowing (of 1203) | Rate
CCBot | 234 | 19.5%
GPTBot | 210 | 17.5%
ClaudeBot | 207 | 17.2%
Bytespider | 203 | 16.9%
Meta-ExternalAgent | 178 | 14.8%

CCBot, Common Crawl's agent, tops the corpus blocklist at 234 sites, with GPTBot and ClaudeBot close behind. Notably, ashleyfurniture.com names none of these — it gates FacebookBot, a social crawler that does not even make the corpus top five. That gap is the whole furniture story: the one blocker in the category is not gating the high-volume training crawlers the rest of the corpus gates first.
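For readers who want to check the leaderboard percentages, the short Python sketch below recomputes each rate from the published counts. The counts and the 1203 denominator come from the table and text above; the variable names are just for illustration.

```python
# Sites with a robots.txt in the corpus, per the text.
TOTAL_POLICIED = 1203

# Published per-bot disallow counts from the table above.
disallow_counts = {
    "CCBot": 234,
    "GPTBot": 210,
    "ClaudeBot": 207,
    "Bytespider": 203,
    "Meta-ExternalAgent": 178,
}

# Rate = sites disallowing the bot / sites with a robots.txt, in percent.
rates = {bot: round(100 * n / TOTAL_POLICIED, 1) for bot, n in disallow_counts.items()}

# The corpus-wide "at least one crawler" figure uses the same denominator.
corpus_rate = round(100 * 317 / TOTAL_POLICIED, 1)

for bot, pct in rates.items():
    print(f"{bot}: {pct}%")
print(f"Any AI crawler: {corpus_rate}%")
```

The per-bot counts overlap, since one site can name several bots, which is why they do not sum to the 317 sites that block at least one crawler.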

Corpus-wide, 317 of 1203 sites block at least one AI crawler.

How the Furniture Snapshot Was Sealed

These figures come from one point-in-time crawl of public robots.txt files, sealed June 19, 2026 under snapshot sha 040215878ac7b85a. For each furniture domain we fetched robots.txt at the root, parsed its user-agent and disallow directives, and recorded whether any AI crawler token was disallowed. We report verbatim counts; nothing is estimated, modeled, or extrapolated. The three domains with no parseable file — cb2.com, crateandbarrel.com, and floyddetroit.com — are logged as silent, neither allow nor block.

The counting rule is deliberately narrow. A block is an explicit Disallow aimed at a named agent — FacebookBot at Ashley, or any of the GPTBot, ClaudeBot, CCBot leaderboard tokens elsewhere. A furniture site can disallow cart, search, or account paths without naming such an agent, and that does not count as an AI block here. Only a directive that names one moves a site into the blocker column, which is why the furniture count is a clean 1: ashleyfurniture.com names FacebookBot, the rest name nothing.
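One way to read the counting rule is sketched below in Python. It is a simplified illustration, not the parser behind the snapshot: the `blocked_ai_agents` function, the `AI_TOKENS` list, and both sample files are hypothetical, and treating an empty `Disallow:` as "no block" is an interpretation of the standard rather than something the text spells out.

```python
# Illustrative token list (an assumption); the edition's full list is not given in the text.
AI_TOKENS = {"gptbot", "claudebot", "ccbot", "bytespider",
             "meta-externalagent", "facebookbot"}


def blocked_ai_agents(robots_txt: str, tokens: set[str] = AI_TOKENS) -> set[str]:
    """Return the named AI agents that have at least one non-empty Disallow."""
    blocked: set[str] = set()
    group_agents: list[str] = []  # user-agents named by the current group
    in_rules = False              # True once the current group has a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                group_agents, in_rules = [], False
            group_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            # Only a non-empty Disallow under a *named* AI agent counts.
            if field == "disallow" and value:
                blocked.update(a for a in group_agents if a in tokens)
    return blocked


# Hypothetical files: one gates a named agent, one only fences off site paths.
gated = "User-agent: *\nDisallow: /cart\n\nUser-agent: FacebookBot\nDisallow: /example-path/\n"
open_site = "User-agent: *\nDisallow: /cart\nDisallow: /account\n"

print(blocked_ai_agents(gated))      # {'facebookbot'}
print(blocked_ai_agents(open_site))  # set()
```

Under this reading, a wildcard `User-agent: *` group never moves a site into the blocker column, because it names no AI agent.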

Each furniture domain is read once, at seal time, exactly as it answered. That single-read rule is what makes the result content-addressable: anyone holding sha 040215878ac7b85a can re-derive the same seven policied files and the same one blocker. The cost is that three silent hosts land in the excluded bucket rather than the allow column — the method favors reproducibility over a generous reading.
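The idea of content addressing can be shown with a small Python sketch. The text does not say how the snapshot sha was computed, so everything here is an assumption: the `seal` function, the canonical-JSON serialization, the sample file contents, and the truncation to 16 hex characters are all hypothetical and chosen only to demonstrate the property.

```python
import hashlib
import json
from typing import Optional


def seal(snapshot: dict[str, Optional[str]]) -> str:
    """Return a SHA-256 hex digest of a snapshot in canonical form."""
    # Sorted keys and fixed separators make the bytes independent of
    # insertion order, so identical content always hashes identically.
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical snapshot: domain -> raw robots.txt text, or None when silent.
first_read = {
    "ashleyfurniture.com": "User-agent: FacebookBot\nDisallow: /example-path/\n",
    "westelm.com": "User-agent: *\nDisallow: /cart\n",
    "cb2.com": None,
}
same_read_other_order = dict(reversed(list(first_read.items())))
later_read = {**first_read, "cb2.com": "User-agent: *\nDisallow:\n"}

print(seal(first_read) == seal(same_read_other_order))  # True
print(seal(first_read) == seal(later_read))             # False
print(seal(first_read)[:16])  # short identifier; the truncation is an assumption
```

The second comparison is the reason for the single-read rule: re-fetching a host that answers differently later would change the bytes and therefore the identifier.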

Frequently Asked Questions

Q: Which furniture site blocks AI crawlers?

A: ashleyfurniture.com, and only narrowly. It is the only one of the 7 furniture sites with a parseable robots.txt that disallows a named crawler, but the agent it names is FacebookBot (Meta's social and link-preview bot), not a high-volume training crawler like GPTBot or CCBot. That single gate is the entire 14.3% block rate.

Q: Why do furniture brands leave AI crawlers in?

A: Discovery. westelm.com, article.com, roomandboard.com, potterybarn.com, joybird.com, and interiordefine.com all run on public reach — their product pages, dimensions, and room inspiration are meant to be found and cited, including by AI assistants. For a brand whose catalog is the pitch, being readable extends the pitch rather than threatening it.

Q: Does the 14.3% rate cover all the furniture sites you found?

A: No. It covers the 7 sites that returned a parseable robots.txt. Three more — cb2.com, crateandbarrel.com, and floyddetroit.com — produced no parseable file at the seal, so they are excluded from the rate rather than counted as an allow or a block. Two of the open sites, westelm.com and potterybarn.com, additionally serve an llms.txt file.

Q: Does a Disallow in robots.txt actually stop an AI crawler?

A: Not by force. robots.txt is an honor-system standard: a cooperative crawler reads it and complies, but the file enforces nothing technically. ashleyfurniture.com signals that FacebookBot should stay out of certain paths; each crawler decides whether to honor that request.
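The honor system is easiest to see from the crawler's side. The Python sketch below uses the standard library's `urllib.robotparser` to show the check a cooperative crawler runs before fetching; the sample file, the `example.com` URLs, and the `may_fetch` helper are hypothetical, and the actual contents of Ashley's file are not reproduced here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named agent is kept out of one path.
SAMPLE = """\
User-agent: FacebookBot
Disallow: /private/

User-agent: *
Disallow: /cart
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())  # parse from text; no network request


def may_fetch(agent: str, url: str) -> bool:
    """The check a cooperative crawler performs before requesting a URL."""
    return parser.can_fetch(agent, url)


print(may_fetch("FacebookBot", "https://example.com/private/page"))  # False
print(may_fetch("FacebookBot", "https://example.com/sofas"))         # True
print(may_fetch("GPTBot", "https://example.com/private/page"))       # True
print(may_fetch("GPTBot", "https://example.com/cart"))               # False
```

Nothing on the server enforces these answers. A crawler that skips the check can still send the request, which is why a Disallow is a request rather than a barrier.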

Put AI-Access Data to Work

For a furniture brand or e-commerce owner — the person who owns how a catalog appears online — this snapshot is a baseline worth watching. Most peers stay fully open while Ashley gates only a social bot, and that mix can shift the week a new legal or marketing policy lands.

The risk that should keep a brand owner up at night is the accidental one: a blanket Disallow added during a site migration that quietly walls off the very answer engines customers now ask "which sofa fits a small living room?" US Tech Automations runs scheduled robots.txt crawls with change alerts, so a self-inflicted block surfaces the week it lands rather than after a quarter of lost AI-answer visibility.

A second fit is an AI-search or GEO analyst tracking which retail peers remain eligible to surface in answer engines. Their job is to know, continuously, whether the catalog pages they rely on are still readable, and whether a cb2.com-style silence is a timeout or a hardening stance. US Tech Automations monitors that drift across a watchlist of domains with agentic monitoring and routes the alert when a brand flips, so the analyst is not re-checking files by hand. See how the agentic monitoring works, and you have a standing read on furniture AI-access posture instead of a one-time count.

Corpus-wide, 330 of 1203 sites publish an llms.txt file.

Key Takeaways

  • Of the 7 Furniture sites with a parseable robots.txt, 1 blocks at least one AI crawler — a 14.3% rate, well below the corpus average.

  • The only blocker is ashleyfurniture.com, and it names just FacebookBot — a social crawler, not a high-volume training bot like GPTBot or CCBot.

  • The open furniture sites — westelm.com, article.com, roomandboard.com, potterybarn.com, joybird.com, and interiordefine.com — allow every crawler, and two of them serve an llms.txt file.

  • cb2.com, crateandbarrel.com, and floyddetroit.com returned no parseable file at the seal and are excluded from the rate.

  • Corpus-wide, 317 of 1203 sites (26.4%) gate at least one crawler, so furniture sits well under the average.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 19, 2026 (snapshot sha 040215878ac7b85a).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Furniture Sites Block AI Crawlers? 1 of 7 Do.” https://ustechautomations.com/resources/blog/do-furniture-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 040215878ac7b85a

About the Author

Garrett Mullins
Workflow Specialist

Helping businesses leverage automation for operational efficiency.