Do Staffing Sites Block AI Crawlers? Zero of 11 Do

Jun 21, 2026

A hiring manager who asks an AI assistant which staffing firm specializes in light-industrial temp workers in a given metro, or a job seeker who asks which agency places entry-level finance roles, is not running a Boolean job-board search. Both are querying a retrieval agent that reads the open web before it answers. Staffing firms live or die on matchmaking visibility: a firm whose pages those agents cannot read drops out of a growing share of these queries.

The robots.txt files of the world's largest staffing agencies and their industry associations reflect that reality exactly.

Zero of 11 Staffing sites with a parseable robots.txt block any AI crawler.

Of the 12 staffing domains we checked, 11 returned a parseable robots.txt — the root-level file that tells an automated agent which paths it may fetch — and not one of them disallows a named AI crawler. That is a 0% block rate. Every figure here is read straight from the sealed snapshot; nothing is estimated, modeled, or extrapolated.

There is no holdout to single out: no global giant or niche boutique gates AI crawlers while the rest stay open. Every staffing site with a published policy leaves every named bot in. Against the full corpus, where 318 of 1247 sites with a policy gate at least one crawler (a 25.5% rate), staffing sits at the floor, among the categories that block nothing at all.

What an Open Policy Means for Staffing

A robots.txt directive is a public request to crawlers, and staffing sites make no such request of any named AI agent. The most plausible interpretation is that these firms behave like lead-generation businesses, not content publishers. The asset a Randstad or an Aerotek protects is its candidate database and client relationships, not the marketing content on its website, which exists specifically to be found, cited, and converted into an inquiry or an application.

That logic is the opposite of a category like Tech, where developer platforms and editorial publications wall off content they have invested heavily in creating. A staffing firm's job-family landing pages, industry specialization content, and employer-of-record guides are acquisition assets. Walling them off from an answer engine would hide the very pages that route a workforce buyer to a first conversation.

Every policied staffing site in the set allows every named AI crawler.

A zero-block category is a cleaner signal than a one-blocker category. When a single site gates, the number hinges on one decision that could flip next quarter. When no site gates, the posture is a shared norm, and in staffing that norm is uniform openness. For a category that has spent decades competing on being findable and responsive, that norm makes business sense.

The Eleven Staffing Sites With a Public Policy

The 11 domains with a readable file span the global staffing industry: adecco.com, manpowergroup.com, randstad.com, roberthalf.com, kellyservices.com, aerotek.com, spherion.com, americanstaffing.net, expresspros.com, trueblue.com, and staffmark.com. The list covers global giants, U.S. specialty firms, the American Staffing Association, and regional chains, and every one of them allows GPTBot, ClaudeBot, CCBot, Bytespider, and the rest of the named agents to read its pages.

That uniformity spans the full spectrum, from the largest global enterprises to regional workforce franchises. At every scale, the published policy treats AI-readable content as a business asset, not a liability.

One domain — staffingindustry.com — returned no parseable robots.txt at the seal. It is therefore silent: neither an allow nor a block, and excluded from the rate entirely. That is why the denominator is 11 rather than the 12 sites we checked.
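The denominator rule can be sketched in a few lines of Python. This is an illustration only: the status labels (`allow_all`, `blocks_ai`, `silent`) and the function name `block_rate` are assumptions made for the example, not the snapshot's actual schema. The domains and their outcomes are the ones reported above.

```python
# Illustrative sketch. Status labels and names are assumptions for this
# example, not the snapshot's real schema.
STAFFING = {
    "adecco.com": "allow_all",
    "manpowergroup.com": "allow_all",
    "randstad.com": "allow_all",
    "roberthalf.com": "allow_all",
    "kellyservices.com": "allow_all",
    "aerotek.com": "allow_all",
    "spherion.com": "allow_all",
    "americanstaffing.net": "allow_all",
    "expresspros.com": "allow_all",
    "trueblue.com": "allow_all",
    "staffmark.com": "allow_all",
    "staffingindustry.com": "silent",  # no parseable robots.txt at the seal
}


def block_rate(statuses):
    """Return (blockers, policied, rate_percent); silent domains are excluded."""
    policied = [s for s in statuses.values() if s != "silent"]
    blockers = sum(1 for s in policied if s == "blocks_ai")
    rate = 100.0 * blockers / len(policied) if policied else None
    return blockers, len(policied), rate


blockers, policied, rate = block_rate(STAFFING)
print(f"{blockers} of {policied} policied sites block an AI crawler ({rate:.1f}%)")
```

Counting the silent domain as an allow would not change the 0% rate here, but it would change the denominator from 11 to 12, which is why the exclusion is applied before the division.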

Where Staffing Lands Against Other Categories

A 0% block rate places staffing at the zero-block floor of the ranking — wide open, with company. The focused window below shows selected categories near the floor beside staffing, verbatim from the sealed snapshot.

Category      Sites  With robots.txt  Block at least 1 crawler  Block rate
Veterinary    12     8                0                         0%
MedSpa        12     10               0                         0%
Construction  10     6                0                         0%

Staffing shares its zero with veterinary, med-spa, and construction — service verticals whose websites function as discovery and demand-generation channels. The contrast with the high-block end of the ranking shows where the divide runs.

Category  Sites  With robots.txt  Block at least 1 crawler  Block rate
News      20     17               14                        82.4%
Tech      15     13               9                         69.2%
HR        10     9                2                         22.2%

Staffing posts a 0% AI-crawler block rate.

The comparison with HR is notable. The broader HR-software and professional-association category blocks at 22.2%, reflecting a mix of content types that includes proprietary research, member-only guidance, and software documentation. Staffing — a category whose content is primarily job-family marketing and employer-of-record explanations — blocks at 0%. The content purpose, not the workforce adjacency, determines the posture.

Which Crawlers the Rest of the Web Blocks First

No staffing site gates a single bot, so the useful context here is corpus-wide: which agents get disallowed most broadly when a site does decide to close. The cut below shows the most-disallowed bots across all 1247 sites with a robots.txt, bot name first, count next.

Bot                 Sites disallowing (of 1247)  Rate
CCBot               234                          18.8%
GPTBot              211                          16.9%
ClaudeBot           207                          16.6%
Bytespider          203                          16.3%
Meta-ExternalAgent  178                          14.3%

CCBot, Common Crawl's agent, tops the corpus blocklist at 234 sites, with GPTBot and ClaudeBot just behind. Staffing names none of these — every token the broader web gates first is allowed across the category. The bots that other industries shut out are precisely the bots the staffing sites leave in, which is the whole story of a zero-block category in one table.
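The rates in the bot table follow directly from the counts over the 1247 policied sites. A short Python check, using only the figures printed in this report (the variable and function names are illustrative):

```python
# Counts are copied from the report; names are illustrative.
POLICIED_SITES = 1247
DISALLOW_COUNTS = {
    "CCBot": 234,
    "GPTBot": 211,
    "ClaudeBot": 207,
    "Bytespider": 203,
    "Meta-ExternalAgent": 178,
}


def rate(count, total=POLICIED_SITES):
    """Percentage of policied sites, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Most-disallowed bot first, matching the table's ordering.
ranked = sorted(DISALLOW_COUNTS.items(), key=lambda kv: kv[1], reverse=True)
for bot, count in ranked:
    print(f"{bot:<20}{count:>5}{rate(count):>7.1f}%")

# The corpus-wide figure uses the same denominator.
print(f"Any AI crawler blocked: {rate(318):.1f}%")
```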

Corpus-wide, 318 of 1247 sites block at least one AI crawler.

Corpus-wide, 343 of 1247 sites publish an llms.txt file.

Among the policied staffing sites, aerotek.com publishes llms.txt — the newer file that hands an AI agent a curated map of what to read. The ratio is lower than in some categories, which likely reflects the industry's focus on structured job listings and talent databases that live behind authenticated portals rather than public web pages.

How We Sealed the Staffing Snapshot

These figures come from one point-in-time crawl of public robots.txt files, sealed June 21, 2026 under snapshot sha 1900f057e385d393. For each staffing domain we fetched robots.txt at the root, parsed its user-agent and disallow directives, and recorded whether any AI crawler token was disallowed. We report verbatim counts; nothing is estimated, modeled, or extrapolated. The crawl spanned 1542 sites across 154 categories, of which 1247 returned a parseable file.

The counting rule is deliberately narrow. A block is an explicit Disallow aimed at a named AI agent — GPTBot, ClaudeBot, CCBot, and the other leaderboard tokens. A staffing site can disallow its applicant-tracking system, client portals, or job-board scrapers without naming an AI agent, and that does not count as an AI block here. Only a directive that names one would move a site into the blocker column, which is why the staffing count is a clean zero: none of the 11 policied files names an AI agent in a disallow group.
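A minimal sketch of that counting rule in Python follows. It is not the crawl's actual parser. The `AI_AGENTS` set holds only the tokens named in this report, the full leaderboard list is not reproduced here, and treating any non-empty Disallow path as a block is an assumption about how the rule handles partial-path rules.

```python
# Sketch of the narrow counting rule. AI_AGENTS is a partial, assumed list
# (only tokens named in this report), lowercased for matching.
AI_AGENTS = {"gptbot", "claudebot", "ccbot", "bytespider", "meta-externalagent"}


def blocked_ai_agents(robots_txt):
    """Return the named AI agents that sit in a group with a non-empty Disallow."""
    blocked = set()
    group_agents = []   # user-agent tokens of the group currently being read
    in_rules = False    # True once the current group has seen an allow/disallow
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                group_agents, in_rules = [], False
            group_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            # An empty Disallow permits everything, so it is not a block.
            if field == "disallow" and value:
                blocked.update(a for a in group_agents if a in AI_AGENTS)
    return blocked


# A wildcard group that closes a portal path names no AI agent: not a block.
print(blocked_ai_agents("User-agent: *\nDisallow: /client-portal/\n"))
# A group that names GPTBot and disallows the root is a block.
print(blocked_ai_agents("User-agent: GPTBot\nDisallow: /\n"))
```

Under this reading, a site moves into the blocker column only when the returned set is non-empty, so a wildcard-only file contributes nothing to the count.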

Each domain is read once, at seal time, exactly as it answered. That single-read rule is what makes the result content-addressable: anyone holding the sealed snapshot can verify it against sha 1900f057e385d393 and re-derive the same 11 policied files and the same zero blockers.
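One way to picture content addressing is a hash computed over the recorded responses. The sketch below is an assumption about how such a seal could work, not a description of how this snapshot was sealed: the record layout, the canonical-JSON step, and the truncation to 16 hex characters are all hypothetical choices made for the example.

```python
import hashlib
import json


def seal(records):
    """Hash a {domain: robots.txt body or None} mapping into a short hex id.

    Hypothetical scheme: canonical JSON (sorted keys, fixed separators)
    hashed with SHA-256 and truncated to 16 hex characters.
    """
    canonical = json.dumps(
        records, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Placeholder domains and bodies, not data from the snapshot.
first_read = {
    "a.example": "User-agent: *\nDisallow: /portal/\n",
    "b.example": None,  # no parseable file
}
same_read_reordered = {
    "b.example": None,
    "a.example": "User-agent: *\nDisallow: /portal/\n",
}
later_read = dict(first_read, **{"a.example": "User-agent: GPTBot\nDisallow: /\n"})

print(seal(first_read) == seal(same_read_reordered))  # identical content, same id
print(seal(first_read) == seal(later_read))           # changed content, different id
```

The point of the design is that the identifier depends only on what was read, so a second read taken later is a different snapshot with a different identifier rather than a silent revision of the first.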

Frequently Asked Questions

Q: Which staffing site blocks AI crawlers?

A: None of them. All 11 staffing sites with a parseable robots.txt — adecco.com, manpowergroup.com, randstad.com, roberthalf.com, kellyservices.com, aerotek.com, spherion.com, americanstaffing.net, expresspros.com, trueblue.com, and staffmark.com — allow every named AI crawler.

Q: Why do staffing firms leave AI crawlers in?

A: Lead generation and candidate acquisition. A hiring manager researching staffing partners, a job seeker asking which firm places roles in their field, a workforce buyer comparing employer-of-record options — all of those queries increasingly run through AI assistants. Being readable keeps the firm's specialty pages, geographic footprint, and service descriptions in front of those queries. Blocking would hide the very content that converts a search into a first conversation.

Q: Does the 0% rate cover all the staffing sites you checked?

A: No. It covers the 11 sites that returned a parseable robots.txt. One more — staffingindustry.com — produced no parseable file at the seal, so it is excluded from the rate rather than counted as an allow or a block.

Q: Does a Disallow in robots.txt actually stop an AI crawler?

A: Not by force. robots.txt is an honor-system standard: a cooperative crawler reads it and complies, but the file enforces nothing technically. Since no staffing site publishes a disallow against an AI agent, the question is moot here — every policied staffing site signals that AI agents are welcome to read its paths.
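What "cooperative" means in practice can be shown with Python's standard-library `urllib.robotparser`, which a well-behaved fetcher can consult before requesting a page. The policies, the `example.com` host, and the paths below are placeholders, not any staffing site's real file. Note that this check also obeys wildcard groups, which is broader than the named-agent counting rule used for the block rate.

```python
from urllib.robotparser import RobotFileParser


def may_fetch(robots_txt, agent, url):
    """Ask the parsed policy whether `agent` is permitted to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder policies for illustration only.
OPEN_POLICY = "User-agent: *\nDisallow: /client-portal/\n"
GATED_POLICY = OPEN_POLICY + "\nUser-agent: GPTBot\nDisallow: /\n"

page = "https://example.com/industries/light-industrial"
portal = "https://example.com/client-portal/login"

print(may_fetch(OPEN_POLICY, "GPTBot", page))      # True: no rule covers this path
print(may_fetch(OPEN_POLICY, "GPTBot", portal))    # False: wildcard rule applies
print(may_fetch(GATED_POLICY, "GPTBot", page))     # False: GPTBot is named and gated
print(may_fetch(GATED_POLICY, "ClaudeBot", page))  # True: only GPTBot is named
```

Nothing in this check prevents a crawler from skipping it, which is the sense in which the file is an honor system.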

Put AI-Access Data to Work

For a staffing firm's digital or marketing lead — the person who owns how the firm appears in client and candidate search — this snapshot is a baseline worth watching. The category gates nothing today, which means an answer engine fielding a question about workforce solutions, temp staffing, or permanent placement in a given specialty can reach your pages.

But a zero is only true at seal time: a new CMS default, a legal-compliance update, or a vendor security policy can quietly add a disallow that walls off the very answer engines your prospective clients and candidates now use. Knowing the week that happens is worth more than discovering it at the next annual site audit. US Tech Automations runs exactly that kind of scheduled robots.txt crawl with change alerts and agentic monitoring, so a policy shift surfaces the week it lands rather than at the next review.
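The core of a change alert is a comparison between two reads of the same domains. The sketch below is a generic illustration of that comparison, not the vendor's implementation; the function name, the input shape (a set of blocked agent tokens per domain), and the placeholder domains are all assumptions.

```python
def newly_blocked(previous, current):
    """Return {domain: [agents]} for agents blocked now but not in the prior read.

    Both arguments map a domain to the set of AI agent tokens it disallowed
    at that read (hypothetical input shape for this example).
    """
    alerts = {}
    for domain, now in current.items():
        added = set(now) - set(previous.get(domain, ()))
        if added:
            alerts[domain] = sorted(added)
    return alerts


# Placeholder data: one site adds a GPTBot disallow between two weekly reads.
last_week = {"staffing-a.example": set(), "staffing-b.example": set()}
this_week = {"staffing-a.example": set(), "staffing-b.example": {"gptbot"}}

print(newly_blocked(last_week, this_week))  # {'staffing-b.example': ['gptbot']}
```

Run on a schedule, a comparison like this surfaces a new disallow at the next read instead of at the next site audit.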

A second fit is an AI-search or GEO analyst tracking which B2B service categories stay eligible to surface in answer engines. Their job is to know, continuously, whether the pages a firm relies on are still readable, and whether a silent domain is a timeout or a hardening stance. US Tech Automations monitors that drift across a watchlist of competitor and partner domains and routes the alert when a site flips.

See how the agentic monitoring works, and you have a standing read on staffing AI-access posture instead of a one-time count — the same way a watcher tracks adjacent categories like the HR sites that post a 22.2% block rate or the veterinary sites that also gate nothing.

Key Takeaways

  • Of the 11 Staffing sites with a parseable robots.txt, zero block any AI crawler — a 0% rate, at the very floor of the ranking.

  • There is no blocker to name: adecco.com, manpowergroup.com, randstad.com, roberthalf.com, kellyservices.com, aerotek.com, spherion.com, americanstaffing.net, expresspros.com, trueblue.com, and staffmark.com all allow every crawler.

  • One domain — staffingindustry.com — returned no parseable file at the seal and is excluded from the rate.

  • Staffing shares the zero-block floor with Veterinary, MedSpa, and Construction, and sits far below HR (22.2%), Tech (69.2%), and News (82.4%).

  • Corpus-wide, 318 of 1247 sites (25.5%) gate at least one crawler, so staffing sits well below the average at the open end.

Source: US Tech Automations Research — Closing Web edition; figures are verbatim counts from public robots.txt files sealed June 21, 2026 (snapshot sha 1900f057e385d393).

Cite this report

US Tech Automations Research, 2026-06 edition. “Do Staffing Sites Block AI Crawlers? Zero of 11 Do.” https://ustechautomations.com/resources/blog/do-staffing-sites-block-ai-crawlers-2026

Sealed snapshot sha256: 1900f057e385d393

About the Author

Garrett Mullins
Workflow Specialist

Helping businesses leverage automation for operational efficiency.