337 AI Models Priced Daily: The USTA AI Price Index — June 2026

Jun 12, 2026

The USTA AI Price Index launches with a simple census. On June 12, 2026, a single sealed snapshot of the OpenRouter marketplace recorded 337 models posted for sale by 53 providers. Every figure in this report comes from that one day — captured once, content-hashed, and stored before any analysis was run on it.

The scope is deliberately narrow: models listed publicly on OpenRouter with their posted per-token prices, plus the Hugging Face trending list, as captured by US Tech Automations’ sealed daily AI-economics snapshots. This is a census of one marketplace’s listings, not of every AI model in existence. The index is brand new, and this edition is census-only: one day, no trends, no change claims.

The Index at a Glance

What does the posted price of intelligence look like on a single day? The headline answer is that it has no single shape. The marketplace carries everything from giveaway listings to listings priced like specialist consulting, and the launch census exists to pin that whole spread to one verifiable date rather than to a vague "as of recently."

337 models from 53 providers, priced in one sealed snapshot.

| Measure | Value (sealed June 12, 2026) |
| --- | --- |
| Models listed | 337 |
| Providers represented | 53 |
| Paid listings | 307 |
| Free listings | 26 |
| Variable-priced listings | 4 |
| Median paid prompt price (per million tokens) | $0.42 |
| Median paid completion price (per million tokens) | $1.50 |

On June 12, 2026, the median paid listing posted $0.42 per million prompt tokens and $1.50 per million completion tokens.

Two structural facts stand out before any tiering. First, paid listings dominate the census: 307 of the 337 models carry a posted price, while 26 are listed free and 4 are priced variably. Free listings on a routing marketplace are best read as demand-generation surfaces — rate-limited tiers that let developers prototype against a model before committing paid traffic — not as production capacity.

Second, the completion side is priced well above the prompt side at the median. That asymmetry is consistent with how inference works: ingesting your prompt is comparatively cheap for the provider, while generating new tokens consumes the expensive sequential compute. Buyers who only compare prompt prices routinely misbudget, because workloads that produce long outputs pay the completion rate on every generated token.
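To make the budgeting point concrete, here is a minimal sketch of per-request cost at the two census medians. The function name and the sample workload are hypothetical, and pairing the two medians is an illustration only: the median prompt price and the median completion price need not belong to the same listing.

```python
from decimal import Decimal

# Median posted prices from the June 12, 2026 census, in USD per million tokens.
MEDIAN_PROMPT_PER_M = Decimal("0.42")
MEDIAN_COMPLETION_PER_M = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(prompt_tokens: int, completion_tokens: int,
                 prompt_per_m: Decimal = MEDIAN_PROMPT_PER_M,
                 completion_per_m: Decimal = MEDIAN_COMPLETION_PER_M) -> Decimal:
    """Return the posted-price cost in USD of a single request."""
    prompt_cost = Decimal(prompt_tokens) * prompt_per_m / TOKENS_PER_MILLION
    completion_cost = Decimal(completion_tokens) * completion_per_m / TOKENS_PER_MILLION
    return prompt_cost + completion_cost


# Hypothetical workload: 1,000 prompt tokens in, 3,000 completion tokens out.
total = request_cost(1_000, 3_000)      # 0.00042 + 0.00450 = 0.00492 USD
prompt_only = request_cost(1_000, 0)    # 0.00042 USD
```

In this made-up workload the prompt side is under a tenth of the bill, which is why a budget built on prompt prices alone lands far too low for output-heavy tasks.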

The Price Tiers

To make the distribution legible, the index sorts paid models into three tiers by posted prompt price per million tokens. The cutoffs are our methodology choices, not boundaries the market draws for itself: the budget tier covers models at or below $0.50, the mid tier runs above that up to $5.00, and the frontier tier is everything more expensive.

| Tier | Definition (prompt price per million tokens) | Models | Median prompt price |
| --- | --- | --- | --- |
| Budget | At or below $0.50 | 159 | $0.15 |
| Mid | Above $0.50, up to $5.00 | 119 | $1.20 |
| Frontier | Above $5.00 | 29 | $10 |

159 of 307 paid models post prompt prices at or below $0.50.
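The tier rule is simple enough to state as code. The sketch below encodes the cutoffs exactly as described above; the function and constant names are ours for illustration and are not taken from the index's own pipeline.

```python
from decimal import Decimal

# Methodology cutoffs, in USD per million prompt tokens.
BUDGET_MAX = Decimal("0.50")   # at or below this: budget
MID_MAX = Decimal("5.00")      # above budget, up to this: mid


def price_tier(prompt_per_m: Decimal) -> str:
    """Classify a paid listing by its posted prompt price per million tokens.

    Intended for paid listings only; free and variable-priced listings
    are reported separately and should be filtered out by the caller.
    """
    if prompt_per_m <= BUDGET_MAX:
        return "budget"
    if prompt_per_m <= MID_MAX:
        return "mid"
    return "frontier"


# The three tier medians from the table each fall in their own tier.
tiers = [price_tier(Decimal(p)) for p in ("0.15", "1.20", "10")]
```

Both boundaries are inclusive on the cheaper side, so a listing at exactly $0.50 is budget and one at exactly $5.00 is mid.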

The shape of this table is the single most useful fact in the launch edition. The bulk of the paid catalog sits in the budget tier, with a median posted price of $0.15 per million prompt tokens — commodity territory, populated heavily by small and open-weight models where many providers can serve the same artifact and compete on serving efficiency alone.

The frontier tier is the opposite story: a thin band of 29 models with a median of $10. That is where scarce, differentiated capability lives — the proprietary reasoning models that only their originating labs can serve. The market's message to buyers is blunt: routine capability is nearly free at the margin, and the premium is reserved for the small set of models that can do what the commodity floor cannot.

The budget-tier median is $0.15 per million prompt tokens; the frontier-tier median is $10.

For anyone building automated workflows, the practical read is a two-speed market. Default traffic — classification, extraction, routing, summarization at volume — belongs in the budget tier, where the median price makes per-task cost a rounding error. Frontier spend should be reserved, deliberately and per task, for the steps where reasoning quality visibly changes the outcome.
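One way to see the two-speed market in numbers is a blended prompt price for a workflow that sends a small share of its tokens to the frontier tier. This is a rough sketch under stated assumptions: it uses the two tier medians as stand-ins for real model prices, ignores completion tokens entirely, and the function name and the 5% share are hypothetical.

```python
from decimal import Decimal

# Tier medians from the census, in USD per million prompt tokens.
BUDGET_MEDIAN_PER_M = Decimal("0.15")
FRONTIER_MEDIAN_PER_M = Decimal("10")


def blended_prompt_price(frontier_share: Decimal) -> Decimal:
    """Blend the two tier medians by the share of tokens routed to frontier."""
    if not (Decimal(0) <= frontier_share <= Decimal(1)):
        raise ValueError("frontier_share must be between 0 and 1")
    budget_share = Decimal(1) - frontier_share
    return (budget_share * BUDGET_MEDIAN_PER_M
            + frontier_share * FRONTIER_MEDIAN_PER_M)


# Hypothetical split: 5% of prompt tokens go to a frontier-priced model.
blended = blended_prompt_price(Decimal("0.05"))   # 0.1425 + 0.50 = 0.6425
```

Even at a 5% share, the frontier slice accounts for most of the blended price, which is the arithmetic behind reserving frontier spend per task rather than by default.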

The Extremes

The edges of the census show just how wide the posted-price spectrum runs on a single day.

| End of the market | Model (as listed) | Prompt price per million tokens | Completion price per million tokens |
| --- | --- | --- | --- |
| Cheapest paid listing | inclusionAI: Ling-2.6-flash | $0.010 | |
| Most expensive listing | OpenAI: o1-pro | $150 | $600 |

The most expensive listing posts $150 prompt and $600 completion per million.

At the floor, inclusionAI: Ling-2.6-flash posts a prompt price of $0.010 per million tokens. A price like that only makes commercial sense at enormous volume: it targets buyers running pipelines where every individual call is trivial — deduplication, tagging, triage, first-pass filtering — and where the only question that matters is unit cost. At this end of the market, the model is plumbing.

At the ceiling, OpenAI: o1-pro posts $150 per million prompt tokens and $600 per million completion tokens. The completion premium persists at the top of the market: generated tokens on this listing cost four times what prompt tokens do. Buyers at this end are paying for deliberate, compute-heavy reasoning on low-volume, high-stakes work, the kind of task where a single materially better answer justifies the line item. The two ends of the table are not competing products; they are different purchases that happen to share an API shape.

Context Windows

Price is one axis of the census; advertised capacity is the other.

| Measure | Value (sealed June 12, 2026) |
| --- | --- |
| Median advertised context window | 200,000 tokens |
| Largest advertised context window | 10,000,000 tokens (Meta: Llama 4 Scout) |
| Methodology floor for "million-plus" | 1,000,000 tokens |
| Models at or above the floor | 73 |

73 listings advertise context windows at or above 1,000,000 tokens.

The median listing advertises a 200,000-token window — already enough to hold a substantial codebase slice, a contract stack, or a long multi-turn agent session in a single call. Above the index's 1,000,000-token methodology floor sit 73 models, and the maximum advertised window in the census is 10,000,000 tokens, posted by Meta: Llama 4 Scout.

What is long context actually for? In practice: whole-repository code work, due-diligence document sets, retrieval-free question answering over large corpora, and agent runs that accumulate state instead of summarizing it away. Two cautions belong next to the numbers, though. An advertised ceiling is a listing attribute, not a quality guarantee — recall and reasoning often degrade well before the window fills. And context is priced per token, so a habit of filling giant windows turns the cheapest model into an expensive one. The census records what providers post; it does not test what they deliver.
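The per-token caution is easy to quantify. The sketch below prices a single call's prompt at the budget-tier median for three prompt sizes. It is an illustration under assumptions: the census does not say which listings pair a $0.15 price with a million-token window, the 2,000-token "short" prompt is a hypothetical baseline, and the function name is ours.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)
BUDGET_MEDIAN_PER_M = Decimal("0.15")   # USD per million prompt tokens


def prompt_fill_cost(prompt_tokens: int, prompt_per_m: Decimal) -> Decimal:
    """Return the prompt-side cost in USD of one call."""
    return Decimal(prompt_tokens) * prompt_per_m / TOKENS_PER_MILLION


short_call = prompt_fill_cost(2_000, BUDGET_MEDIAN_PER_M)         # 0.0003 USD
median_window = prompt_fill_cost(200_000, BUDGET_MEDIAN_PER_M)    # 0.03 USD
million_window = prompt_fill_cost(1_000_000, BUDGET_MEDIAN_PER_M) # 0.15 USD

# How many short calls one fully loaded million-token call costs.
multiplier = million_window / short_call                          # 500
```

At the same posted price, one call that fills a million-token window costs as much as 500 of the short calls, before any completion tokens are billed.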

Alongside prices, each daily snapshot seals the Hugging Face trending list. On June 12, 2026, the top of that list read:

  1. google/diffusiongemma-26B-A4B-it

  2. nvidia/LocateAnything-3B

  3. google/gemma-4-12B-it

A plain caveat first: trending rank is popularity on a different surface, developer attention on Hugging Face. It is not a price signal, and it says nothing about revenue or production usage. We seal it because attention plausibly sits upstream of the rest of this market: our working expectation, which a single-day census cannot test, is that the models developers download and discuss are the ones most likely to show up later as cheap, widely served listings.

Read qualitatively, the list is open-weight territory top to bottom. The labels point at instruction-tuned releases from major labs alongside a specialized localization model — the kind of artifacts developers can pull, run, and fine-tune themselves. That is a different economy from the frontier tier in the price tables: attention flows to what is open and runnable, while posted-price premiums attach to what is closed and served.

Methodology

Source: OpenRouter public model listing and Hugging Face trending list, via our AI-economics clock (sealed daily, content-hashed).

All figures are computed directly from US Tech Automations’ sealed daily AI-economics snapshots; nothing is estimated, modeled, or extrapolated.

The clock works the same way as the rest of our sealed-snapshot research, including the permit prediction ledger and the Los Angeles building permit report — the discipline is the product, applied here to a different market:

  1. Collect. Once a day, the clock captures the public OpenRouter model listing and the Hugging Face trending list exactly as published.

  2. Normalize. Posted per-token price strings are preserved verbatim as the source publishes them; conversion to per-million-token figures happens in code, never by hand.

  3. Seal. Each snapshot is content-hashed and appended to an immutable store, so any figure in this report can be re-derived from the sealed artifact.

  4. Aggregate. Edition statistics are computed from the sealed snapshot alone — no external estimates, no backfilled assumptions.
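The seal step above can be sketched as follows. The source confirms only that each snapshot is content-hashed and that this edition's digest is a SHA-256; the function names, the JSON serialization, and the two sample listings below are hypothetical stand-ins, not the clock's actual format.

```python
import hashlib
import json


def seal_snapshot(raw_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of a snapshot exactly as captured."""
    return hashlib.sha256(raw_bytes).hexdigest()


def verify_snapshot(raw_bytes: bytes, expected_sha256: str) -> bool:
    """Check stored bytes against a previously published digest."""
    return seal_snapshot(raw_bytes) == expected_sha256.lower()


# Hypothetical two-listing snapshot, serialized deterministically so the
# same content always produces the same bytes (an assumption, not the
# clock's documented format).
snapshot = json.dumps(
    [
        {"id": "example/model-a", "prompt": "0.00000015"},
        {"id": "example/model-b", "prompt": "0.00001"},
    ],
    sort_keys=True,
    separators=(",", ":"),
).encode("utf-8")

digest = seal_snapshot(snapshot)
unchanged = verify_snapshot(snapshot, digest)           # True
tampered = verify_snapshot(snapshot + b" ", digest)     # False
```

Hashing the bytes as captured, rather than a parsed and re-serialized copy, is what lets a reader re-derive a published digest from the stored artifact: any change to the content, even one trailing space, produces a different hash.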

One rule governs this launch edition above all: the census-only rule. This index is new, and this report describes exactly one sealed day. No change, growth, or trend claims will appear until the clock holds multiple monthly observations to compare honestly. When trend editions do arrive, every point in them will trace back to a sealed, hashed daily snapshot like this one.

Frequently Asked Questions

Q: Why are prices quoted per million tokens?

A: Because that is the unit the market quotes and budgets in. The median paid listing posts $0.42 per million prompt tokens and $1.50 per million completion tokens; quoting per million keeps small prices readable and makes listings directly comparable. The clock stores the verbatim per-token strings the source publishes and converts to per-million in code, so the published unit never drifts from the sealed evidence.
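A minimal sketch of that conversion, assuming the source publishes each price as a plain decimal string in USD per token (the sample strings and the function name are hypothetical). Using Python's Decimal rather than float keeps the arithmetic exact, so the per-million figure is a pure rescaling of the verbatim string.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def per_million(per_token_price: str) -> Decimal:
    """Convert a verbatim per-token price string to USD per million tokens."""
    return Decimal(per_token_price) * TOKENS_PER_MILLION


# Hypothetical per-token strings that rescale to figures quoted in this report.
median_prompt = per_million("0.00000042")     # equals 0.42
median_completion = per_million("0.0000015")  # equals 1.50
ceiling_prompt = per_million("0.00015")       # equals 150
```

The stored string is never modified; only the derived figure changes units.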

Q: Why does the index cover only one marketplace?

A: Because a census has to have a knowable boundary. OpenRouter aggregates posted prices from many providers in one public listing, which makes it a clean, verifiable surface to seal daily. The index counts what that marketplace lists — 337 models from 53 providers on this snapshot — and claims nothing about models priced elsewhere or not listed publicly at all.

Q: What do the free listings actually mean?

A: The census recorded 26 free listings. On a routing marketplace, free tiers are typically rate-limited promotional surfaces: providers absorb the cost so developers can evaluate a model before sending it paid traffic. They are useful for prototyping and unsuitable for production commitments, which is why the index reports them as their own category rather than folding them into the paid statistics.

Q: When will the index show trends?

A: Only after the clock holds multiple monthly observations. The snapshot is sealed daily, so the raw material accumulates automatically, but this launch edition describes exactly one day — June 12, 2026 — and makes no claim about any other day. Publishing the census first, before any trend exists to report, is deliberate: it fixes a verifiable baseline that later editions can be checked against.

Q: How were the price tiers chosen?

A: The cutoffs are our methodology choices, stated in the open: the budget tier is at or below $0.50 per million prompt tokens, the mid tier runs up to $5.00, and the frontier tier is everything above. The market does not draw these lines itself. Fixing them publicly now means tier counts in future editions will be comparable by construction rather than by editorial judgment.

Put Cost Data to Work

US Tech Automations builds AI automations for businesses, and model selection is a line-item cost decision in every one of them. The spread documented above — a budget tier with a $0.15 median sitting under a frontier tier with a $10 median — is the difference between an automation that pays for itself and one that quietly burns margin. We route routine steps to commodity-priced models and reserve frontier spend for the steps where reasoning quality changes the outcome, and this index is the sealed baseline we price that against.

It is also the same research discipline we apply to physical-economy data, like the San Francisco building permit report: freeze a public source daily, hash it, and only publish numbers that trace back to a sealed artifact. If you want help reading the cost side of an AI workflow — or want an automation built with model spend treated as an engineering constraint rather than an afterthought — talk to us.

Source: US Tech Automations Research — computed from the sealed daily AI-economics snapshot, June 12, 2026.

Cite this report

US Tech Automations Research, 2026-06 edition. “337 AI Models Priced Daily: The USTA AI Price Index — June 2026.” https://ustechautomations.com/resources/blog/usta-ai-price-index-june-2026

Sealed snapshot sha256: f8f2f0f41548d556ac79279c1d758c419d65b57f1516527929859dae8e8fb5ad

Machine-readable data: CSV · JSON · All research & methodology

About the Author

Garrett Mullins
Workflow Specialist

Helping businesses leverage automation for operational efficiency.