For the past 18 months, getting access to state-of-the-art AI models felt almost free. Students could experiment with Claude Opus. Developers could prototype with Gemini. Indie hackers could build side projects on pennies. This wasn't an accident—it was strategy. But the era is ending faster than expected, and the consequences will reshape who can afford to build with AI.
The Current Reality
Late 2025 and early 2026 marked the beginning of the end. Google tightened Gemini API quotas. Anthropic restricted access to its most capable models in free tiers. GitHub stopped offering premium Claude models to students. OpenAI's free tier became more limited. These weren't random policy tweaks—they were coordinated retreats from an unsustainable economic game the AI labs had been playing to capture market share.
The stated reasons are always similar: "improving service quality," "preventing abuse," "prioritizing paying customers." But the real driver is cost. And understanding why requires looking at the gap between token pricing and actual inference expenses.
The Economics Nobody Talks About
Per-token pricing has indeed fallen dramatically. Anthropic's API costs less per token than it did two years ago. But this creates a false impression of cheapness. The real problem isn't price per token—it's tokens per request.
In 2023, a typical API call consumed 500–2,000 tokens. Today, a single complex request (multi-step reasoning, tool calls, long context, agentic loops) easily consumes 10,000–50,000 tokens. What looks cheap on a per-token basis becomes expensive at scale. A free-tier user who submits a dozen complex requests daily might generate $10–20 in GPU costs, while contributing zero revenue.
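The arithmetic behind that shift is simple enough to sketch. The token counts below come from the paragraph above; the per-million-token prices are round, assumed numbers purely for illustration, not real vendor rates:

```python
# Illustrative cost model: why falling per-token prices can still mean
# rising per-request costs. Prices here are assumed round numbers,
# not actual vendor pricing.

def request_cost(tokens: int, price_per_million: float) -> float:
    """Cost of one request in dollars, given a per-million-token price."""
    return tokens / 1_000_000 * price_per_million

# 2023-style request: ~2K tokens at an assumed $20 per million tokens
cost_2023 = request_cost(2_000, 20.0)    # $0.04

# 2026-style agentic request: ~50K tokens at an assumed $5 per million
cost_2026 = request_cost(50_000, 5.0)    # $0.25

# The per-token price fell 4x, yet the request got ~6x more expensive.
print(f"2023: ${cost_2023:.2f}  2026: ${cost_2026:.2f}")
```

Even with a 4x drop in the assumed per-token price, the 25x growth in tokens per request dominates—which is the whole point.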
And the free tiers were never designed to be sustainable. They were designed to convert users—but conversion rates from free to paid are shockingly low. Most free users never upgrade. Meanwhile, the compute costs compound. For companies burning cash on inference, this math eventually forces a reckoning.
- Average tokens per 2023 request: 500–2K
- Average tokens per 2026 request: 10K–50K
- Free-to-paid conversion rate (estimated): <5%
- Monthly GPU cost per high-volume free user: $10–50+
Why Data Alone Isn't Enough
Some argue that free users provide value through usage data for model training. This is partially true. But training data has diminishing returns. Anthropic, OpenAI, and Google already have more data than they can efficiently process. And the free-tier users generating the most tokens (hobbyists, students, indie hackers) generate data that's often low-quality, repetitive, or off-distribution.
Enterprise customers—who pay $1,000+ monthly or maintain dedicated accounts—contribute both revenue and higher-quality usage patterns. From a business perspective, serving them is obviously more rational than subsidizing a thousand free users.
The Strategic Gamble Failed
The subsidization strategy was a "land grab"—offer cheap/free access, build network effects, lock users in. But this assumed conversion would be predictable. It wasn't. Many users built entire projects on free APIs, created zero business value, and became dependent on subsidies. When the free tier got restricted, they moved on—they didn't upgrade to paid.
Google's Gemini API subsidization arguably failed the hardest. The company spent heavily on free quotas, but failed to convert users to Google Cloud billing at meaningful rates. Anthropic learned faster and adjusted sooner. OpenAI's strategy was always more measured—free tiers were limited from the start.
Insight: The inevitable correction
Subsidization was a competitive tactic, not a business model. It only works if conversion rates justify the acquisition cost. When they don't, the subsidy becomes a sunk cost that companies can no longer afford to defend.
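Viewed as a customer-acquisition channel, the subsidy can be priced directly. The inputs below are assumptions chosen to match the ranges earlier in the piece:

```python
# The "subsidy as acquisition cost" framing, as arithmetic.
# All inputs are assumed, illustrative figures.

def implied_cac(monthly_subsidy_per_user: float, months_subsidized: int,
                conversion_rate: float) -> float:
    """Effective cost to acquire one paying customer via a free tier:
    total subsidy spent per free user, divided by the fraction who convert."""
    return monthly_subsidy_per_user * months_subsidized / conversion_rate

# $10/month subsidized for a year, at a 3% conversion rate:
print(f"${implied_cac(10.0, 12, 0.03):,.0f} per converted user")  # $4,000
```

An effective $4,000 to acquire one paying customer only pencils out if that customer's lifetime value is larger—which, for a $20/month subscriber, it plainly isn't.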
What Comes Next
The collapse of free/cheap AI access has three immediate effects:
- Skill bifurcation — Students and hobbyists in developed markets can still access good tools (they have credit cards). Students in emerging markets cannot. Innovation will concentrate among the already-advantaged.
- Consolidation pressure — Smaller AI startups that relied on affordable third-party APIs now face tighter margins. This favors large, self-sufficient labs that can amortize costs across many products.
- Open-source acceleration — As cloud APIs become expensive, interest in local models, smaller efficient models, and open-source alternatives will spike. This could democratize AI in unexpected ways.
The economic reality is clear: inference is expensive, and subsidies were always temporary. The question now isn't whether pricing will rise—it will. The question is whether it rises fast enough to fund better models, or slow enough to maintain reasonable access. That balance determines whether AI remains a tool available to builders everywhere, or becomes a luxury for those who can afford it.