I was wrong about the AI bubble

Sort of

May 18, 2026

Six months ago I argued that AI was clearly a bubble and the question was only “of what type”. Something like the Great Financial Crisis, financed by debt that would eventually be called? Or more like an industrial bubble, with legit demand that drives capex and infrastructure buildup, but eventually leading to overcapacity and too wide a gap between investment and demonstrated returns?

Six months later, nothing bubble-popping has materialized, and so far I have been wrong. Everything has actually gotten bigger. Anthropic was valued at $183B in September 2025; as of May 2026, Bloomberg reports it is in talks to raise at a $900B valuation (roughly a 5x increase). Big 4 capex (Google, Amazon, Microsoft, Meta) is projected to hit $725B in 2026, up from $410B in 2025 (a 1.8x increase), with analysts projecting $1T+ in 2027. Google’s market cap reached $4.8T in May 2026, up 37% in the last six months (Amazon and Meta have traded more sideways during this period, while Microsoft has been hit by the SaaSpocalypse and is down 16% over the same six months).

By every macro indicator, the bubble is either still inflating or doesn’t exist. However, the unit economics are clearly changing and the subsidized era is ending.

All four major AI coding tools (GitHub Copilot, Claude Code, Cursor, OpenAI Codex) have abandoned flat-rate “unlimited” pricing in favor of metered, token-based or usage-based billing tied to underlying API costs. The intent is evident: pass COGS through to users so that margins climb out of negative territory.

There is a certain convergence on the same three-tier structure: $20, $100, $200 per month, with Copilot still being the cheaper exception. With prices converging and stabilizing, and the unit underneath the price migrating from “requests” to tokens, the battlefield is now the number of tokens you get for a price tier set in stone.

Cursor moved first in 2025, took a hit, and ceded ground to Claude Code later in the year. The June 2025 pricing change is the best example of the subsidy era ending poorly, with Anysphere replacing the legacy Pro plan’s “500 fast requests + unlimited slow requests” with a $20 credit pool consumed at API-equivalent rates. The change didn’t land well with users, and CEO Michael Truell published a public apology, offering refunds to customers who had switched to the new credit-pool model and found their bills had exploded without warning.

It’s not by chance that Cursor moved first. At that time, it was the only one of the four major players with neither a proprietary model (Copilot lacked one too) nor the backing of seemingly infinite VC money or Microsoft’s deep pockets that the other three enjoyed.

A single agentic session sending 350,000 input tokens through Claude Sonnet 4 costs about $1.35 at API rates. Under the legacy “$0.04 per request” model that Cursor’s original pricing was built around, that one interaction costs about 34x what the user paid. The math was not mathing, clearly.
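
To make the gap concrete, here is a minimal Python sketch that uses only figures quoted in this post: $1.35 per session at API rates, $0.04 per legacy request, and the $20 credit pool. The function and constant names are illustrative, and the idea that every session looks like the 350,000-token one is a simplifying assumption, not a description of Cursor's actual billing logic.

```python
# Figures quoted in the post; names are illustrative, not Cursor's billing code.
SESSION_COST_USD = 1.35          # one 350,000-input-token session at API rates
LEGACY_REQUEST_PRICE_USD = 0.04  # legacy per-request price quoted in the post
CREDIT_POOL_USD = 20.00          # monthly credit pool after the June 2025 change


def cost_multiple(api_cost: float, price_paid: float) -> float:
    """Return how many times the API cost exceeds what the user paid."""
    return api_cost / price_paid


def sessions_covered(pool: float, session_cost: float) -> int:
    """Whole sessions a credit pool covers, assuming every session costs the same."""
    return int(pool // session_cost)


multiple = cost_multiple(SESSION_COST_USD, LEGACY_REQUEST_PRICE_USD)
sessions = sessions_covered(CREDIT_POOL_USD, SESSION_COST_USD)

print(f"API cost is {multiple:.2f}x the legacy per-request price")
print(f"A ${CREDIT_POOL_USD:.0f} pool covers {sessions} such sessions")
```

Under these assumptions the ratio comes out at 33.75, which rounds to the 34x above, and a $20 pool is exhausted after 14 sessions of that size.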

GitHub’s explanation for its April 2026 transition to usage-based billing was practically the same. From the announcement:

Today, a short chat question can cost the user just as much as an autonomous coding session lasting several hours. GitHub has absorbed much of the escalating inference cost behind that usage, but the current premium request model is no longer sustainable.

Satya Nadella, on Microsoft’s April earnings call, made the same point in more general terms:

Any per user business of ours, whether it’s productivity or coding or security, will become a per user and usage business.

Claude Code introduced session and weekly rate limits in August 2025. Anthropic cited extreme power-user behavior, with thousands of dollars in model usage on a $200 plan. Anthropic was also less aggressive than OpenAI in its capacity buildup and probably underestimated its own stupendous growth and success, so curbing usage was also a way to manage limited compute capacity and spread it across a fast-growing customer base. Its posture toward users became more and more hostile as a result, and its brand reputation took a hit. We will see if the xAI deal gives it enough breathing space to win back the goodwill it had in late 2025.

Anthropic and OpenAI are vertically integrated: they own the models used by their products (Claude Code and Codex), and they serve those models at API rates to competitors. Middleware players (Cursor, GitHub Copilot) buy access to those models at retail API rates, resell it, and try to build a margin on top. The problem with the middleware position is that your cost floor is set by a supplier who competes with you directly.

Cursor was reportedly paying approximately $650M annually to Anthropic against $500M in revenue, with a reported gross margin of -23% as of January 2026.

Aakash Gupta@aakashgupta

This is wild. Cursor hit $2.7 billion ARR last month. The quarter that ended in January, their gross margin was NEGATIVE 23% ❗️ That means for every dollar of revenue, Cursor paid Anthropic and OpenAI $1.23 in API costs. At a $2 billion run rate, that's nearly $2.5 billion a

2:10 AM · Apr 25, 2026 · 58.7K Views

The model owners are already at a different altitude. Anthropic’s inference margins moved from 38% to 70% in twelve months. OpenAI’s compute margin went from 35% in early 2024 to 70% by October 2025. This is all self-reported because they are private companies, but assuming it’s true, they are on a path to SaaS-level economics.
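
The margin figures in this section all follow from the same formula: gross margin = (revenue - cost of revenue) / revenue. Here is a quick Python sketch using the per-dollar figures quoted above. It treats API or inference spend as the only cost of revenue, which is a simplifying assumption, and the function names are illustrative.

```python
def gross_margin(revenue: float, cost_of_revenue: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cost_of_revenue) / revenue


def cost_per_revenue_dollar(margin: float) -> float:
    """Cost of serving one dollar of revenue at a given gross margin."""
    return 1.0 - margin


# Middleware: $1.23 of API costs per $1.00 of revenue (figure from the tweet above).
middleware = gross_margin(1.00, 1.23)

# Model owners: the self-reported 70% margin implies about $0.30 of cost per $1.00.
owner_cost = cost_per_revenue_dollar(0.70)

print(f"middleware margin: {middleware:.0%}")
print(f"model-owner cost per revenue dollar: ${owner_cost:.2f}")
```

Put side by side, the middleware player spends $1.23 to earn a dollar while the model owner spends roughly $0.30, which is the altitude difference described above.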

Cursor’s response was to build its own model, launching Composer 2, a proprietary model sneakily built on Moonshot’s Kimi K2.5. The model was launched in March 2026 at roughly one-tenth of Opus 4.6’s effective token price, and by April 2026 Cursor had reached “slight gross-margin profitability” on enterprise accounts, while still losing money on individual developers. Copilot is moving in the same direction, building in-house models and loosening its relationship with OpenAI [1].

If Cursor and Copilot become less dependent on OpenAI and Anthropic, and start building their own models, the convergence would be complete: similar margins in the SaaS ballpark of 60%, similar price tiers with 3 plans and shrinking allowances, similar usage-based unit economics to pass COGS to users.

The capacity buildup was justified by the need to build distribution first, and by a bet that inference costs would fall fast enough to make the math work. The costs didn’t fall fast enough though, because agentic workflows arrived before the curves crossed, so the subsidy had to end earlier.

With the flat-rate unlimited era coming to an end, the question is now whether this transition will be orderly or chaotic, like when a bubble pops.

The case for an orderly transition is that the change has been telegraphed for months now, so it shouldn’t be a surprise. But the chaos risk can hide at the infrastructure financing level. The capex commitments stretching to 2030 have been made based on assumptions of continuous growth in API revenue that might no longer hold. If users need to foot much larger usage-based bills, either the demand for AI is perfectly inelastic, or revenue will take a hit. If Anthropic’s revenue growth slows from “2x in 6 months” to “2x in 12 months”, the present value of those commitments shifts dramatically, and sentiment could change quickly.
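
To see why the doubling time matters so much, here is a rough Python sketch that compounds the two growth rates over the same horizon. The 48-month horizon (roughly mid-2026 to 2030) and the constant doubling time are illustrative assumptions, not forecasts, and the function name is hypothetical.

```python
def revenue_multiple(doubling_months: float, horizon_months: float) -> float:
    """Revenue multiple after horizon_months if revenue doubles every doubling_months."""
    return 2 ** (horizon_months / doubling_months)


HORIZON_MONTHS = 48  # assumed horizon, roughly mid-2026 to 2030

fast = revenue_multiple(6, HORIZON_MONTHS)   # "2x in 6 months"
slow = revenue_multiple(12, HORIZON_MONTHS)  # "2x in 12 months"

print(f"2x every 6 months  -> {fast:.0f}x revenue")
print(f"2x every 12 months -> {slow:.0f}x revenue")
print(f"gap between scenarios: {fast / slow:.0f}x")
```

Under these assumptions the faster path ends at 256x today's revenue and the slower one at 16x, so commitments sized for the first scenario would be backed by one-sixteenth of the expected revenue in the second.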

[1] Fine-tuning a Chinese open-weight model is not exactly the same as having a Superintelligence team in-house full of ML researchers from Google and Meta.

Cathy

I am definitely a layperson when it comes to this type of analysis, but I do have concerns about an AI bubble — one that you imply without fully exploring.

Your closing question about whether the transition will be orderly or chaotic is the one that keeps me up at night. Because the chaos risk isn't just at the infrastructure financing level. It's in the consumer foundation underneath it.

The capex buildout assumes continuous revenue growth. That revenue ultimately depends on users who can afford expanding metered usage bills now being passed through at API rates. But those users — businesses and consumers — are already stretched. Consumers are carrying grocery balances on credit cards. Business margins are compressing under tariffs. The subsidy era ended because the math didn't work at the infrastructure level. The demand era may be shorter than projected because the math doesn't work at the household level either.

I wrote about this from a consumer perspective — not a tech perspective — here: https://lakesidegrammy.substack.com/p/gilding-the-gdp?r=4psz66
