The Pulse: Engineering Departments Start Squeezing Their AI Bills

The Pulse: A Trend of Trying to Cut Back on AI Spend Within Eng Departments?

TLDR: Gergely Orosz reports a growing pattern across companies of every size: engineering organizations are starting to question their ballooning AI coding-tool bills. The interesting twist is that the cost-cutting pressure is often coming from the bottom up, from engineers themselves, not from executives waving a budget spreadsheet. The reason it stings is that nobody has cleanly tied that spend to shipped features or revenue yet.

Summary: This issue opens with a moment that says a lot about where we are. Uber's president Andrew Macdonald was on a podcast and basically admitted the emperor has no clothes, or at least no clearly visible clothes. He pointed out that every company loves to quote stats like "twenty-five percent of our code commits last quarter were AI-driven," and those numbers sound astronomical. But when he asks his senior engineering leaders the harder question, which is how many projects that were sitting on the cutting room floor actually got shipped because of those productivity gains, the link just isn't there yet. Maybe more is getting shipped implicitly, he says, but you can't draw a clean line from "Claude Code wrote a quarter of our commits" to "we delivered these specific useful features to users." I appreciate the honesty here, because that gap between impressive activity metrics and actual outcomes is exactly the thing most leadership decks skip over.

Then comes the line that went viral. Uber's CTO, Praveen, said in an interview that the company had blown through its entire 2026 AI budget by the middle of March. Read that again. Less than three months into the year, the money was gone. Macdonald's takeaway is that engineering organizations are now going to have to talk openly about token consumption and cost versus headcount, and start making real tradeoffs. His framing is sharp: if you're just a user sitting there dreaming up clever use cases and you never see the bill, AI feels free. But somebody is paying that bill, and increasingly that somebody is asking pointed questions.

Orosz then does the thing he does well, which is call up actual practitioners instead of theorizing. The picture that emerges is varied but consistent. OpenCode's creator Dax Raad reported that every single inbound enterprise request over the past month was about optimizing spend, and that demand for their cheaper hosted inference service blew past expectations because big companies want models that are cheaper but still capable. Two cutting-edge companies told Orosz they have no obvious return on investment yet, but feel they have no choice but to pay the "intelligence premium" for state-of-the-art models, so they're turning to smart model routing that picks a model based on the use case and the prompt. DoorDash pushes responsibility down to individual devs: monthly token limits are high, but a dev who exceeds one has to justify the overage and bring a plan for being more efficient next month. One of the largest US retirement-savings firms now caps Copilot tokens and downgrades devs to cheaper, less capable models once the cap is hit. And several revenue-generating startups have just decided to hand devs subsidized Claude Code Max or Codex Max subscriptions rather than eat expensive API pricing.
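
To make the cap-and-downgrade idea concrete, here is a minimal sketch of what such a policy might look like. The article gives no implementation details, so everything here is an assumption for illustration: the model names, the cap value, and the DevUsage and pick_model helpers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical model tiers and monthly cap; the article names no specific values.
PREMIUM_MODEL = "premium-model"
BUDGET_MODEL = "budget-model"
MONTHLY_TOKEN_CAP = 5_000_000


@dataclass
class DevUsage:
    """Tokens one developer has consumed in the current month."""
    dev_id: str
    tokens_used: int = 0

    def record(self, tokens: int) -> None:
        """Add the tokens from one request to this month's running total."""
        self.tokens_used += tokens


def pick_model(usage: DevUsage, cap: int = MONTHLY_TOKEN_CAP) -> str:
    """Return the premium model while under the cap, the cheaper one after."""
    return PREMIUM_MODEL if usage.tokens_used < cap else BUDGET_MODEL


# Usage: a developer starts under the cap, then crosses it mid-month.
usage = DevUsage("dev-1")
usage.record(4_000_000)
model_before_cap = pick_model(usage)   # still under the 5M cap
usage.record(1_500_000)
model_after_cap = pick_model(usage)    # 5.5M used, so downgraded
```

A DoorDash-style variant would presumably keep the premium model available past the limit and flag the overage for a written justification instead of downgrading.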

The thread tying all of this together is what Orosz calls a new bottom-up focus on AI efficiency. He's been noticing more internal knowledge-sharing sessions about using AI efficiently, and they're coming from engineers with no top-down mandate. People are experimenting with cheaper models for simpler tasks, routing, and trimming token usage, because they can see the meter running. His prediction is worth sitting with: in upcoming performance and promotion cycles, engineers who save on token costs may get rewarded, the same way teams got rewarded two years ago for cutting third-party vendor bills. The classic career advice still holds, that the cleanest way to show impact is to translate your work into money, either revenue made or costs saved. And there's a nice bit of irony he points out at the end. The savings that earn promotions will come from exactly the places that did the rocketing spending in the first place.

What's the author quietly stepping around? A few things. First, this is a free reprint of one topic from a two-week-old paid issue, so the framing is partly a sales funnel, and the sample is a handful of anecdotal conversations rather than a survey. Second, nobody in the piece seriously entertains the possibility that the productivity gains are real but just unmeasurable with current tooling, which would change the whole "no ROI" conclusion. Third, the "engineers are doing this bottom-up with no mandate" narrative is a little too tidy. Engineers optimizing token spend right before review season, in a market where headcount feels precarious, is not pure grassroots enthusiasm. It's also self-preservation, and Orosz hints at that with the promotion angle without quite naming the pressure underneath it.

Key takeaways:

Uber reportedly burned through its full 2026 AI budget by mid-March, and its leadership says the link between AI activity metrics and shipped features isn't there yet.
Cost pressure is showing up across the board: enterprise demand for cheaper inference, model routing at cutting-edge firms, per-dev token limits at DoorDash, model downgrades at a large US retirement-savings firm, and subsidized Max subscriptions at startups.
Model routing, picking a model based on use case and prompt complexity, is emerging as a common cost lever.
The efficiency push is largely bottom-up, with engineers running their own knowledge-sharing sessions on cheaper AI usage.
Expect token-cost savings to become a rewarded skill in performance and promotion cycles, mirroring how vendor-bill savings were rewarded a couple of years ago.

Why do I care: If you build or lead frontend teams, this is the moment to get ahead of a conversation that's coming to your org whether you want it or not. The practical move is to treat AI spend like any other line item you'd profile and optimize: instrument it, route the boring tasks like renaming things or writing a test stub to cheap models, and reserve the expensive top-tier models for the gnarly architecture and debugging work where they actually earn their keep. The deeper point, and the uncomfortable one, is that "I write code faster with AI" is not an outcome your CFO can see. Shipped features and saved dollars are. So start keeping receipts now, because the engineers who can say "I cut our team's token bill by forty percent and we shipped the same roadmap" are going to look very good when the budget questions land. This is part business story, but it lands squarely on day-to-day engineering decisions about which tools you reach for and how you justify them.
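
As a rough illustration of "instrument it and route by task," the sketch below sends routine task types to a cheap model and everything else to a top-tier one, and keeps a per-task cost ledger. None of this comes from the article: the model names, the prices, the task labels, and the route and SpendLedger helpers are all hypothetical placeholders you would swap for your own.

```python
from collections import defaultdict

# Hypothetical prices in USD per million tokens; real prices vary by vendor.
PRICE_PER_MTOK = {"cheap-model": 0.50, "top-tier-model": 15.00}

# Hypothetical labels for routine work that a cheaper model can handle.
ROUTINE_TASKS = {"rename", "test_stub", "docstring"}


def route(task_type: str) -> str:
    """Send routine tasks to the cheap model, everything else to the top tier."""
    return "cheap-model" if task_type in ROUTINE_TASKS else "top-tier-model"


class SpendLedger:
    """Accumulates estimated cost per task type so spend can be reviewed later."""

    def __init__(self) -> None:
        self.cost_by_task: dict[str, float] = defaultdict(float)

    def record(self, task_type: str, tokens: int) -> float:
        """Route the task, estimate its cost, and add it to the ledger."""
        model = route(task_type)
        cost = tokens / 1_000_000 * PRICE_PER_MTOK[model]
        self.cost_by_task[task_type] += cost
        return cost

    def total(self) -> float:
        """Total estimated spend across all task types."""
        return sum(self.cost_by_task.values())


# Usage: one routine task and one hard task.
ledger = SpendLedger()
rename_cost = ledger.record("rename", 200_000)      # routed to cheap-model
debug_cost = ledger.record("debugging", 400_000)    # routed to top-tier-model
```

Routing on a task label is the simplest possible rule. The routing described in the article also looks at the prompt itself, which would need a classifier or heuristic that this sketch does not attempt.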
