Johnny Butler

June 4, 2026

The Token Spend Reversal: How Heavy Spend Went From Virtue to Waste

A year ago the message to engineers using AI was: use the agents fully, lean in, let them run. The ones burning the most tokens were read as the ones embracing the future — and rewarded for it. The cautious ones, the engineers asking what all this was costing, got quietly filed under not forward-thinking enough.

Now the message is flipping — watch your token spend, waste will be noticed. The same caution that looked like dragging your feet a year ago is starting to look like judgement.

That reversal is worth sitting with. Coming up through startups, where resources are tight by default, I learned early that constraints are not the enemy of good engineering. They are how good engineering happens.

Tight constraints force trade-offs. Trade-offs force decisions. And the quality of an engineer shows up most clearly in the decisions they make when they cannot have everything. Enough constraints to know where the boundaries are, enough freedom to be creative inside them — that is the condition the best people thrive in. Unlimited budget hides weak judgement. Constraint exposes it.

So the current AI message is really an old lesson arriving late. Some of "be mindful of tokens" is honest economics. Some of it, I suspect, is cover for reducing headcount. Either way the underlying truth was always there.

Token economics was high on my list from the very start, because tokens feel cheap right up until they aren't. The more autonomy you give an agent, the bigger the task, the further you wean it toward done, the more it burns. A faster car driven more often still needs more petrol. Cheap per unit, expensive at scale.

That is why Software Dark Factory records token economics on every run, built in early rather than bolted on later. If constraint makes a human engineer make better trade-offs, it does the same for an agent. And the record is not just a bill — it is a signal. It can inform which provider, model and reasoning effort a task actually warrants, whether cloud is earning its premium over local, and sometimes the most valuable answer of all: that a change was not worth making given what it cost to make it.

I've attached the rough shape of what a single run captures. Look at the last block — it records what the number is not: estimated, not billing-grade, not an automatic optimisation. The discipline is not in spending less. It is in never spending without knowing.

Screenshot 2026-06-04 at 14.05.24.png


The engineers worth keeping were never the ones who spent the most. They were the ones who knew what they were spending, and why.