Johnny Butler

May 21, 2026

AI-assisted delivery changes the cost record for software engineering

AI-assisted software delivery changes how teams measure the cost of engineering work.

That sounds like a finance sentence, but I do not think this is mainly a finance problem. It is a delivery problem.

For a long time, the cost of a software change was mostly hidden inside human effort. How long did it take to understand the request? How much time went into the implementation? How much review was needed? How much rework did we create by getting the first pass wrong?

AI-assisted delivery changes that shape. The human effort is still there, but now the work can also carry a visible machine cost: model choice, reasoning effort, token usage, cached context, output tokens, provider pricing, and the rough cost of the captured run.
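As a rough illustration of that machine cost, the sketch below estimates the cost of one captured run from its token counts. The names `TokenUsage`, `PriceTable`, and `estimate_cost_usd` are hypothetical, and the per-million-token pricing shape is an assumption about how a provider might bill. It is not Software Dark Factory's implementation, and any prices passed in are placeholders, not real rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenUsage:
    """Token counts for one captured run (hypothetical shape)."""
    input_tokens: int          # uncached prompt tokens
    cached_input_tokens: int   # prompt tokens served from cached context
    output_tokens: int         # tokens generated by the model


@dataclass(frozen=True)
class PriceTable:
    """Assumed pricing in USD per one million tokens."""
    input_per_mtok: float
    cached_input_per_mtok: float
    output_per_mtok: float


def estimate_cost_usd(usage: TokenUsage, prices: PriceTable) -> float:
    """Return a rough estimate for the captured window, not a bill."""
    total = (
        usage.input_tokens * prices.input_per_mtok
        + usage.cached_input_tokens * prices.cached_input_per_mtok
        + usage.output_tokens * prices.output_per_mtok
    )
    return total / 1_000_000
```

Because provider pricing changes, a record like this would probably also need to store which price table was used at the time of the run.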

That cost is not just an invoice detail. It is part of how the work happened.

Most pull requests do not show it. The reviewer sees the diff, maybe a summary, maybe some test output. They usually do not see which model was used, whether deeper reasoning was justified, how much context was consumed, what the run roughly cost, what fell outside the capture window, or what claims the team is deliberately not making.

[Figure: sdf-governed-pr-ai-cost-visibility.png]

The information may exist somewhere else. A provider dashboard. A local log. A chat transcript. A usage export. A monthly bill. But if it is not attached to the delivery record, it is hard for the team to learn from it.

That is tolerable when AI coding is a side experiment. It becomes a problem when AI-assisted delivery becomes normal delivery.

Provider pricing will keep changing. Token usage will rise as teams use larger contexts, deeper reasoning, and longer-running agents. Model choice will become an engineering decision, not just a tool preference. Some work will justify a stronger model. Some work will not. Some workflows will need a spend limit, checkpoint, or stop condition before an agent is allowed to keep going.
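A spend limit with a checkpoint can be sketched in a few lines. The `SpendGuard` class and its thresholds below are hypothetical and only show the idea: after each agent step, the accumulated estimated cost decides whether the run continues, pauses for a human check, or stops.

```python
from dataclasses import dataclass


@dataclass
class SpendGuard:
    """Tracks estimated spend for one agent run (illustrative only)."""
    checkpoint_usd: float   # pause for review once this much is spent
    limit_usd: float        # hard stop once this much is spent
    spent_usd: float = 0.0

    def record(self, step_cost_usd: float) -> str:
        """Add one step's estimated cost and return the next action."""
        self.spent_usd += step_cost_usd
        if self.spent_usd >= self.limit_usd:
            return "stop"
        if self.spent_usd >= self.checkpoint_usd:
            return "checkpoint"
        return "continue"
```

A caller would check the returned action before letting the agent take another step. What happens at a checkpoint (ask a human, narrow the scope, switch profile) is a team decision that this sketch deliberately leaves open.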

None of that works well if the economics are detached from the work.

I do not want the answer to be "use the cheapest model for everything." That is crude governance. A risky production workflow change should not be treated like a docs edit. A complex architecture decision may justify a different profile from a low-risk copy update.

The better goal is to use the right AI effort for the work, make the usage visible, and leave a reviewable record behind.

That is the AI usage economics layer I have built into Software Dark Factory.

A governed PR should show more than the diff. It should show the operating record around the diff: prompt, scope, acceptance criteria, risk, verification, evidence, preflight model and reasoning recommendation, actual provider and model used, reasoning effort, token usage, estimated cost for the captured window, captured and uncaptured usage boundaries, explicit non-claims, and handoff.
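One way to picture that operating record is as a structure with one field per item in the list above, plus a check that reports which sections are still empty. The `GovernedPrRecord` name, the field names, and the plain-string field types are assumptions made for this sketch. The real record format in Software Dark Factory may look quite different.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GovernedPrRecord:
    """Hypothetical operating record attached to a pull request."""
    prompt: str = ""
    scope: str = ""
    acceptance_criteria: str = ""
    risk: str = ""
    verification: str = ""
    evidence: str = ""
    preflight_recommendation: str = ""
    actual_provider_model: str = ""
    reasoning_effort: str = ""
    token_usage: str = ""
    estimated_cost: str = ""
    capture_boundaries: str = ""
    non_claims: str = ""
    handoff: str = ""


def missing_sections(record: GovernedPrRecord) -> List[str]:
    """List the sections a reviewer would find empty."""
    return [
        f.name for f in fields(record)
        if not getattr(record, f.name).strip()
    ]
```

A check like this does not judge whether the content is good. It only makes a missing section visible before review starts.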

The preflight part matters because teams need decision support before the work starts. Was a model or reasoning recommendation available? Was a fallback used? Was cost risk known or unknown? Was higher reasoning justified by the work?
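The preflight questions can be sketched as a small advisory function. The risk levels, tier names, and fallback profile below are invented for illustration. The function only returns a recommendation for a human to accept or override, and it does not route work to a model.

```python
from dataclasses import dataclass

# Hypothetical mapping from work risk to (model tier, reasoning effort).
RECOMMENDATIONS = {
    "low": ("small", "low"),        # e.g. docs edit, copy update
    "medium": ("standard", "medium"),
    "high": ("strong", "high"),     # e.g. risky production workflow change
}

# Assumed default used when the risk level is not recognised.
FALLBACK = ("standard", "medium")


@dataclass(frozen=True)
class Preflight:
    model_tier: str
    reasoning_effort: str
    fallback_used: bool   # True when no recommendation was available
    cost_risk: str        # "known" or "unknown"


def preflight(risk: str, pricing_known: bool) -> Preflight:
    """Return an advisory profile for the work; a human makes the call."""
    recommendation = RECOMMENDATIONS.get(risk)
    fallback_used = recommendation is None
    tier, effort = FALLBACK if fallback_used else recommendation
    return Preflight(
        model_tier=tier,
        reasoning_effort=effort,
        fallback_used=fallback_used,
        cost_risk="known" if pricing_known else "unknown",
    )
```

Recording `fallback_used` and `cost_risk` alongside the recommendation keeps the record honest about what was actually known before the work started.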

The postflight part matters because reviewers need to understand what actually happened. Which provider and model were used? What token usage was captured? What did the captured run roughly cost? What was outside the capture window?

The limits matter just as much. This is not billing-grade accounting, exact provider bill reconciliation, measured savings, automatic optimisation, or automatic model routing. It is reviewer-facing evidence.
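The postflight summary and its limits could be rendered into a short text block for the PR description. This is a minimal sketch under assumed names (`CapturedRun`, `render_usage_section`) and an invented layout. What matters is that the capture boundary and the non-claims are printed next to the numbers rather than left implicit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CapturedRun:
    """Hypothetical postflight data for one captured run."""
    provider: str
    model: str
    reasoning_effort: str
    input_tokens: int
    output_tokens: int
    estimated_cost_usd: float
    uncaptured: List[str] = field(default_factory=list)
    non_claims: List[str] = field(default_factory=list)


def render_usage_section(run: CapturedRun) -> str:
    """Build reviewer-facing text; an estimate, not billing data."""
    lines = [
        "AI usage (captured window only)",
        f"Provider / model: {run.provider} / {run.model}",
        f"Reasoning effort: {run.reasoning_effort}",
        f"Tokens: {run.input_tokens:,} in, {run.output_tokens:,} out",
        f"Estimated cost: ~${run.estimated_cost_usd:.2f}",
        "Outside the capture window:",
    ]
    # Say so explicitly when nothing was noted, instead of omitting the list.
    lines += [f"- {item}" for item in run.uncaptured] or ["- nothing recorded"]
    lines.append("Not claimed:")
    lines += [f"- {item}" for item in run.non_claims] or ["- nothing recorded"]
    return "\n".join(lines)
```

The non-claims list would typically carry items such as "not billing-grade accounting" or "no measured savings", so a reviewer does not read more precision into the figures than the capture supports.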

That distinction is important. Software Dark Factory is not trying to turn engineering review into a finance dashboard. It is making AI-assisted delivery visible enough that engineering leaders, platform teams, tech leads, and reviewers can govern the work.

Once agents are part of the delivery system, model choice and token usage are part of the delivery system too.

The PR should not only show what changed. It should show what it took to deliver the change.