Johnny Butler

May 12, 2026

I didn't want to change good software delivery to suit agents

Your team is shipping faster. Your review queue is longer. That’s not a coincidence.

This is the pattern I keep seeing among engineering leaders, founders, and developers who have adopted AI-assisted development seriously: the output speed is real, but the delivery model hasn’t caught up. PR review backs up. Rework increases. Decisions live in chat threads. Context gets lost between runs. Quality becomes harder to trust, not because the code is bad, but because the process around it wasn’t built for this pace.

I hit this wall twice. The first time with code. The second time when I built an agentic marketing department and the first drafts came back technically competent and obviously wrong — correct words, wrong voice, no conviction. Both times the AI wasn’t the problem. The delivery model around it was.

The instinct I kept resisting was to slow everything down. Heavier specification upfront. More gates. A waterfall wrapper around agentic work. I understood the logic — if agents are unpredictable, constrain them — but I’d seen that movie before and it doesn’t end well. Over-specified, gate-heavy delivery is slow and brittle in a different way. It kills the adaptation that good iterative work depends on.

The best delivery I’ve seen across startups doesn’t work like that. It’s iterative: small slices, fast feedback, room to adapt, enough structure that quality doesn’t collapse when pace increases. The structure is real, but it’s lightweight and embedded in the work itself — not imposed from outside through checklists and approval chains.

I didn’t want to change good software delivery to suit agents. I wanted agents to work inside good software delivery.

That distinction took a while to arrive at. Once it landed, it became the foundation for everything I’ve been building.

What Software Dark Factory is

Software Dark Factory is a governed delivery model for AI-assisted work. The premise is that agents shouldn’t just generate code. They should work inside a delivery process with explicit prompts, acceptance criteria, risk framing, verification steps, evidence, handoffs, and review discipline built into the work itself — not bolted on afterward.
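To make that concrete, here is a minimal sketch in Python of what "built into the work itself" could look like. It is not the Software Dark Factory implementation. Every name in it (WorkItem, Criterion, review_blockers) and the low/medium/high risk scale are assumptions invented for illustration. The idea it shows is narrow: the agent's unit of work carries its own prompt, acceptance criteria, risk framing, and evidence, and it does not reach a human reviewer until the evidence is there.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Criterion:
    """One acceptance criterion plus the evidence (if any) that it was met."""
    description: str
    evidence: Optional[str] = None  # e.g. a reference to a test log (illustrative)


@dataclass
class WorkItem:
    """Hypothetical record an agent fills in while working one small slice."""
    prompt: str
    risk: str  # "low", "medium", or "high" (an assumed scale, not from the article)
    criteria: List[Criterion] = field(default_factory=list)
    handoff_notes: str = ""


def review_blockers(item: WorkItem) -> List[str]:
    """Return the reasons this item is not yet ready for human review."""
    blockers = []
    if not item.prompt.strip():
        blockers.append("no explicit prompt recorded")
    if not item.criteria:
        blockers.append("no acceptance criteria")
    for criterion in item.criteria:
        if not criterion.evidence:
            blockers.append(f"no evidence for: {criterion.description}")
    # Assumed rule: riskier work must explain itself before handoff.
    if item.risk == "high" and not item.handoff_notes.strip():
        blockers.append("high-risk work needs handoff notes")
    return blockers


item = WorkItem(
    prompt="Add rate limiting to the login endpoint",
    risk="high",
    criteria=[
        Criterion("Sixth attempt within a minute is rejected",
                  evidence="test log attached"),
        Criterion("Existing sessions are unaffected"),
    ],
)
blockers = review_blockers(item)
for reason in blockers:
    print(reason)
```

In this sketch the item is held back for two reasons: one criterion has no evidence, and high-risk work arrived without handoff notes. The reviewer only sees it once both are resolved, so the check lives inside the work item and no separate approval chain is needed.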

The dark factory framing comes from manufacturing: a lights-out facility where automated systems run without a human stepping in at each stage, but inside a tightly governed process. The automation is real. The governance is what makes it trustworthy.

For software, the equivalent isn’t a fully autonomous AI pipeline with no human in the loop. It’s a delivery model where agents do the execution work and humans stay in the decision and review roles — with evidence produced at every stage so review is fast, grounded, and trustworthy.

The goal isn’t speed for its own sake. It’s faster delivery without lowering quality expectations.

Why I built it

I started building this for myself. Years in startups taught me that when resources are constrained, fast-but-fragile is a bad trade. Speed only matters if the work is still sound enough to ship, maintain, and trust.

When I started using AI agents seriously in my own work, the speed was immediately obvious. The governance gap was just as obvious. I needed a way to apply proven engineering practices (delivery discipline, evidence, review structure) to every app I work on, whether greenfield or legacy, whether I’m working alone or with a team.

What I built is now portable. I’m making Software Dark Factory available to teams facing the same problem: adopting AI-assisted delivery but wanting stronger governance, clearer evidence, and more trustworthy execution.

If this resonates — if your team is moving faster but review, quality, and confidence are becoming the bottleneck — I’d be interested to compare notes.