March 22, 2026
Software delivery gets easier to improve once you make the timing visible.
Software delivery gets easier to improve once you make the timing visible. One thing I’m starting to like in our dark factory workflow is recording delivery timing directly on the PR. Not just token usage or the final diff, but how the work actually moved: context gathering, implementation, verification, external wait, and total elapse...
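As a rough illustration of the phases named above, here is a minimal Python sketch of a timing record that could be rendered into a PR description. The class name, field names, minute units, and the idea that total elapsed is the plain sum of the four phases are all assumptions made for this example, not the author's actual format.

```python
from dataclasses import dataclass


@dataclass
class DeliveryTiming:
    """Minutes spent per phase (hypothetical structure; names mirror the phases in the post)."""
    context_gathering: float
    implementation: float
    verification: float
    external_wait: float

    @property
    def total_elapsed(self) -> float:
        # Assumption: total elapsed is simply the sum of the four phases.
        return (self.context_gathering + self.implementation
                + self.verification + self.external_wait)

    def to_pr_section(self) -> str:
        """Render the timing as a plain-text block for a PR description."""
        rows = [
            ("Context gathering", self.context_gathering),
            ("Implementation", self.implementation),
            ("Verification", self.verification),
            ("External wait", self.external_wait),
            ("Total elapsed", self.total_elapsed),
        ]
        lines = ["Delivery timing (minutes)"]
        lines += [f"- {label}: {value:g}" for label, value in rows]
        return "\n".join(lines)


# Illustrative numbers only, not real measurements.
timing = DeliveryTiming(context_gathering=12, implementation=45,
                        verification=20, external_wait=90)
print(timing.to_pr_section())
```

Keeping external wait as its own field makes it easy to see when most of the elapsed time was spent waiting rather than working.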
Read more
March 20, 2026
If your PR is the first time the change is properly validated, the feedback loop is too slow.
One of the simplest dark factory wins for us has been pushing more verification earlier, before the PR ever hits shared CI. That means running a meaningful slice of the expected checks locally first, then attaching the evidence and caveats to the work. Cloud CI still matters. It is the shared, trusted gate. But it should confirm qualit...
Read more
March 19, 2026
I’ve started treating agent friendliness as a core product feature, not an add-on.
One thing I’m now adding to my personal project is an agent manifest. The idea is simple: when an agent lands on the product, it should understand what the product does, what state the account is in, what actions are available, what the next best step is, and what guardrails apply. Not just for fully autonomous agents either. Also for hum...
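To make the five questions above concrete, here is a minimal Python sketch of what such a manifest might look like when serialised to JSON. The class, its field names, and the example product are hypothetical, chosen only to mirror the list in the text; the post does not specify a schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AgentManifest:
    """Hypothetical manifest shape; one field per question an agent should be able to answer."""
    product_summary: str          # what the product does
    account_state: str            # what state the account is in
    available_actions: list[str]  # what actions are available
    next_best_step: str           # what the next best step is
    guardrails: list[str] = field(default_factory=list)  # what limits apply

    def to_json(self) -> str:
        """Serialise the manifest so an agent can fetch and parse it."""
        return json.dumps(asdict(self), indent=2)


# Example values are invented for illustration.
manifest = AgentManifest(
    product_summary="Hypothetical invoicing app",
    account_state="trial, no invoices created",
    available_actions=["create_invoice", "invite_teammate"],
    next_best_step="create_invoice",
    guardrails=["never send an invoice without human confirmation"],
)
print(manifest.to_json())
```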
Read more
March 18, 2026
Agentic development is the pair-programming model I actually wanted.
I always understood the theory of pair programming. In practice, it often felt like two people doing work one strong engineer could do alone. AI changed that for me. I always got the value on paper: two brains, shared context, faster feedback, fewer blind spots, better knowledge transfer. But especially in startup and scale-up environm...
Read more
March 17, 2026
Most teams are not building software dark factories.
Most teams are not building software dark factories. They are bolting AI onto old delivery models and mistaking that for the shift. Coding faster with ChatGPT, Copilot, or Codex is useful, but that is still surface level. The real shift starts when the workflow itself is redesigned around AI: scoped prompts, execution runs, validation ...
Read more
March 13, 2026
Blockchain has gas fees. AI has token economics. Same underlying lesson: computation is not free.
Blockchains made computation cost visible through gas. AI is doing the same through tokens, model choice, and reasoning effort. That is a useful mental model. In crypto, you do not send every transaction the same way. If it is low value, you want it cheap. If it is urgent, sensitive, or high value, you may pay more or choose a differen...
Read more
March 13, 2026
Why put AI usage on the PR at all?
Because once AI becomes part of delivery, the business question quickly becomes: what did this take, and what did it cost? For this slice, the PR includes: Current AI Usage in the PR: That is why I think it belongs on the PR. Not because the PR is necessarily the perfect long-term home for it. But because it gives both the reviewer and...
Read more
March 13, 2026
If AI is helping write the PR, the PR should show how AI was used, how much effort it took, and what it cost.
At some point, the business conversation stops being “this is cool” and becomes “what is this costing us, what is it saving us, and is it worth it?” Ultimately, it gets boiled down to the number that matters most: cost. I’ve started rolling out a lightweight way to show AI usage inside the PR itself. Not prompt theatre. Not vague “AI h...
Read more
March 12, 2026
“Whatever mess AI gets us into, AI will get us out of.”
“Whatever mess AI gets us into, AI will get us out of.” I hear that assumption a lot in business conversations. It hasn’t been my experience so far. The question I keep asking is: How do we get businesses to take AI-driven software risk seriously before they have to feel the consequences themselves? Unless you’ve lived through the fall...
Read more
March 11, 2026
Don’t just tell the model what good code looks like. Show it, then make it prove it followed it
Don’t just tell the model what good code looks like. Show it, then make it prove it followed it. LLMs already know plenty of patterns from the internet, but that also means they can pick up plenty of bad implementations too. A lot of AI coding guidance is still too vague: • “use best practices” • “follow SOLID” • “keep it maintainable” That ...
Read more
March 10, 2026
Codex “Fast” Mode Is Wild (and It Made Repo Hygiene Even More Important)
I tried Codex “Fast” properly today for the first time. It’s… ridiculous. The speed feels like you’ve removed friction from the entire loop: ask → change → verify → iterate. You can keep momentum in a way that’s genuinely hard to do in a big codebase with normal latency. But there’s an obvious trade-off: Fast is expensive. (And the UI ...
Read more
March 10, 2026
One of the quickest “AI productivity” wins in a monolith isn’t another model.
It’s less noise. In the AI era, context is cost. I’m doing a sweep to remove dead repo assets (old CSVs/exports/screenshots/sample data) and trim logs (lower verbosity + sensible rotation/cleanup). Why? Because in big codebases, context is cost: • navigation/search gets harder • agents pull in irrelevant files • token/context burn goes u...
Read more
March 8, 2026
Vibe coding is great for prototypes. Production needs discipline.
LLMs can generate a lot of code quickly, but the quality range is massive unless you force the work through guardrails. So I don’t ask my agent for “an answer”. I ask it to ship a solution through a playbook (built from ~20 years of patterns/practices I actually trust in production). The key rule: It must cite which playbook rules it ap...
Read more
March 7, 2026
What’s next in the queue — and how does it move us toward the North Star?
I ask my software factory: “What’s next in the queue — and how does it move us toward the North Star?” It replied with 5 small, verifiable jobs — each with a clear “why” (contract/telemetry, removing manual paths, hardening inputs, runner policy, cross-repo proof). This is the real agentic unlock for me: not “bigger prompts” — tighter ...
Read more
March 7, 2026
I’ve started batching work through my software “dark factory”.
Instead of running one job at a time, I queue a handful of small, reversible slices, and the factory sends back a single execution report when it’s done. That report includes: • PR links (merged sequentially) • exactly what shipped per job • verification commands + outputs • the prompt/run artifacts for audit + replay The mental shift: I’m no...
Read more
March 7, 2026
I’ve updated my profile to “Software Factory Manager”
AI hasn’t removed engineering work. It’s changed where the leverage is. Writing code is getting cheap. Trust is getting expensive. So my day-to-day is shifting up the stack: • Turning intent into a spec (goal + acceptance criteria) • Setting constraints + guardrails (what we will / won’t do) • Insisting on verification (tests + CI as the gat...
Read more
March 5, 2026
One small change that’s made AI-assisted refactoring feel production-ready for me:
My agent has to state which best practices it applied, and why. Not just “here’s the diff”, but “here’s the thinking”: • what refactoring move was used (extract class, etc.) • what rules/patterns it followed • how it kept the change small + test-backed • what evidence it captured (tests/CI) It turns AI output from “looks fine” into something ...
Read more
March 5, 2026
From Novice to Senior: Refactoring a Whole Codebase Overnight With a Software “Dark Factory”
I planned to extract the Dark Factory engine from my personal project as a standalone tool for work. But I wasn’t happy with the code standard — both the factory itself and some of the code it had produced. It was functional, but naïve. So I did something different. I used Codex plan mode to collate: • the software engineering practice...
Read more
March 4, 2026
The Hidden Cost of the God Table (and how AI can make it worse)
Every production system has at least one “god table”. It’s usually the biggest, busiest table in the database — the one that knows about everything: orders, users, products, payments, delivery, invoices, discounts, status, audit trails… and inevitably, a lot more. It becomes a magnet for behaviour. And that’s where things get dangerous...
Read more
March 3, 2026
The Most Underrated Scaling Pattern in Startups & Scale-Ups
One of the most underutilised design patterns in startups and scale-ups is the adapter pattern. I used it today and it reminded me why it’s so valuable: it’s not just “clean code”. It’s commercial leverage. How lock-in happens (and why it hurts later) Early on, startups get generous free tiers and credits: AWS/GCP, map providers, email...
Read more
March 2, 2026
My dark factory software workflow now has a UI.
I’m trialling a simple flow where technical and non-technical teammates can request a factory job (e.g. “add an API endpoint”), and the factory turns that intent into: • Scoped prompt spec (goal + acceptance criteria) • Run log (commands/tests) • CI as the gate (green or it didn’t happen) • PR with evidence attached • Deployed to staging env for...
Read more
February 28, 2026
Dark Factory: “No Humans Should Write Code” (and what it taught me)
A few weeks ago I read Simon Willison’s write-up on StrongDM’s “Software Factory” approach. The line that stuck with me wasn’t even about agents or tooling — it was the mantra: • Code must not be written by humans • Code must not be reviewed by humans I found it fascinating… and honestly a bit uncomfortable. Not because it’s wrong, but...
Read more
February 27, 2026
Evidence > vibes.
This is what I’m aiming for with AI-assisted dev: every job produces a small PR + CI green + a prompt spec + a run log (commands/tests) so you can audit what happened. Diff is output. Evidence attached.
Read more
February 27, 2026
I’ve stopped “prompting an AI” and started running a software factory.
Every PR now includes: • the prompt spec (goal + acceptance criteria) • the run log (what the agent did + command/test outputs) • the checks as the gate (CI green or it isn’t done) It turns AI work from vibes into something you can audit, replay, and trust: instructions in → software out → evidence attached. Screenshot is a real example: pro...
Read more
February 26, 2026
I Thought AI Would Take the Fun Out of Engineering — It Didn’t
I was worried AI would take the fun out of engineering. My early experiences felt a bit like “autocomplete on steroids” — faster, but less satisfying. It’s gone the other way. I feel energised. The work is more playful now: I give it context, tighten constraints, adjust guardrails, and see what it comes back with — then steer, test, an...
Read more
February 19, 2026
AI Top Tip: No Green, No Opinion.
When AI generates code, it can look convincing even when it’s wrong. So I don’t assess the implementation first. I run the specs first. In traditional development, you wouldn’t submit (or even entertain) a change that fails CI. Why would you treat AI-generated code differently? If tests are failing, I ask the model to fix them until ev...
Read more
February 15, 2026
Your Next Customer Might Be an Agent: How to Prepare Without Panic
There’s a lot of fear right now. People can feel the ground moving under how we work, how we buy things, and how businesses acquire customers. The questions are reasonable: • What does this mean for my job? • What does this mean for my business? • Are websites and apps about to become irrelevant? This is my attempt to add some reality:...
Read more
February 14, 2026
Planner/Worker: The Two-Thread Workflow That Keeps AI Useful
One of the easiest ways to waste time with AI is to do everything in a single long thread. It starts well, then the context grows, quality drifts, and you end up with a messy mix of strategy, partial implementations, and conflicting decisions. You get “context rot”, but you also get something worse: you lose your own clarity. The appro...
Read more
February 14, 2026
Engineering Interviews Are Changing in the AI Era
Engineering interviews have always been a proxy. We can’t fully simulate real work in an hour, so we use exercises and questions to approximate signal: • how someone thinks • how they trade off speed vs quality • how they handle uncertainty • whether they validate and de-risk • whether they communicate clearly The final code someone pr...
Read more
February 14, 2026
PR Review Is Changing in the AI Era
PR review was never really about the code. Yes, we look at the diff. But the real thing we’re trying to assess is judgment: • did the engineer understand the problem? • did they make sensible trade-offs? • did they spot the risks? • did they validate behaviour? • did they leave the codebase better than they found it? The uncomfortable ...
Read more
See more posts »