I’ve stopped “prompting an AI” and started running a software factory.
Every PR now includes:
the prompt spec (goal + acceptance criteria) the run log (what the agent did + command/test outputs) the checks as the gate (CI green or it isn’t done)
It turns AI work from vibes into something you can audit, replay, and trust: instructions in → software out → evidence attached.
Screenshot is a real example: prompt + run log sitting right next to the diff.