While updating Superloop today, I kept coming back to Codex `/goal`. My first question was too shallow: does it automatically split work into sub-agents? The better question was what makes a long-running task survive outside the chat transcript.
A chat transcript is brittle. Once the context gets compressed, the model changes, or a few hours pass, the next agent may remember that it should continue, but not which gate it is continuing through. It rereads a pile of history and guesses. One lucky guess is fine. A long task built on guessing drifts.
Sub-agents do not fix that by themselves. They are useful for side work: checking an API, running a test slice, reviewing an isolated module. But context consistency comes from a shared ledger: the mission, the active round, the open gaps, and the evidence required before anyone can call the mission complete.
So this Superloop update did not start by adding more agents. I added `context` and `next-prompt` first. They turn the mission, budget, active round, gap ledger, and completion audit into a handoff prompt. A new session can pick that up without pretending it remembers the whole thread.
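A rough sketch of the idea, not Superloop's actual code: the `Ledger` class, its field names, and `render_handoff_prompt` are all names I am making up here for illustration. The point is only that the prompt is rendered from stored state rather than from the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Hypothetical shared ledger; field names mirror the items listed above."""
    mission: str
    budget: str
    active_round: str
    gaps: list[str] = field(default_factory=list)
    completion_audit: list[str] = field(default_factory=list)


def render_handoff_prompt(ledger: Ledger) -> str:
    """Build a plain-text handoff prompt from the ledger alone."""
    lines = [
        f"Mission: {ledger.mission}",
        f"Budget: {ledger.budget}",
        f"Active round: {ledger.active_round}",
        "Open gaps:",
    ]
    # Fall back to an explicit marker so an empty section is never ambiguous.
    lines += [f"- {gap}" for gap in ledger.gaps] or ["- (none recorded)"]
    lines.append("Completion audit (evidence still required):")
    lines += [f"- {item}" for item in ledger.completion_audit] or ["- (none recorded)"]
    return "\n".join(lines)
```

Because the function reads nothing but the ledger, a fresh session gets the same prompt no matter how much of the original thread has been compressed away.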
The second change is `start-round`. `record` used to be mostly retrospective: do the work, then write it down. Now each round starts by recording the hypothesis, change, round gate, and verification plan. If the session cuts off, the next pass can see exactly which gate was in front of the work.
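To make the "write it down before the work" idea concrete, here is a minimal sketch. The `RoundStart` record and the `start_round` function are hypothetical stand-ins, and the JSON line format is my assumption, not how Superloop stores rounds. The four fields are the ones named above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RoundStart:
    """Hypothetical record written before any editing begins."""
    hypothesis: str
    change: str
    round_gate: str
    verification_plan: str


def start_round(entry: RoundStart) -> str:
    """Refuse to open a round with blank fields; return one JSON log line."""
    fields = asdict(entry)
    missing = [name for name, value in fields.items() if not value.strip()]
    if missing:
        raise ValueError(f"cannot start round, missing: {', '.join(missing)}")
    return json.dumps({"event": "start-round", **fields}, sort_keys=True)
```

If the session dies right after this line is written, the next pass still knows the gate and the verification plan, even though no work was recorded.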
The third change is stricter: `record --mission-complete` now requires completion evidence. A plain claim is not enough. That blocks the common agent shortcut where tests are not run, production is not checked, the stop rule is not reread, and the task still ends with a confident "done".
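The gate itself can be very small. In this sketch the three evidence keys are my own labels for the three shortcuts above, and `can_mark_complete` is a hypothetical helper, not the real `record --mission-complete` implementation.

```python
# Assumed evidence keys, one per shortcut named above; the real tool may differ.
REQUIRED_EVIDENCE = ("tests_run", "production_checked", "stop_rule_reread")


def can_mark_complete(evidence: dict[str, str]) -> tuple[bool, list[str]]:
    """Return (ok, missing): ok is True only if every key has non-blank evidence."""
    missing = [
        key for key in REQUIRED_EVIDENCE
        if not evidence.get(key, "").strip()
    ]
    return (not missing, missing)
```

A claim with no evidence fails with all three keys listed as missing, which is a more useful answer than a bare refusal.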
My new Superloop flow is: run `resume`, then `context` or `next-prompt`; call `start-round` before editing; execute, verify, and commit; then `record`; finally inspect `timeline` to see whether the round actually closed.
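The last step, checking whether a round actually closed, boils down to pairing each `start-round` with a later `record`. A sketch under that assumption follows; the event tuples and `open_rounds` are invented for illustration and do not reflect Superloop's real timeline format.

```python
def open_rounds(events: list[tuple[str, str]]) -> list[str]:
    """Given ordered (command, round_id) pairs, return rounds started but never recorded."""
    opened: list[str] = []
    for command, round_id in events:
        if command == "start-round" and round_id not in opened:
            opened.append(round_id)
        elif command == "record" and round_id in opened:
            opened.remove(round_id)
    return opened
```

An empty result means every round that was opened also got recorded. Anything left in the list is a round that was cut off in front of its gate.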
That is why this belongs on YoloLab. An agent workflow should not just say which tool I used. It should leave evidence: what this round was trying to solve, what changed, where it was verified, and why the next round should continue or stop. Without that, a long agent task is just a longer chat.