An AI coding workflow that holds up under audit

Crunch week is when the summaries shrink to bullet vibes. The agents kept shipping, the diffs kept landing, and the team discovered that the bottleneck had quietly moved from typing speed to traceability, right when nobody had time to rebuild it. An AI coding workflow that holds up is built for that week, not the calm ones. An AI coding workflow is the agreed loop (brief, edit, verify, receipt, review) that agent-assisted changes follow from prompt to merge. The receipts are what make it auditable later.

The week the summaries shrank

Counter-thesis: trust does not scale when receipts stay in chat, and a faster agent only moves the queue to the part of the system that cannot read chat.

The wrong path: We believed smaller tasks guaranteed safer autonomy. We watched that assumption fail during crunch weeks, when summaries shrank to bullet vibes and the rules files quietly contradicted the skill the agent had just activated.

Diagnosis: Chesterton's fence, unlabeled. Agent diffs remove and rebuild fences constantly, and a reviewer who cannot see why a fence moved has two bad options: block everything or trust everything.

Thesis: traceability is the real throughput lever.

The receipts the audit will ask for

Ritchie-style pragmatism applies: make traceability easy before you make generation easy.

Recursive handoff blur. Chained agents return summaries that omit child-owned paths: the telephone game, with commit access.

Named fix: Child receipt block. Every child returns the paths it touched, the commands it ran, and the tests proving regression guards. Parents stop green-lighting mystery diffs.
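A child receipt block can be checked mechanically before a parent accepts the work. The sketch below is one possible shape, not a prescribed format: the `ChildReceipt` class, its field names, and the `receipt_gaps` helper are all hypothetical names chosen for illustration. It assumes the parent already has the list of paths in the child's diff.

```python
from dataclasses import dataclass, field


@dataclass
class ChildReceipt:
    """What one child agent reports back to its parent (hypothetical shape)."""
    agent: str
    paths_touched: list[str] = field(default_factory=list)
    commands_run: list[str] = field(default_factory=list)
    regression_tests: list[str] = field(default_factory=list)


def receipt_gaps(receipt: ChildReceipt, diff_paths: list[str]) -> list[str]:
    """Return the reasons a parent should not green-light this child's diff."""
    gaps = []
    if not receipt.commands_run:
        gaps.append("no commands recorded")
    if not receipt.regression_tests:
        gaps.append("no regression tests named")
    # Paths that appear in the diff but not in the receipt are the mystery diff.
    for path in sorted(set(diff_paths) - set(receipt.paths_touched)):
        gaps.append(f"undeclared path: {path}")
    return gaps
```

An empty list means the receipt accounts for everything in the diff; anything else is a reason to send the work back to the child rather than summarize over it.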

Review queue theater. CI is green and reviewers still ask why this approach, with no written answer anywhere.

Named fix: Decision stub. The PR template forces three lines: constraints considered, rejected alternatives, verification proof. The fence gets a label before anyone moves it.
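If the decision stub lives in the PR body, a small check can refuse a PR whose three lines are missing or blank. This is a minimal sketch under assumptions: the field labels and the `Label: text` layout are hypothetical, so adapt them to whatever your PR template actually uses.

```python
import re

# Hypothetical labels for the three decision-stub lines.
REQUIRED_FIELDS = (
    "Constraints considered",
    "Rejected alternatives",
    "Verification proof",
)


def missing_stub_fields(pr_body: str) -> list[str]:
    """Return the decision-stub fields that are absent or left blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        # A field counts only if text follows the colon on the same line.
        pattern = rf"^{re.escape(name)}:[ \t]*\S"
        if not re.search(pattern, pr_body, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(name)
    return missing
```

The check only proves the lines were filled in, not that the reasoning is sound; judging the reasoning remains the reviewer's job.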

Cursor scope fog. Teams shipping Cursor agent work weekly watch .mdc language sound precise until reviewers argue about what it meant. Rules compete with chat memory.

Named fix: Scope ledger. The parent chat carries a five-line ledger: goal, allowed paths, forbidden paths, verification command, merge owner. Review checks ledgers against diffs instead of debating prompts.
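Checking a ledger against a diff reduces to matching changed paths against the allowed and forbidden patterns. The sketch below assumes the ledger's paths are written as shell-style globs and that the list of changed paths comes from your version control tool. `ScopeLedger` and `out_of_scope` are hypothetical names, and the rule that forbidden patterns win over allowed ones is an assumption, not something the ledger format dictates.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class ScopeLedger:
    """The five ledger lines carried in the parent chat."""
    goal: str
    allowed_paths: list[str]
    forbidden_paths: list[str]
    verification_command: str
    merge_owner: str


def out_of_scope(ledger: ScopeLedger, changed_paths: list[str]) -> list[str]:
    """Return changed paths that are forbidden or were never declared as allowed."""
    violations = []
    for path in changed_paths:
        forbidden = any(fnmatchcase(path, pat) for pat in ledger.forbidden_paths)
        allowed = any(fnmatchcase(path, pat) for pat in ledger.allowed_paths)
        # Assumption: a forbidden match overrides an allowed match.
        if forbidden or not allowed:
            violations.append(path)
    return violations
```

Note that `fnmatchcase` lets `*` match across `/`, so `src/billing/*` also covers nested directories; use a stricter matcher if your team needs per-directory precision.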

Claude permission creep. On shared laptops, Claude Code bash approvals become muscle memory. Knowing what may be approved has to come from a precedence order written in a file, not from habit.

Named fix: CLAUDE.md supremacy clause. The top of CLAUDE.md states which hooks win, which folders require human eyes, and where temporary overrides live. Sessions stop inventing policy mid-run.

---
description: Delegation boundary snapshot (adapt globs to your repo)
globs:
  - "**/*"
alwaysApply: false
---

- Cursor: keep scopes explicit in `.mdc`; forbid undeclared MCP domains.
- Claude Code: cite `CLAUDE.md` precedence before expanding bash scope.
- Codex: ensure `AGENTS.md` carries replay-friendly verification notes for CLI runs.

This routes through our methodology at the Review gate: parallel agent output must be inspectable without replaying sessions. The companion patterns live on the agentic coding governance page, and "Specs and tests as the stable stack" covers the contract these receipts are checked against.

Audit questions worth automating

An auditable workflow answers these from the PR body alone.

Gate	Question
Replay proof	Which commands prove regression guards?
Receipt match	Does the PR body list scopes + verification transcript?
Rules precedence	Which `.mdc`, `SKILL.md`, or `CLAUDE.md` governed behavior?
Connector truth	Which MCP servers fired, and were they expected?
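The four gates can be run as a table of checks over the PR body. The sketch below is illustrative only: the section labels (`Verification command`, `Scopes`, `Verification transcript`, `Governing rules`, `MCP servers`) are hypothetical conventions for a PR template, and the expected server set is assumed to come from team configuration.

```python
import re
from typing import Callable

Check = Callable[[str], bool]


def has_section(label: str) -> Check:
    """Build a check that passes when `Label: text` appears on one line."""
    pattern = re.compile(rf"^{re.escape(label)}:[ \t]*\S", re.IGNORECASE | re.MULTILINE)
    return lambda body: bool(pattern.search(body))


def connectors_expected(expected: set[str]) -> Check:
    """Build a check that passes when every listed MCP server was expected."""
    def check(body: str) -> bool:
        match = re.search(r"^MCP servers:[ \t]*(.*)$", body, re.IGNORECASE | re.MULTILINE)
        if not match:
            return False  # no connector line at all: the question is unanswered
        fired = {name.strip() for name in match.group(1).split(",") if name.strip()}
        return fired <= expected
    return check


def failed_gates(pr_body: str, expected_servers: set[str]) -> list[str]:
    """Return the names of the audit gates the PR body does not answer."""
    scopes = has_section("Scopes")
    transcript = has_section("Verification transcript")
    gates: dict[str, Check] = {
        "Replay proof": has_section("Verification command"),
        "Receipt match": lambda body: scopes(body) and transcript(body),
        "Rules precedence": has_section("Governing rules"),
        "Connector truth": connectors_expected(expected_servers),
    }
    return [name for name, check in gates.items() if not check(pr_body)]
```

Run in CI, a non-empty result can block the merge and name the unanswered question, which keeps the audit to the PR body alone.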

Synthesis: agents are relief crews; the blueprint still belongs to the humans standing outside the trench.

If your repo cannot state boundaries plainly, agents will guess, and guessing is the one behavior that gets worse with scale.

Best ways to use this research

Best for: Cursor teams deciding which rule, subagent, skill, or MCP boundary to standardize next in their AI coding workflow.
Best first artifact: turn the child receipt block into a .mdc rule, AGENTS.md note, subagent receipt, or review checklist before the next automated run.
Best comparison angle: compare the receipt-first loop against the current Cursor review path, connector scope, and team rule file; keep the path that leaves the shortest auditable trail.

Common questions

What does a good AI coding workflow look like? A loop where every step leaves an artifact: a scope ledger before the run, receipts and transcripts during it, a decision stub in the PR, and precedence files in the repo. The test is whether a reviewer can defend the merge without replaying the chat.

How do we audit agent work after the fact? From the receipts. Child receipt blocks list paths, commands, and regression tests; decision stubs preserve constraints and rejected alternatives; the scope ledger shows what was allowed. If those artifacts are missing, the audit becomes archaeology, and archaeology during crunch week does not happen.

How do we make AI-written code easier to review? Shrink what the reviewer must reconstruct. Ship the verification command and its output with the diff, keep scopes in the PR body, and label every removed fence with the reason. Reviewers move fast when the narrative arrives with the change.

Next step

The white paper turns this loop into a checklist your next audit can run against: read the white paper.
