
Defensive Code from Agentic Tools

Why some coding agents add extra checks, where that helps, and where it slows reviews.

Rogier Muller · March 24, 2026 · 5 min read

Some coding agents lean toward defensive code. That usually means more checks, guards, and fallback paths than a human might write on the first pass. The source signal here points to one specific model/tool setup, but the broader pattern matters more than the brand. If an agent protects against missing inputs, partial states, or unexpected failures, that changes how teams should judge it.

The useful question is not whether defensive code is “better.” It is whether the code fits the job. In agentic workflows, the agent is often asked to move quickly across unfamiliar files, make local changes, and stop before it has full context. In that setting, defensive code can reduce obvious breakage. It can also hide uncertainty behind layers of conditionals and defaults.

Where defensive code helps

Defensive code is useful when the agent is working with incomplete context. That is common in coding tools that inspect a small slice of a repository, work from a short prompt, or make changes without a full test run. A few extra checks can prevent easy failures:

  • null or missing values in loosely typed code paths
  • partial API responses
  • file or config state that may not exist yet
  • retries around flaky external calls
  • early exits when preconditions are not met
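The checks above tend to cluster in one place. As a sketch, a config loader with the defensive pattern an agent often produces might look like this (the function and keys are illustrative, not from any specific tool's output):

```python
import json
from pathlib import Path

def load_settings(path: str) -> dict:
    """Load settings with the guard-heavy style an agent often produces."""
    config_file = Path(path)
    # Guard: file or config state that may not exist yet
    if not config_file.exists():
        return {}
    raw = config_file.read_text()
    # Guard: early exit when a precondition (non-empty file) is not met
    if not raw.strip():
        return {}
    data = json.loads(raw)
    # Guard: missing keys in a loosely typed payload fall back to defaults
    return {
        "timeout": data.get("timeout", 30),
        "retries": data.get("retries", 3),
    }
```

Each guard here maps to a bullet above; the open question for review is whether each one matches a failure your system can actually hit.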

For teams, this can be a net win when the task is operational rather than architectural. Examples include small refactors, glue code, migration helpers, and scripts that need to survive messy inputs. In those cases, the agent’s caution may save review time because the first draft is less brittle.

Where it becomes a problem

The same tendency can create new costs. Defensive code expands surface area. More branches mean more places to reason about. More defaults mean more hidden behavior. More guards can also make it harder to see the main path.

That matters in three common situations. First, when the code is already well constrained and the extra checks are redundant. Second, when the agent adds fallback behavior that masks a real bug. Third, when the codebase values clarity over resilience, such as in internal libraries where failures should be loud and immediate.
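The second situation, a fallback that masks a real bug, is easiest to see side by side. A minimal sketch (the field names are hypothetical):

```python
def total_price(items):
    # Defensive version: silently treats malformed entries as zero.
    # If a producer starts emitting {"amount": ...} instead of {"price": ...},
    # this returns a plausible-looking but wrong total instead of failing.
    return sum(item.get("price", 0) for item in items)

def total_price_strict(items):
    # Strict version: schema drift fails loudly at the call site.
    return sum(item["price"] for item in items)
```

The defensive version never crashes, which is exactly the problem in an internal library: the bug survives review because nothing visibly breaks.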

There is also a maintenance cost. Defensive code can age poorly if the original assumptions were wrong. A guard added for one edge case may become dead weight after the surrounding system changes. Teams then inherit code that looks careful but is no longer doing useful work.

How to evaluate it in practice

The best way to judge this behavior is to test it on your own tasks, not on benchmark claims. Use a small set of representative changes and compare outputs across tools or modes. Look at the shape of the code, not just whether it passes once.

A practical review checklist:

  • Does the agent add checks that match real failure modes in your codebase?
  • Does it avoid fallback paths that change behavior silently?
  • Does it preserve the main control flow, or bury it under conditionals?
  • Are the added guards covered by tests, or just implied by the prompt?
  • Would a human reviewer keep the extra code if they had written it themselves?

If the answer is “no” to most of those, the code may be defensive in the wrong way. It is safer only on the surface.

Implementation steps for teams

Start by defining where defensive code is welcome. Make that explicit in your team norms. For example, allow it in boundary adapters, ingestion code, and scripts that touch external systems. Be stricter in core domain logic and shared libraries.

Then add a review rule for agent-generated changes: every guard should map to a named risk. If the risk cannot be stated, the guard is probably speculative. This simple filter catches a lot of unnecessary complexity.
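One lightweight way to enforce the rule is a comment convention that names the risk next to the guard. A sketch, with a hypothetical webhook handler and a tag format of your own choosing:

```python
def parse_webhook(payload: dict) -> dict:
    # RISK: the provider omits "user" on deleted-account events,
    # so this guard maps to a named, observed failure mode.
    if "user" not in payload:
        return {"event": payload.get("type", "unknown"), "user": None}
    return {"event": payload["type"], "user": payload["user"]["id"]}
```

A guard whose comment cannot name a risk like this is the speculative kind the filter is meant to catch.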

Next, pair the agent with tests that reflect the intended failure mode. If the agent adds a fallback for missing data, write a test that proves the fallback is needed. If it adds a retry, make sure the retry is bounded and observable. Without that, defensive code can become unverified folklore.
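For the retry case, "bounded and observable" can be as simple as a capped loop that logs each attempt and gives up loudly. A sketch, not a library recommendation:

```python
import logging
import time

logger = logging.getLogger(__name__)

def fetch_with_retry(fetch, max_attempts=3, base_delay=0.1):
    """Bounded, observable retry: capped attempts, logged failures, final raise."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except OSError as exc:  # retry only transient, expected errors
            logger.warning("attempt %d/%d failed: %s", attempt, max_attempts, exc)
            if attempt == max_attempts:
                raise  # bounded: stop retrying and surface the failure
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```

The test that proves this code is needed is the one that fails without it: a fake dependency that errors twice and succeeds on the third call.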

Finally, compare the agent’s output against a human baseline on the same task. You are not looking for identical style. You are looking for whether the agent consistently over-guards, under-guards, or lands in the middle. That pattern is more useful than a single good or bad example.

Tradeoffs to accept

There is no free version of defensive coding. More resilience usually means more code. More code usually means more review time. And more review time can erase some of the speed gains that agentic tools are supposed to provide.

The tradeoff is acceptable when failure is expensive or hard to detect. It is less acceptable when the code is simple, local, and easy to test. In those cases, a direct implementation with clear tests is often better than a cautious one with many branches.

The source signal suggests one model may lean toward this style more than others. Treat that as a prompt to measure, not a conclusion to adopt. Different tools may vary by model, task, and prompt shape. The only durable answer is to inspect the code they actually produce in your environment.

A small methodology note

When you evaluate this kind of behavior, the Review step matters most. A quick pass through our methodology is enough to turn a vague impression into a repeatable check: what changed, why it changed, and whether the added safety is real.

Bottom line

Defensive code from an agent is neither a virtue nor a flaw by itself. It is a signal about how the tool handles uncertainty. Use it where uncertainty is real. Push back where it adds noise. The goal is not the safest-looking diff. It is the smallest change that still survives the conditions your system actually faces.
