Cursor Composer layers in agentic coding

A field guide to Cursor Composer layers in agentic coding: decision stubs, scope ledgers, and precedence files that keep work reviewable.

Image: Schloß am Strom (Castle by the River), landscape painting by Karl Friedrich Schinkel (1820)
Rogier Muller | April 2, 2026 | 5 min read

The Composer run looked finished. On a live workshop call the diff read confident, the summary read calm, and the reviewer who had to merge it before the demo could not reconstruct a single decision underneath. That fragile shortcut is what Cursor Composer makes cheap: output that ships fast while the decision trail stays fuzzy. Cursor Composer is the agent pane in Cursor that drafts and applies multi-file changes in one loop. The loop is the easy part. The layers around it decide whether the work survives review.

Confident output, fuzzy trail

Counter-thesis: trust does not scale with model quality; it scales with receipts, and receipts that stay in chat do not count.

The wrong path: we believed wider tool access would unblock our seniors fastest. We tried it while connectors multiplied faster than the ownership map, and the expensive bug turned out to be permission drift that nobody had signed off on.

Diagnosis: the Swiss cheese model explains the trap. Each layer of an agent stack (the prompt, the rules file, the hooks, the review) has holes; incidents happen when the holes line up, and an unwritten decision trail lines them up by default.

Thesis: every layer of the agent stack must leave a receipt the next layer can check.

Four layers, four receipts

A Composer workflow stays defensible when each layer writes down what the next layer needs.

Review queue theater. CI is green and reviewers still ask why this approach, with no written answer. People optimize for passing checks, so the reasons behind a change go unrecorded: Chesterton's fence without labels.

Named fix: Decision stub. The PR template forces three lines: constraints considered, rejected alternatives, verification proof. Debate moves from vibes to explicit tradeoffs.
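A decision stub is easy to check mechanically. The sketch below is one possible check, not a prescribed tool: it assumes the PR template writes each of the three lines as `Label: text`, and the label strings and the function name `missing_stub_lines` are hypothetical.

```python
import re

# Hypothetical labels: the three stub lines are named above, but their exact
# wording in a real PR template is an assumption.
REQUIRED_LINES = (
    "Constraints considered",
    "Rejected alternatives",
    "Verification proof",
)


def missing_stub_lines(pr_body: str) -> list[str]:
    """Return the stub labels that are absent or left empty in a PR body."""
    missing = []
    for label in REQUIRED_LINES:
        # Require "Label: <non-empty text>" on a single line, case-insensitively.
        pattern = rf"^\s*{re.escape(label)}\s*:[ \t]*(\S.*)$"
        if not re.search(pattern, pr_body, flags=re.IGNORECASE | re.MULTILINE):
            missing.append(label)
    return missing
```

A CI step could fail the PR whenever the returned list is non-empty. That keeps the stub a gate that is actually enforced, so reviewers do not have to ask for it.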

Cursor scope fog. Teams shipping Cursor agent work weekly watch .mdc language sound precise until reviewers argue about what it meant. Rules compete with chat memory.

Named fix: Scope ledger. The parent chat carries a five-line ledger: goal, allowed paths, forbidden paths, verification command, merge owner. Review shifts from debating prompts to checking ledgers against diffs.
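Checking a ledger against a diff can be sketched in a few lines. This example assumes the allowed and forbidden paths are written as shell-style glob patterns; the `ScopeLedger` type, its field names, and `out_of_scope` are illustrative and do not come from Cursor itself.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ScopeLedger:
    """The five ledger lines; field names are assumptions for illustration."""
    goal: str
    allowed_paths: tuple[str, ...]
    forbidden_paths: tuple[str, ...]
    verification_command: str
    merge_owner: str


def out_of_scope(ledger: ScopeLedger, changed_files: list[str]) -> list[str]:
    """Return changed files that hit a forbidden pattern or miss every allowed one."""
    violations = []
    for path in changed_files:
        forbidden = any(fnmatchcase(path, pat) for pat in ledger.forbidden_paths)
        allowed = any(fnmatchcase(path, pat) for pat in ledger.allowed_paths)
        # Forbidden always wins, even when an allowed pattern also matches.
        if forbidden or not allowed:
            violations.append(path)
    return violations
```

Feeding it the file list from the diff gives the reviewer a concrete list of paths to question, so the prompt itself no longer needs debating.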

Claude permission creep. On shared laptops, Claude Code bash approvals become muscle memory. Hooks help, but permission literacy still needs file-backed precedence.

Named fix: CLAUDE.md supremacy clause. The top of CLAUDE.md states which hooks win, which folders require human eyes, and where temporary overrides live. Sessions stop inventing policy mid-run.

Codex replay gaps. Teams relying on Codex CLI merge green runs whose transcripts never reached review.

Named fix: Replay sandwich. AGENTS.md mandates an intent line, the command transcript, and a diff summary before the PR. Review becomes reproducible without standing behind someone's terminal.
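The sandwich can be verified before a PR is opened. The sketch below assumes the replay note marks its three parts with the literal prefixes `Intent:`, `Transcript:`, and `Diff summary:`; those markers and the function name `replay_gaps` are hypothetical, not something Codex CLI or AGENTS.md defines.

```python
# Hypothetical section markers, listed in the order the sandwich requires.
SECTIONS = ("Intent:", "Transcript:", "Diff summary:")


def replay_gaps(note: str) -> list[str]:
    """Return problems with a replay note: missing sections or wrong order."""
    problems = []
    positions = []
    for marker in SECTIONS:
        idx = note.find(marker)
        if idx == -1:
            problems.append(f"missing {marker}")
        else:
            positions.append(idx)
    # The sections that are present must appear in the required order.
    if positions != sorted(positions):
        problems.append("sections out of order")
    return problems
```

An empty result means the note carries all three layers in order. It says nothing about whether the transcript is complete, which remains a reviewer's call.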

---
description: Delegation boundary snapshot (adapt globs to your repo)
globs:
  - "**/*"
alwaysApply: false
---

- Cursor: keep scopes explicit in `.mdc`; forbid undeclared MCP domains.
- Claude Code: cite `CLAUDE.md` precedence before expanding bash scope.
- Codex: ensure `AGENTS.md` carries replay-friendly verification notes for CLI runs.

In our methodology this belongs in Document before it ever reaches Review: the handoff has to survive without the original operator in the room. The cluster's other patterns live on the agentic coding governance page, and an AI coding workflow that holds up under audit shows the same receipts running end to end.

The merge gate

A layered workflow is working when these answers come from files, not from memory.

Gate | Question
Connector truth | Which MCP servers fired, and were they expected?
Reviewer path | Can someone unfamiliar trace intent without chat replay?
Risk routing | Were red folders touched, and who approved?
Replay proof | Which commands prove regression guards?
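The four gate questions can be turned into a small check over file-backed evidence. This is a minimal sketch under stated assumptions: the `GateEvidence` structure and its field names are invented for illustration, and how each field gets populated (session logs, PR body, approval records) is left to the team.

```python
from dataclasses import dataclass


@dataclass
class GateEvidence:
    """File-backed answers to the four gate questions (hypothetical shape)."""
    fired_mcp_servers: set[str]
    expected_mcp_servers: set[str]
    intent_doc: str                       # written intent, e.g. the decision stub
    red_folder_approvals: dict[str, str]  # touched red folder -> approver, "" if none
    replay_commands: list[str]            # commands that prove regression guards


def failing_gates(ev: GateEvidence) -> list[str]:
    """Return the names of gates whose question has no acceptable answer."""
    failures = []
    if ev.fired_mcp_servers - ev.expected_mcp_servers:
        failures.append("Connector truth")
    if not ev.intent_doc.strip():
        failures.append("Reviewer path")
    if any(not approver.strip() for approver in ev.red_folder_approvals.values()):
        failures.append("Risk routing")
    if not ev.replay_commands:
        failures.append("Replay proof")
    return failures
```

The check only confirms that an answer exists in a file. Judging whether the answer is a good one stays with the reviewer.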

Synthesis: reliability is layered receipts: intent, scope, verification. Skip a layer and the stack tips.

Verification stays a ritual, not a mood. None of this replaces architecture judgement; agents accelerate execution, not ownership.

Best ways to use this research

  • Best for: Cursor teams deciding which rule, subagent, skill, or MCP boundary to standardize next around Cursor Composer work.
  • Best first artifact: turn the scope ledger into a .mdc rule, AGENTS.md note, subagent receipt, or review checklist before the next Composer run.
  • Best comparison angle: compare the layered workflow against the current Cursor review path, connector scope, and team rule file; keep the path that leaves the shortest auditable trail.

Common questions

How do we keep Cursor Composer output reviewable? Give every layer a receipt: a five-line scope ledger in the chat, a decision stub in the PR, precedence rules in the repo, and a verification command whose output is pasted or linked. A reviewer should trace intent without replaying the session.

What is a decision stub in a PR? Three forced lines in the PR template: constraints considered, rejected alternatives, and verification proof. It exists because green CI answers what changed but not why this approach. The stub moves that debate from vibes to explicit tradeoffs a reviewer can check in seconds.

Why does agent output sound confident but fail review? Because confidence is a writing style and review needs evidence. Composer output reads calm whether or not the decision trail exists. When the trail stays in chat, the holes in prompt, rules, and review line up, and the merge becomes a guess.

Next step

If your Composer workflow ships faster than your review can explain it, tell us what the trail looks like today and we will help you layer it: contact.
