Align AI Code Review

A team convention for safer AI-assisted review with Cursor rules, AGENTS.md boundaries, MCP checks, and one checklist.

Sunset at Villerville (Ondergaande zon bij Villerville), landscape painting by Charles-François Daubigny (1874).
Rogier Muller · June 29, 2026 · 9 min read

When an engineering team is split on AI code assistants, do not start by arguing about the assistant. Start by agreeing on what must be reviewable: the prompt context, the changed code, the tools touched, and the proof that the change works.

A team convention is the smallest shared rule set that makes AI-assisted work predictable across people, editors, and agents. For Cursor, Anysphere's AI code editor, that usually means repo rules, an AGENTS.md boundary, MCP limits, and a checklist reviewers can actually use.

AI coding training for teams works best when it teaches this shared workflow before it teaches clever prompting.

Put the review boundary in the repo

Write the rule where the work happens. A reviewer should not need to remember a Slack thread, a workshop slide, or a manager preference to know whether AI-assisted code is acceptable.

For a Cursor team, start with a small .cursor/rules/ai-review.mdc rule and a matching AGENTS.md. The Cursor rule tells the assistant how to behave in this repo. The AGENTS.md tells humans and coding agents what boundaries apply before code is merged.

The trap is making the rule moral instead of operational. “Use AI responsibly” is too vague to review. “Do not let an agent change auth, billing, or migrations without a human design note” gives reviewers something to enforce.

This is the core of AI coding governance for teams: fewer abstract policies, more repo-local defaults.
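One way to see why the operational rule is reviewable is that it can be written as a check. The sketch below is an illustration only: the directory names in `SENSITIVE_DIRS` and the `Design note:` marker are assumptions, not part of Cursor or AGENTS.md, and a real repo would substitute its own paths and wording.

```python
from pathlib import PurePosixPath

# Hypothetical directory names; replace with the repo's real sensitive areas.
SENSITIVE_DIRS = {"auth", "billing", "migrations"}

# Hypothetical marker an author adds to the PR description.
DESIGN_NOTE_MARKER = "Design note:"


def sensitive_files(changed_files):
    """Return changed files that live under any sensitive directory."""
    return [
        path
        for path in changed_files
        # parts[:-1] drops the file name so only directories are compared.
        if SENSITIVE_DIRS & set(PurePosixPath(path).parts[:-1])
    ]


def needs_design_note(changed_files, pr_body):
    """True when sensitive files changed and the PR body has no design note."""
    return bool(sensitive_files(changed_files)) and DESIGN_NOTE_MARKER not in pr_body
```

A reviewer, or a CI step, could call `needs_design_note(["src/billing/invoice.py"], pr_body)` and hold the PR when it returns `True`. "Use AI responsibly" has no equivalent function.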

Treat MCP as a tool boundary, not magic

MCP is a protocol that lets coding agents connect to external tools and data sources through a common interface. That can mean GitHub, Slack, docs, databases, issue trackers, design tools, or private knowledge stores.

The governance question is simple: what can the agent read, what can it write, and what must be tested without the model in the loop?

A useful example is Ocarina, a small Show HN project from the msradam/ocarina GitHub repository that automates and tests MCP servers from YAML without relying on an LLM. The important idea is not the specific tool. It is the habit: integration behavior should be testable as configuration, not trusted because the agent sounded confident.
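The habit can be sketched without any particular tool. The example below is not Ocarina's format or API; it uses plain Python data in place of YAML, and `lookup_issue` is a hypothetical stand-in for a tool an MCP server might expose. The point is only that fixed inputs and expected outputs can be checked with no model involved.

```python
# Hypothetical stand-in for a tool exposed by an MCP server.
def lookup_issue(arguments):
    issues = {"42": {"title": "Fix login redirect", "state": "open"}}
    issue = issues.get(arguments["id"])
    if issue is None:
        return {"error": "not found"}
    return issue


TOOLS = {"lookup_issue": lookup_issue}

# Test cases as plain data: tool name, fixed arguments, expected result.
CASES = [
    {
        "tool": "lookup_issue",
        "arguments": {"id": "42"},
        "expect": {"title": "Fix login redirect", "state": "open"},
    },
    {
        "tool": "lookup_issue",
        "arguments": {"id": "999"},
        "expect": {"error": "not found"},
    },
]


def run_cases(tools, cases):
    """Call each tool with fixed arguments and collect any mismatches."""
    failures = []
    for case in cases:
        actual = tools[case["tool"]](case["arguments"])
        if actual != case["expect"]:
            failures.append((case["tool"], case["arguments"], actual))
    return failures
```

An empty list from `run_cases(TOOLS, CASES)` means every tool behaved as configured. Because the cases are data, adding one is a small diff that a reviewer can read.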

The trap is giving an agent broad MCP access and then asking code review to catch everything afterward. Review is much easier when the MCP server exposes narrow tools, deterministic fixtures, and logs that show what happened.
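A narrow boundary with a log might look like the following sketch. Nothing here comes from the MCP specification or any SDK: `ToolCall`, the server and tool names, and the allowlist are all hypothetical, chosen to show the read-only default and the audit trail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    server: str   # hypothetical server label, e.g. "github"
    tool: str     # hypothetical tool name on that server
    writes: bool  # True if the tool can change external state


# Hypothetical allowlist of read-only tools; a real team would derive
# this from its own AGENTS.md.
READ_ONLY_ALLOWED = {
    ("github", "get_pull_request"),
    ("docs", "search"),
}


def check_tool_call(call, human_approved=False, log=None):
    """Decide whether a tool call is inside the boundary and record why."""
    if call.writes:
        allowed = human_approved
        reason = "write approved by a human" if allowed else "write needs human approval"
    elif (call.server, call.tool) in READ_ONLY_ALLOWED:
        allowed, reason = True, "approved read-only tool"
    else:
        allowed, reason = False, "tool not on the read-only allowlist"
    if log is not None:
        # The log entry is what a reviewer reads later instead of guessing.
        log.append((call.server, call.tool, allowed, reason))
    return allowed, reason
```

The design choice is that the default answer is "no": a tool is usable only if it is listed as read-only or a human has approved the write. The log gives the reviewer a record of each decision.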

Make AI-assisted changes boring to review

A good AI-assisted pull request should look like a normal pull request with better receipts. The author should say where AI helped, what they verified, and where a human made the final call.

This matters because reviewers do not review vibes. They review diffs, tests, risk areas, and assumptions. The durable pattern is an AI code review workflow built for teams: rules in the repo, tool boundaries in AGENTS.md, tests around MCP, and one checklist reviewers actually use.

The trap is asking reviewers to judge whether “the AI did a good job.” That is not reviewable. Ask whether the code follows repo conventions, whether generated tests cover the real risk, and whether external tool use stayed inside the approved boundary.

If your team already has a review habit, do not replace it. Add the AI-specific checks beside the existing ones, like the conventions in Shared Workflows for Safer Review.

Paste this review convention

Use this as a starter convention. Keep it short enough that a reviewer can enforce it in under two minutes.

# AI-assisted change convention

## Where this lives

- Cursor repo rule: `.cursor/rules/ai-review.mdc`
- Agent boundary: `AGENTS.md`
- Pull request template: `.github/pull_request_template.md`

## Cursor rule stub

---
description: Apply to all AI-assisted code changes in this repository.
alwaysApply: true
---

When helping with code in this repository:

- Prefer small, reviewable diffs over broad rewrites.
- Do not change authentication, authorization, billing, migrations, secrets, or deployment workflows without an explicit human design note in the PR.
- Before editing code, identify the files and tests likely to be affected.
- After editing code, summarize the behavior change, tests run, and remaining uncertainty.
- If an MCP tool is used, name the tool, the action taken, and whether it was read-only or write-capable.

## AGENTS.md boundary

Agents may:

- Read repository code, test files, docs, and local fixtures.
- Propose code changes in feature branches.
- Use approved read-only MCP tools for issues, docs, and pull request context.

Agents may not:

- Write to production systems.
- Rotate secrets or edit secret stores.
- Merge pull requests.
- Modify database migrations, payment flows, auth policy, or CI release steps without a human design note.

## Pull request checklist

Author checks:

- [ ] I marked whether AI helped with planning, code, tests, review, or docs.
- [ ] I reviewed every AI-generated change before opening the PR.
- [ ] I ran the smallest meaningful test set and listed the command.
- [ ] I checked generated tests for false confidence and shallow assertions.
- [ ] I listed any MCP tools used and whether they were read-only or write-capable.
- [ ] I called out files or behaviors where I want extra human review.

Reviewer checks:

- [ ] The diff is small enough to review normally.
- [ ] The change follows the repo's architecture and naming conventions.
- [ ] Tests cover the risky behavior, not just the generated implementation.
- [ ] Any MCP/tool use stayed inside the AGENTS.md boundary.
- [ ] Sensitive areas have a human design note or are split into a separate PR.
- [ ] The reviewer can explain the final behavior without trusting the assistant's summary.

The adoption path is simple. One engineer proposes the convention in a normal PR, two frequent reviewers approve it, and the team revisits it after five AI-assisted pull requests.

The enforcement rule is even simpler. If the checklist is missing or the MCP boundary is unclear, the PR waits. That is not punishment; it is how AI adoption on an engineering team stays calm instead of becoming a personality contest.
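The "PR waits" rule can be approximated mechanically. The sketch below assumes the PR description is available as a string and that the checklist uses the `- [ ]` / `- [x]` task-list syntax from the convention above; the function names are illustrative, and how the PR body is fetched (CI variable, API call) is left out on purpose.

```python
import re

# Matches task-list lines such as "- [ ] item" or "- [x] item".
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def unchecked_items(pr_body):
    """Return the text of every checklist item that is still unchecked."""
    items = []
    for line in pr_body.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            items.append(match.group(2).strip())
    return items


def pr_can_proceed(pr_body):
    """A PR proceeds only if a checklist exists and nothing is left unchecked."""
    has_checklist = any(CHECKBOX.match(line) for line in pr_body.splitlines())
    return has_checklist and not unchecked_items(pr_body)
```

A missing checklist and an incomplete one are treated the same way: `pr_can_proceed` returns `False`, and `unchecked_items` tells the author what is still open. A check like this only confirms the boxes are ticked; whether the claims behind them are true is still the reviewer's call.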

Train the workflow, not just the tool

Cursor Agent, Claude Code from Anthropic, and OpenAI Codex all have different surfaces. Your team still needs one shared review language across them.

Run a short practice session with a safe repo task: add validation to an API route, update the tests, and ask the agent to explain the affected files before editing. Then review the PR using the same checklist, even if the change is tiny.

The trap is making training a tour of features. Agentic coding training should teach when to stop the agent, when to narrow context, when to refuse a tool call, and when to ask for a smaller diff.

A good workshop outcome is not “everyone used the same assistant.” It is “everyone can review AI-assisted work the same way.”

Common questions

  • Our engineering team is split on using AI code assistants. How do we align them?

    Align the team around reviewable behavior, not personal preference. Pick one repo convention, one approved tool boundary, and one PR checklist for the next five AI-assisted changes. People can use different assistants, but the merge standard stays shared: small diffs, explicit tests, documented MCP/tool use, and human ownership of the final code.

  • What should an AI code review workflow for teams include?

    It should include repo rules, an agent boundary, test expectations, and a reviewer checklist. The minimum useful version is three files: .cursor/rules/ai-review.mdc, AGENTS.md, and a PR template. Add more process only when reviews show a real gap, such as unsafe MCP write access or repeated shallow generated tests.

  • Should we allow agents to use MCP servers in production workflows?

    Allow MCP only through narrow, documented tools with clear read/write permissions. As of June 2026, the safer default is read-only access for review context and explicit human approval for write-capable actions. If an MCP server affects production data, require deterministic tests, logs, and a human operator in the loop.

  • Do we need separate rules for Cursor, Claude Code, and Codex?

    Use shared rules for review standards and product-specific files for execution details. Keep durable policy in AGENTS.md, then mirror the relevant parts into Cursor rules, Claude Code memory, or Codex project instructions. This avoids three competing governance systems while still respecting how each coding agent reads context.

  • How do we keep this from slowing developers down?

    Make the checklist shorter than the argument it prevents. A six-line PR checklist is faster than debating whether an AI-generated test is trustworthy after a bug lands. The convention should protect high-risk areas, not require ceremony for every small refactor or docs-only change.

Try it on one pull request

Add the convention to one active repo and require it for the next AI-assisted PR only. After review, keep the parts that caught real risk and delete the parts nobody used.

One methodology lens

Read through our methodology, this maps to the Plan step: delegate first-pass decomposition and dependency mapping, review the sequencing and assumptions, and keep ownership of scope and priorities. If that split is still fuzzy, the workflow usually is too.
