Functional Programming for Coding Agents

Why explicit state and small steps make coding agents easier to trust.

Rogier Muller · April 17, 2026 · 5 min read

Boris Cherny, creator of Claude Code, said the technical book that had the biggest impact on him as an engineer was Functional Programming in Scala. That is a clue.

The useful part is not Scala itself. It is the habit of making behavior explicit, keeping side effects contained, and breaking work into small composable units. Those ideas fit agentic coding because agents fail in familiar ways: they drift, change too much at once, and get hard to verify when the work is tangled.

That is why functional programming keeps coming up in serious discussions about coding tools. Not because every agent should be written in a functional language, but because the workflow problems are the same ones functional design tries to reduce.

What carries over

The strongest overlap is predictability. An agent is easier to trust when inputs, outputs, and state changes are narrow and visible. In practice, that means:

  • Prefer small tasks over broad prompts.
  • Keep transformations separate from execution.
  • Make state changes explicit instead of hidden in a long chain of tool calls.
  • Treat verification as part of the work.

This is less about ideology and more about failure containment. If an agent edits five files, runs three commands, and changes its own plan at the same time, review gets expensive. If it produces one patch, one test run, and one short summary, the human can inspect the result.

That pattern is close to functional thinking: isolate effects, compose simple steps, and keep the boundary between reasoning and side effects clear.
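That separation can be sketched in a few lines. This is an illustrative sketch, not any particular agent's API; `Edit`, `plan_rename`, and `apply_edits` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edit:
    """A proposed change: plain data, no side effects yet."""
    path: str
    old: str
    new: str

def plan_rename(source: str, old_name: str, new_name: str, path: str) -> list[Edit]:
    """Pure transformation: decide what should change without touching anything."""
    if old_name not in source:
        return []
    return [Edit(path=path, old=source, new=source.replace(old_name, new_name))]

def apply_edits(edits: list[Edit]) -> None:
    """The effect boundary: the only function that writes to disk."""
    for edit in edits:
        with open(edit.path, "w", encoding="utf-8") as f:
            f.write(edit.new)

# The pure step can be reviewed and tested without a filesystem.
edits = plan_rename("def fetch(): ...", "fetch", "fetch_user", "api.py")
```

Everything up to `apply_edits` is inspectable data, which is exactly what makes the agent's output cheap to review.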

Why this matters for agentic coding

Agent tools are strongest when the task can be broken into bounded decisions. They are weakest when the task depends on hidden context, ambiguous state, or broad architectural judgment.

Functional ideas help because they encourage a narrower contract between the human and the agent. The human defines the shape of the work. The agent fills in the details. The more stable that contract is, the less the tool needs to improvise.

In practice, that often looks like this:

  1. Define a single change target.
  2. Give the agent only the files or modules it needs.
  3. Ask for a minimal patch first.
  4. Require a test or check that proves the change.
  5. Review the diff before asking for expansion.

That sequence is not novel, but it is durable. It works across IDE agents, CLI agents, and mixed human-agent loops because it reduces hidden state.
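The five steps above can be modeled as data with a single gate. A minimal sketch, assuming hypothetical `Task` and `ready_for_expansion` names rather than any real tool's interface:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    target: str                                      # 1. one change target
    files: list[str] = field(default_factory=list)   # 2. only the needed context
    patch: str = ""                                  # 3. the minimal patch
    check: str = ""                                  # 4. the test that proves it
    approved: bool = False                           # 5. human sign-off on the diff

def ready_for_expansion(task: Task) -> bool:
    """Scope may only widen once every step of the sequence is satisfied."""
    return bool(task.patch) and bool(task.check) and task.approved

task = Task(target="Rename fetch to fetch_user", files=["api.py"])
task.patch = "def fetch_user(): ..."
task.check = "pytest -k fetch_user"
# Still blocked: the diff has not been reviewed yet.
blocked = ready_for_expansion(task)  # False
```

The point is not the code but the shape: each step is explicit state the human can see, so "hidden state" has nowhere to accumulate.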

Where the analogy breaks

Functional programming is not a magic template for agent behavior. Real codebases are messy. Side effects exist. Build systems, APIs, and product constraints do not disappear because a workflow is elegant.

There are also limits to decomposition. Some tasks need broad context from the start: refactors that cross subsystem boundaries, debugging production-only failures, or changes that depend on product intent rather than local code structure. In those cases, too much fragmentation can slow the work down.

There is another tradeoff: strict boundaries can make an agent feel less flexible. If every task is forced into tiny pure steps, the loop can become bureaucratic. Teams need a balance between control and speed.

So the practical question is not “Can we make this purely functional?” It is “Where does the workflow benefit from functional discipline, and where does it need room for side effects?”

A practical way to apply it

If you are designing agent workflows for a team, start with the parts that are easiest to verify.

Use functional-style discipline in three places:

  • Planning: ask for a short plan with explicit inputs, outputs, and risks.
  • Editing: constrain the agent to one coherent change set.
  • Testing: require a concrete check that matches the change.
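The three disciplines above can be expressed as one small gate. Again a sketch, not a prescription: the names are hypothetical, and a real check would run a command rather than match strings.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    """Planning: inputs, outputs, and risks stated up front."""
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]
    risks: tuple[str, ...]

@dataclass(frozen=True)
class ChangeSet:
    """Editing: one coherent change set tied to its plan."""
    plan: Plan
    diffs: tuple[str, ...]

def gate(change: ChangeSet, checks: list[str]) -> bool:
    """Testing: every declared output needs a concrete check that names it."""
    return bool(change.diffs) and all(
        any(out in check for check in checks) for out in change.plan.outputs
    )
```

Because the gate evaluates the workflow shape, the same review loop works regardless of which agent produced the change set.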

This gives you a cleaner review loop. It also makes it easier to compare agent output across tools, because you are evaluating the workflow shape rather than the brand of the tool.

A useful rule: if you cannot describe the change in one sentence, the agent probably should not start coding yet.

What teams should watch for

The main failure mode is overconfidence. A workflow can look disciplined while still hiding complexity. For example, a small patch may be easy to review but still introduce a subtle behavior change. Likewise, a neat decomposition can hide the fact that the agent never understood the real problem.

Teams should watch for:

  • patches that are small but semantically risky,
  • tests that pass but do not cover the actual behavior,
  • plans that sound structured but avoid the hard part,
  • repeated rework because the task boundary was wrong.

That is why the human review step still matters. Functional discipline helps the agent produce cleaner artifacts, but it does not replace judgment.

The useful takeaway

The point of Cherny’s reference is not that every coding agent should be “functional.” It is that the best tools tend to reward the same habits that functional programming rewards: explicit state, small units, and clear boundaries.

If your agent workflow feels brittle, the fix may not be a better model. It may be a better shape for the work.

One step from our methodology fits here: design the task boundary before you ask the agent to build. That usually saves more time than trying to recover from a vague prompt later.

The practical test is simple. If the agent can make one change, prove it, and stop, the workflow is probably on the right track. If it needs constant correction, the boundary is too loose.
