What is the difference between a workflow and an agent?

In a workflow, LLMs and tools are orchestrated through predefined code paths that the engineer lays out in advance. In an agent, the LLM directs its own process, deciding which tools to call and in what order until it judges the task done. Workflows are more predictable and cheaper; agents are more flexible for open-ended tasks whose steps you cannot predefine.

What are the main agentic design patterns?

The building block is the augmented LLM (retrieval, tools, memory). The five workflow patterns are prompt chaining, routing, parallelization (sectioning and voting), orchestrator-workers, and evaluator-optimizer. Beyond workflows sits the autonomous agent, which runs a bounded act-observe-decide loop.

When do I need a multi-agent system?

Reach for a multi-agent or orchestrator-workers setup only when a fixed workflow cannot express the work, typically because you cannot predict the subtasks in advance and a lead model must delegate and synthesize dynamically. Start simple: the added coordination brings extra cost, latency, and failure modes.

When should I use a fully autonomous agent instead of a workflow?

Use an autonomous agent only when the path cannot be predefined and the number of steps is open-ended, and when you can verify outcomes and bound the loop with turn and tool-call budgets plus a clear stop condition. For anything with a predictable sequence, a workflow pattern is cheaper and easier to debug.

For engineers designing agent systems

Agentic AI design patterns (and when to use each)

By Wibo · Amsterdam · Published 26 Jun 2026 · Last updated 26 Jun 2026 · ~7 min read

Short answer

Most problems people call "agent" problems are solved by a simple workflow pattern, not a fully autonomous agent. Anthropic's Building Effective Agents draws the line clearly: workflows orchestrate LLMs and tools through predefined code paths, while agents let the model direct its own process and tool use dynamically.

The discipline is to start with the simplest thing that works and add agency only when the task genuinely needs it. More autonomy buys flexibility at the cost of latency, spend, and new failure modes. This guide covers the building block, the workflow vs agent distinction, the five workflow patterns, and the one case where you actually reach for an autonomous loop.

The augmented LLM

Every pattern is built from one unit: the augmented LLM, a model extended with retrieval, tools, and memory. Retrieval pulls in relevant context, tools let the model take actions or read external state, and memory carries information across steps. On its own this is already enough for many tasks. The patterns below are different ways of composing one or more augmented LLMs, so it pays to get this layer right first: clear tool definitions, scoped access, and a documented interface the model can actually use.
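As a rough illustration of that unit, here is a minimal Python sketch. Everything in it is an assumption made for the example, not any provider's API: the LLM type stands in for a model call, AugmentedLLM is a hypothetical wrapper, and the "TOOL name argument" reply convention is invented to keep the code short. Real systems would use the provider's structured tool-calling format instead.

```python
from dataclasses import dataclass, field
from typing import Callable

LLM = Callable[[str], str]  # hypothetical stand-in: prompt in, text out


@dataclass
class AugmentedLLM:
    llm: LLM
    retrieve: Callable[[str], list[str]]       # retrieval: query -> context snippets
    tools: dict[str, Callable[[str], str]]     # tools: an explicit allow-list (scoped access)
    memory: list[str] = field(default_factory=list)  # memory: notes carried across steps

    def step(self, task: str) -> str:
        context = "\n".join(self.retrieve(task))
        notes = "\n".join(self.memory)
        reply = self.llm(f"Context:\n{context}\n\nNotes:\n{notes}\n\nTask: {task}")

        # Assumed convention for this sketch only: "TOOL <name> <argument>".
        parts = reply.split(" ", 2)
        if len(parts) == 3 and parts[0] == "TOOL":
            name, argument = parts[1], parts[2]
            tool = self.tools.get(name)
            reply = tool(argument) if tool else f"error: unknown tool {name!r}"

        self.memory.append(f"{task} -> {reply}")
        return reply
```

The tools dictionary doubles as the access scope: the model can only reach what the engineer registered there.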

Workflows vs agents

The single most useful distinction is between workflows and agents. In a workflow, LLMs and tools are orchestrated through predefined code paths that you, the engineer, lay out in advance. The control flow is yours; the model fills in the steps. In an agent, the LLM directs its own process: it decides which tools to call and in what order, and it keeps going until it judges the task done. The control flow belongs to the model.

Workflows give you predictability, lower cost, and easier debugging. Agents give you flexibility for open-ended tasks where you cannot enumerate the steps ahead of time. Start with a workflow. Only move to an agent when the task's path genuinely cannot be predefined.

The five patterns

1. Prompt chaining

What it is: decompose a task into a fixed sequence of steps, where each LLM call works on the output of the previous one. You can add programmatic checks ("gates") between steps to catch errors early.

When to use: the task splits cleanly into subtasks that always run in the same order, such as draft then translate, or outline then expand.

Tradeoff: each added step costs latency in exchange for higher accuracy, because every call handles a narrower, easier task. If the steps do not have a fixed order, chaining is the wrong shape.
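A minimal sketch of the outline-then-expand chain with one gate might look like the following. The LLM type is a hypothetical stand-in for a model call, and the prompts, the GateError name, and the "at least three bullet points" rule are assumptions chosen for illustration.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical stand-in: prompt in, text out


class GateError(ValueError):
    """Raised when an intermediate output fails its programmatic check."""


def outline_then_expand(llm: LLM, topic: str, min_points: int = 3) -> str:
    # Step 1: produce an outline.
    outline = llm(f"Write a bullet outline for an article about: {topic}")

    # Gate: a plain-code check between steps, so a bad outline fails early
    # instead of being expanded into a bad article.
    points = [line for line in outline.splitlines() if line.strip().startswith("-")]
    if len(points) < min_points:
        raise GateError(f"outline has {len(points)} points, need {min_points}")

    # Step 2: works on the (checked) output of step 1.
    return llm("Expand this outline into prose:\n" + "\n".join(points))
```

The order of steps lives in the function body, not in the model, which is what makes this a workflow.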

2. Routing

What it is: classify the input, then send it down a specialized path. A first LLM call (or classifier) decides the category; each category has its own prompt or tool set tuned for that kind of work.

When to use: inputs fall into distinct classes that are better handled separately, such as routing support tickets, or sending easy queries to a small model and hard ones to a large model.

Tradeoff: a wrong classification sends the input down the wrong path, so the router's accuracy caps the whole system.
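The sketch below shows one way routing could be wired up for support tickets. The category names, the prompts, and the fallback rule are hypothetical, and LLM is a stand-in for whatever model call you use; the classifier and handler can be different models.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical stand-in: prompt in, text out

# Hypothetical categories and specialized prompts for a support desk.
ROUTES = {
    "billing": "You handle billing questions. Be precise about amounts.",
    "technical": "You debug technical issues. Ask for logs if needed.",
    "general": "You answer general questions briefly.",
}
FALLBACK = "general"


def route(classifier: LLM, handler: LLM, ticket: str) -> tuple[str, str]:
    labels = ", ".join(ROUTES)
    raw = classifier(f"Classify this ticket as one of: {labels}.\nTicket: {ticket}")

    # Normalize the label and fall back when the classifier answers off-list,
    # since a bad label would otherwise send the ticket down the wrong path.
    label = raw.strip().lower()
    if label not in ROUTES:
        label = FALLBACK

    answer = handler(f"{ROUTES[label]}\n\nTicket: {ticket}")
    return label, answer
```

Returning the label alongside the answer makes router accuracy measurable, which matters because it caps the quality of everything downstream.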

3. Parallelization

What it is: run multiple LLM calls at once and aggregate. Two flavors: sectioning splits a task into independent subtasks run in parallel, and voting runs the same task several times to get diverse outputs you then combine.

When to use: subtasks are independent and benefit from parallel speed, or you want multiple attempts for confidence (voting), or separate concerns handled by focused calls (sectioning, e.g. one call for content and one for a safety check).

Tradeoff: more calls mean more spend, and you need a sound aggregation rule to combine results.
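Both flavors can be sketched with Python's standard thread pool. The LLM type is a hypothetical stand-in for a model call, and the strict-majority rule in vote is one possible aggregation rule picked for illustration, not the only sound one.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

LLM = Callable[[str], str]  # hypothetical stand-in: prompt in, text out


def section(llm: LLM, prompts: list[str]) -> list[str]:
    """Sectioning: run independent subtasks concurrently."""
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        return list(pool.map(llm, prompts))  # results keep the order of prompts


def vote(llm: LLM, prompt: str, attempts: int = 5) -> str:
    """Voting: run the same task several times and keep the majority answer."""
    with ThreadPoolExecutor(max_workers=max(1, attempts)) as pool:
        answers = list(pool.map(llm, [prompt] * attempts))

    # Assumed aggregation rule: normalize, then require a strict majority.
    tally = Counter(answer.strip().lower() for answer in answers)
    winner, count = tally.most_common(1)[0]
    if count <= attempts // 2:
        raise ValueError(f"no majority among {attempts} attempts: {dict(tally)}")
    return winner
```

Voting only makes sense for answers that can be compared after normalization, such as labels or yes/no verdicts; free-form text needs a different aggregation step.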

4. Orchestrator-workers

What it is: a lead LLM dynamically breaks the task into subtasks, delegates each to a worker LLM, and synthesizes their results. Unlike parallelization, the subtasks are not fixed in advance; the orchestrator decides them based on the input.

When to use: complex tasks where you cannot predict the subtasks ahead of time, such as making coordinated edits across many files, or research that fans out into varying numbers of subquestions.

Tradeoff: the dynamic decomposition adds cost and a coordination layer that can itself fail; it is more involved than a fixed parallel split.
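Here is a minimal sketch of that shape. It assumes the lead model can be asked to reply with a JSON list of subtask strings, which is a convention invented for this example; LLM is a hypothetical stand-in for a model call, and the max_subtasks cap is an added safeguard rather than part of the pattern's definition.

```python
import json
from typing import Callable

LLM = Callable[[str], str]  # hypothetical stand-in: prompt in, text out


def orchestrate(lead: LLM, worker: LLM, task: str, max_subtasks: int = 8) -> str:
    # The lead model decides the subtasks at run time, based on the input.
    plan = lead("Split this task into subtasks. Reply with a JSON list of strings.\nTask: " + task)
    try:
        subtasks = json.loads(plan)
    except json.JSONDecodeError as exc:
        raise ValueError("lead model did not return valid JSON") from exc
    if not isinstance(subtasks, list) or not all(isinstance(s, str) for s in subtasks):
        raise ValueError("lead model did not return a list of strings")

    # Cap the fan-out so a bad plan cannot multiply cost without limit.
    subtasks = subtasks[:max_subtasks]

    # Delegate each subtask to a worker (sequential here to keep the sketch short).
    results = [worker(subtask) for subtask in subtasks]

    # The lead model synthesizes the workers' results into one answer.
    report = "\n".join(f"{s}: {r}" for s, r in zip(subtasks, results))
    return lead("Synthesize these results into one answer:\n" + report)
```

The plan-parsing checks are where the coordination layer can fail, so they raise early rather than pass a malformed plan to the workers.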

5. Evaluator-optimizer

What it is: one LLM generates a response while a second LLM evaluates it and gives feedback, in a loop, until the output meets the bar or a limit is reached.

When to use: you have clear evaluation criteria and iterative refinement adds real value, such as literary translation or research where a critique loop measurably improves quality.

Tradeoff: the loop multiplies calls and latency; it only pays off when the evaluator's feedback is reliable and the criteria are explicit.
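The loop can be sketched as follows. The LLM type is a hypothetical stand-in for a model call, and the "reply PASS or give feedback" protocol and the max_rounds default are assumptions for the example.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical stand-in: prompt in, text out


def refine(generator: LLM, evaluator: LLM, task: str, max_rounds: int = 3) -> tuple[str, bool]:
    """Return (final draft, whether the evaluator accepted it)."""
    draft, feedback = "", ""
    for _ in range(max_rounds):  # the limit keeps the loop from running forever
        prompt = f"Task: {task}"
        if feedback:
            prompt += f"\nPrevious draft: {draft}\nFeedback: {feedback}"
        draft = generator(prompt)

        verdict = evaluator(
            f"Task: {task}\nDraft: {draft}\n"
            "Reply PASS if the draft meets the criteria, otherwise give feedback."
        )
        if verdict.strip().upper().startswith("PASS"):
            return draft, True
        feedback = verdict  # feed the critique into the next round

    return draft, False  # limit reached without meeting the bar
```

Returning the accepted flag lets the caller decide what to do with a draft that never passed, instead of silently treating it as final.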

When you actually need an autonomous agent

An autonomous agent is the right tool only when the path cannot be predefined and the number of steps is genuinely open-ended. The agent runs a loop: act (call a tool or take a step), observe the tool results from the environment, then decide the next step, repeating until a stop condition is met. Critically, that loop must be bounded. Set explicit budgets on turns and tool calls, define a clear done-condition, and give the agent ground truth from the environment (tool outputs, test results) so it can self-correct rather than drift.
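A bounded version of that loop might be sketched like this. All names are hypothetical: decide stands in for the model choosing its next tool calls from the history so far, is_done is a code-level done-condition checked against observations, and the budget defaults are arbitrary.

```python
from typing import Callable

ToolCall = tuple[str, str]                      # (tool name, argument)
Decide = Callable[[list[str]], list[ToolCall]]  # hypothetical model policy


def run_agent(
    decide: Decide,
    tools: dict[str, Callable[[str], str]],
    is_done: Callable[[list[str]], bool],
    goal: str,
    max_turns: int = 8,
    max_tool_calls: int = 12,
) -> tuple[str, list[str]]:
    history = [f"goal: {goal}"]
    calls_used = 0
    for _ in range(max_turns):                  # turn budget
        if is_done(history):                    # clear done-condition
            return "done", history
        for name, argument in decide(history):  # decide: the model picks the steps
            if calls_used >= max_tool_calls:    # tool-call budget
                return "tool_budget_exhausted", history
            calls_used += 1
            tool = tools.get(name)
            # Act, then observe: real tool output goes back into the history
            # so the next decision is grounded in the environment.
            result = tool(argument) if tool else f"error: unknown tool {name!r}"
            history.append(f"{name}({argument}) -> {result}")
    return ("done" if is_done(history) else "turn_budget_exhausted"), history
```

The returned status distinguishes a finished task from an exhausted budget, so a runaway loop shows up as an explicit outcome rather than an endless run.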

Agents shine on hard, open-ended problems where you trust the model's decision-making and can verify outcomes. The classic example is a coding agent that edits files, runs tests, and iterates. Agents are powerful precisely because they decide their own steps, which is also why they are harder to predict, cost more per task, and need guardrails you would not bother with for a workflow.

Pattern	What it does	Use when
Prompt chaining	Decomposes a task into a fixed sequence of LLM calls, each building on the last, with optional gates between steps	The task always runs as the same ordered subtasks
Routing	Classifies the input and sends it to a specialized path or model	Inputs fall into distinct classes best handled separately
Parallelization	Runs calls concurrently and aggregates: sectioning (independent subtasks) or voting (repeated attempts)	Subtasks are independent, or you want multiple attempts for confidence
Orchestrator-workers	A lead LLM dynamically splits the task, delegates to workers, and synthesizes results	You cannot predict the subtasks in advance
Evaluator-optimizer	One LLM generates, another critiques, looping until the output meets the bar	You have clear criteria and iteration measurably improves quality

Don't reach for autonomy by default

The common mistake is starting with a fully autonomous agent because it sounds powerful. Every increment of agency you hand to the model adds cost, latency, and new failure modes: runaway loops, wrong tool calls, and harder debugging. Pick the simplest pattern on the list that solves the task. Add agency only when a fixed code path genuinely cannot express the work, and even then bound the loop.

Sources & provenance

Anthropic — Building Effective Agents (the workflow vs agent distinction, the augmented LLM, and the five workflow patterns).
Course material: AI Architect Academy Track B (Agentic Systems) — the bounded agentic loop, budgets, and stop conditions.

This is a conceptual overview. Specific API shapes change, so verify against current provider docs before implementing. Corrections: hello@aiarch.dev.
