AI Architect Academy

For engineers designing agent systems

Agentic AI design patterns (and when to use each)

Short answer

Most problems people call "agent" problems are solved by a simple workflow pattern, not a fully autonomous agent. Anthropic's Building Effective Agents draws the line clearly: workflows orchestrate LLMs and tools through predefined code paths, while agents let the model direct its own process and tool use dynamically.

The discipline is to start with the simplest thing that works and add agency only when the task genuinely needs it. More autonomy buys flexibility at the cost of latency, spend, and new failure modes. This guide covers the building block, the workflow vs agent distinction, the five workflow patterns, and the one case where you actually reach for an autonomous loop.

The augmented LLM

Every pattern is built from one unit: the augmented LLM, a model extended with retrieval, tools, and memory. Retrieval pulls in relevant context, tools let the model take actions or read external state, and memory carries information across steps. On its own this is already enough for many tasks. The patterns below are different ways of composing one or more augmented LLMs, so it pays to get this layer right first: clear tool definitions, scoped access, and a documented interface the model can actually use.
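As a rough illustration of that building block, the sketch below wires retrieval, tools, and memory around a single model call. Everything here is hypothetical: `model` stands in for any provider call (prompt in, text out), `retrieve` for any retriever, and the "TOOL <name> <argument>" reply format is an assumed convention for this example, not a real provider API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str  # shown to the model, so keep it clear and specific
    run: Callable[[str], str]


@dataclass
class AugmentedLLM:
    model: Callable[[str], str]           # hypothetical: prompt in, text out
    retrieve: Callable[[str], list[str]]  # hypothetical retriever
    tools: dict[str, Tool] = field(default_factory=dict)
    memory: list[str] = field(default_factory=list)

    def build_prompt(self, query: str) -> str:
        context = "\n".join(self.retrieve(query))
        tool_docs = "\n".join(
            f"- {t.name}: {t.description}" for t in self.tools.values()
        )
        history = "\n".join(self.memory)
        return (
            f"Context:\n{context}\n\nTools:\n{tool_docs}\n\n"
            f"History:\n{history}\n\nTask: {query}"
        )

    def step(self, query: str) -> str:
        reply = self.model(self.build_prompt(query))
        # Assumed convention: "TOOL <name> <argument>" requests a tool call.
        if reply.startswith("TOOL "):
            parts = reply.split(" ", 2)
            name = parts[1] if len(parts) > 1 else ""
            arg = parts[2] if len(parts) > 2 else ""
            tool = self.tools.get(name)
            reply = tool.run(arg) if tool else f"unknown tool: {name}"
        # Memory carries the exchange into the next step's prompt.
        self.memory.append(f"{query} -> {reply}")
        return reply
```

Only tools registered in `tools` can be called, which is one simple way to keep access scoped.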

Workflows vs agents

The single most useful distinction is between workflows and agents. In a workflow, LLMs and tools are orchestrated through predefined code paths that you, the engineer, lay out in advance. The control flow is yours; the model fills in the steps. In an agent, the LLM directs its own process: it decides which tools to call and in what order, and it keeps going until it judges the task done. The control flow belongs to the model.

Workflows give you predictability, lower cost, and easier debugging. Agents give you flexibility for open-ended tasks where you cannot enumerate the steps ahead of time. Start with a workflow. Only move to an agent when the task's path genuinely cannot be predefined.

The five patterns

1. Prompt chaining

What it is: decompose a task into a fixed sequence of steps, where each LLM call works on the output of the previous one. You can add programmatic checks ("gates") between steps to catch errors early.

When to use: the task splits cleanly into subtasks that always run in the same order, such as draft then translate, or outline then expand.

Tradeoff: each added step adds latency in exchange for higher accuracy, because each call handles a narrower subtask. If the steps do not have a fixed order, chaining is the wrong shape.
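A minimal sketch of a chain with gates, assuming a hypothetical `llm` callable (prompt in, text out) and prompt templates that use an `{input}` placeholder. The names `run_chain` and `Gate` are made up for this example.

```python
from typing import Callable, Optional

LLM = Callable[[str], str]    # hypothetical model call
Gate = Callable[[str], bool]  # programmatic check on a step's output


def run_chain(llm: LLM, steps: list[tuple[str, Optional[Gate]]], task: str) -> str:
    """Run a fixed sequence of prompts, feeding each output into the next."""
    text = task
    for template, gate in steps:
        text = llm(template.format(input=text))
        # Stop early instead of passing a bad result to later steps.
        if gate is not None and not gate(text):
            raise ValueError(f"gate failed after step: {template!r}")
    return text
```

A call might look like `run_chain(llm, [("Outline: {input}", has_sections), ("Expand: {input}", None)], topic)`, where `has_sections` is whatever cheap check fits the task.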

2. Routing

What it is: classify the input, then send it down a specialized path. A first LLM call (or classifier) decides the category; each category has its own prompt or tool set tuned for that kind of work.

When to use: inputs fall into distinct classes that are better handled separately, such as routing support tickets, or sending easy queries to a small model and hard ones to a large model.

Tradeoff: a wrong classification sends the input down the wrong path, so the router's accuracy caps the whole system.
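One way to sketch a router, assuming a hypothetical `classify` callable that returns a label and one handler per label. The explicit `fallback` is a design choice added for this example so that an unrecognized label lands somewhere safe; it does not protect against a confident but wrong label.

```python
from typing import Callable

Handler = Callable[[str], str]


def route(
    classify: Callable[[str], str],  # hypothetical: an LLM call or a classifier
    handlers: dict[str, Handler],    # one specialized path per category
    fallback: Handler,               # used when the label is not recognized
    text: str,
) -> tuple[str, str]:
    """Return (path taken, result) so routing decisions can be logged."""
    label = classify(text).strip().lower()
    handler = handlers.get(label)
    if handler is None:
        return "fallback", fallback(text)
    return label, handler(text)
```

Returning the chosen path alongside the result makes it easier to measure router accuracy later.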

3. Parallelization

What it is: run multiple LLM calls at once and aggregate. Two flavors: sectioning splits a task into independent subtasks run in parallel, and voting runs the same task several times to get diverse outputs you then combine.

When to use: the subtasks are independent and can run in parallel for speed or focus (sectioning, e.g. one call generates content while another runs a safety check), or you want several attempts at the same task for higher confidence (voting).

Tradeoff: more calls mean more spend, and you need a sound aggregation rule to combine results.
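Both flavors can be sketched with the standard library's thread pool, assuming a hypothetical `llm` callable that is safe to call concurrently. The strict-majority rule in `vote` is just one possible aggregation rule, chosen here for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

LLM = Callable[[str], str]  # hypothetical model call


def section(llm: LLM, prompts: list[str]) -> list[str]:
    """Sectioning: run independent prompts concurrently, results in input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(prompts))) as pool:
        return list(pool.map(llm, prompts))


def vote(llm: LLM, prompt: str, n: int = 5) -> str:
    """Voting: run the same prompt n times and require a strict majority."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        answers = list(pool.map(llm, [prompt] * n))
    winner, count = Counter(answers).most_common(1)[0]
    if count <= n // 2:
        raise ValueError("no majority among the attempts")
    return winner
```

Voting on raw text only works when answers are short and comparable (a label, a yes/no); free-form outputs need a different aggregation step.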

4. Orchestrator-workers

What it is: a lead LLM dynamically breaks the task into subtasks, delegates each to a worker LLM, and synthesizes their results. Unlike parallelization, the subtasks are not fixed in advance; the orchestrator decides them based on the input.

When to use: complex tasks where you cannot predict the subtasks ahead of time, such as making coordinated edits across many files, or research that fans out into varying numbers of subquestions.

Tradeoff: the dynamic decomposition adds cost and a coordination layer that can itself fail; it is more involved than a fixed parallel split.
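A rough sketch of the shape, assuming three hypothetical callables (`planner`, `worker`, `synthesizer`) and assuming the planner replies with one subtask per line. The `max_subtasks` cap is an added safeguard for this example, not part of the pattern's definition.

```python
from typing import Callable

LLM = Callable[[str], str]  # hypothetical model call


def orchestrate(
    planner: LLM, worker: LLM, synthesizer: LLM, task: str, max_subtasks: int = 8
) -> str:
    # The subtasks come from the model at run time, not from the code.
    plan = planner(f"List the subtasks for: {task}")
    subtasks = [
        line.lstrip("- ").strip() for line in plan.splitlines() if line.strip()
    ]
    subtasks = subtasks[:max_subtasks]
    if not subtasks:
        raise ValueError("planner returned no subtasks")
    # Workers run sequentially here; they could also run in parallel.
    results = [worker(subtask) for subtask in subtasks]
    combined = "\n".join(f"{s}: {r}" for s, r in zip(subtasks, results))
    return synthesizer(f"Task: {task}\nResults:\n{combined}")
```

The contrast with a fixed parallel split is the first call: the list of subtasks is an output of the planner, so its length and content vary with the input.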

5. Evaluator-optimizer

What it is: one LLM generates a response while a second LLM evaluates it and gives feedback, in a loop, until the output meets the bar or a limit is reached.

When to use: you have clear evaluation criteria and iterative refinement adds real value, such as literary translation or research where a critique loop measurably improves quality.

Tradeoff: the loop multiplies calls and latency; it only pays off when the evaluator's feedback is reliable and the criteria are explicit.
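The loop can be sketched as follows, assuming a hypothetical `generate` callable that takes the task, the previous draft, and the latest feedback, and a hypothetical `evaluate` callable that returns a pass flag plus feedback. `max_rounds` is the limit mentioned above.

```python
from typing import Callable

Generate = Callable[[str, str, str], str]         # (task, draft, feedback) -> new draft
Evaluate = Callable[[str, str], tuple[bool, str]]  # (task, draft) -> (passed, feedback)


def refine(
    generate: Generate, evaluate: Evaluate, task: str, max_rounds: int = 3
) -> tuple[str, bool]:
    """Return the final draft and whether it met the evaluator's bar."""
    draft, feedback = "", ""
    for _ in range(max_rounds):
        draft = generate(task, draft, feedback)
        passed, feedback = evaluate(task, draft)
        if passed:
            return draft, True
    # Limit reached: hand back the last draft and flag it as unaccepted.
    return draft, False
```

Returning the flag lets the caller decide what to do with a draft that never passed, rather than silently treating it as good.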

When you actually need an autonomous agent

An autonomous agent is the right tool only when the path cannot be predefined and the number of steps is genuinely open-ended. The agent runs a loop: act (call a tool or take a step), observe the tool results from the environment, then decide the next step, repeating until a stop condition is met. Critically, that loop must be bounded. Set explicit budgets on turns and tool calls, define a clear done-condition, and give the agent ground truth from the environment (tool outputs, test results) so it can self-correct rather than drift.
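A minimal sketch of such a bounded loop. The `decide` callable stands in for the model choosing its next step, and the `Action` type, the status strings, and the default budgets are all hypothetical choices for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    tool: Optional[str]  # None means the model judges the task done
    argument: str = ""


Decide = Callable[[str, list[str]], Action]  # (goal, observations) -> next action


def run_agent(
    decide: Decide,
    tools: dict[str, Callable[[str], str]],
    goal: str,
    max_turns: int = 10,
    max_tool_calls: int = 20,
) -> tuple[str, list[str]]:
    observations: list[str] = []
    tool_calls = 0
    for _ in range(max_turns):
        action = decide(goal, observations)  # decide
        if action.tool is None:              # done-condition
            return "done", observations
        if action.tool not in tools:
            observations.append(f"error: unknown tool {action.tool!r}")
            continue
        if tool_calls >= max_tool_calls:     # tool-call budget
            return "tool_budget_exhausted", observations
        tool_calls += 1
        try:
            result = tools[action.tool](action.argument)  # act
        except Exception as exc:
            result = f"error: {exc}"  # feed failures back as ground truth
        observations.append(f"{action.tool}({action.argument}) -> {result}")  # observe
    return "turn_budget_exhausted", observations  # turn budget
```

Tool errors are appended to the observations instead of being raised, so the next `decide` call sees what actually happened and can correct course.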

Agents shine on hard, open-ended problems where you trust the model's decision-making and can verify outcomes. The classic example is a coding agent that edits files, runs tests, and iterates. Agents are powerful precisely because they decide their own steps, which is also why they are harder to predict, cost more per task, and need guardrails you would not bother with for a workflow.

| Pattern | What it does | Use when |
| --- | --- | --- |
| Prompt chaining | Decomposes a task into a fixed sequence of LLM calls, each building on the last, with optional gates between steps | The task always runs as the same ordered subtasks |
| Routing | Classifies the input and sends it to a specialized path or model | Inputs fall into distinct classes best handled separately |
| Parallelization | Runs calls concurrently and aggregates: sectioning (independent subtasks) or voting (repeated attempts) | Subtasks are independent, or you want multiple attempts for confidence |
| Orchestrator-workers | A lead LLM dynamically splits the task, delegates to workers, and synthesizes results | You cannot predict the subtasks in advance |
| Evaluator-optimizer | One LLM generates, another critiques, looping until the output meets the bar | You have clear criteria and iteration measurably improves quality |

Don't reach for autonomy by default

The common mistake is starting with a fully autonomous agent because it sounds powerful. Every increment of agency you hand to the model adds cost, latency, and new failure modes: runaway loops, wrong tool calls, and harder debugging. Pick the simplest pattern on the list that solves the task. Add agency only when a fixed code path genuinely cannot express the work, and even then bound the loop.

Sources & provenance

  • Anthropic — Building Effective Agents (the workflow vs agent distinction, the augmented LLM, and the five workflow patterns).
  • Course material: AI Architect Academy Track B (Agentic Systems) — the bounded agentic loop, budgets, and stop conditions.

This is a conceptual overview; specific API shapes change — verify against current provider docs before implementing. Corrections: hello@aiarch.dev.
