For engineers picking a model
Claude Opus vs Sonnet vs Haiku: how to choose
Sonnet is the sensible default. It carries the best cost/capability balance for the bulk of real work, so start there. Route down to Haiku for cheap, high-volume, latency-sensitive tasks, and reach up to Opus only for the hardest reasoning and complex multi-step planning where answer quality clearly dominates cost.
Model choice is a cost-control lever, not a one-time setting. The skill is matching each task to the cheapest model that still gets it right, then routing between them inside one system.
The three models at a glance
| Model | Input $/1M | Output $/1M | Choose it when |
|---|---|---|---|
| Claude Haiku 4.5 | $1 | $5 | High-volume, latency-sensitive work: simple classification, extraction, and cheap tool calls where speed and unit cost matter most |
| Claude Sonnet 4.6 | $3 | $15 | The default workhorse: the best cost/capability balance for most coding, drafting, summarization, and routine agentic tasks |
| Claude Opus 4.8 | $5 | $25 | The hardest reasoning and complex multi-step agentic planning, where the quality of the answer dominates the cost of producing it |
Prices are per million tokens (input / output) and reflect public pricing verified mid-2026. Re-check the provider's pricing page before you rely on them.
How to choose
Don't start from "which model is best" — start from the task. Most decisions fall out of three questions:
1. How hard is the reasoning?
If the task is pattern-shaped — classify this, extract these fields, call this tool with these arguments — a small model handles it well, and you should not pay more for headroom you don't use. Haiku is built for exactly this. As the task needs more chained reasoning, sustained context, or judgment, move up to Sonnet. Reserve Opus for problems where a wrong answer is expensive and the path to a right one is genuinely hard.
2. How sensitive is it to latency and volume?
High-throughput, user-facing, or per-request work magnifies the effect of both latency and unit cost. There, the cheaper, faster model is often the better engineering choice even if a larger model is marginally more capable. Batch and background work can tolerate a slower, stronger model.
3. Does a stronger model change the outcome?
This is the deciding question for Opus. If Sonnet already reaches the correct outcome reliably, Opus buys you nothing but a larger bill. Spend the extra only where it measurably moves quality on the tasks that matter — and confirm that with evals, not vibes.
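One way to make "evals, not vibes" concrete is to run the same eval set against each tier and pick the cheapest one that clears your quality bar. The sketch below assumes you already have pass counts per model; the `EvalResult` type, the `cheapest_passing` helper, and the numbers are hypothetical and only illustrate the selection rule.

```python
from dataclasses import dataclass
from typing import List, Optional

# Input price in $ per 1M tokens, copied from the table above.
INPUT_PRICE = {"haiku": 1.0, "sonnet": 3.0, "opus": 5.0}


@dataclass
class EvalResult:
    model: str   # tier name: "haiku", "sonnet", or "opus"
    passed: int  # eval cases the model got right
    total: int   # eval cases run

    @property
    def pass_rate(self) -> float:
        return self.passed / self.total


def cheapest_passing(results: List[EvalResult], threshold: float) -> Optional[str]:
    """Return the cheapest tier whose pass rate meets the threshold, else None."""
    ok = [r for r in results if r.pass_rate >= threshold]
    if not ok:
        return None
    return min(ok, key=lambda r: INPUT_PRICE[r.model]).model


# Hypothetical eval numbers, for illustration only.
results = [
    EvalResult("haiku", 81, 100),
    EvalResult("sonnet", 96, 100),
    EvalResult("opus", 97, 100),
]

print(cheapest_passing(results, threshold=0.95))  # sonnet
```

With these made-up numbers, Opus gains one point over Sonnet, which does not justify the higher rate at a 95% bar. Raise the threshold or change the eval set and the answer can change, which is the point of measuring.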
Routing and cost
The strongest single lever is a routing pattern: use Opus only where it changes the outcome, and route sub-tasks down to Sonnet and Haiku. A planner step might run on Opus to decompose a hard problem, while the many smaller execution and extraction steps it spawns run on Sonnet or Haiku. You get the quality where it counts without paying Opus rates for the whole pipeline.
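A minimal sketch of that routing pattern, assuming each step in the pipeline is tagged with a kind. The step kinds, the `ROUTES` table, and the `route` helper are hypothetical names chosen for illustration; the tier values are labels, not API model identifiers.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Hypothetical mapping from step kind to tier, following the pattern above:
# plan on Opus, run most execution on Sonnet, push extraction down to Haiku.
ROUTES = {
    "plan": Tier.OPUS,
    "execute": Tier.SONNET,
    "extract": Tier.HAIKU,
    "classify": Tier.HAIKU,
}


def route(step_kind: str) -> Tier:
    """Pick a tier for a step; unknown kinds fall back to Sonnet, the default."""
    return ROUTES.get(step_kind, Tier.SONNET)


pipeline = ["plan", "execute", "extract", "extract", "classify", "summarize"]
for kind in pipeline:
    print(f"{kind:>9} -> {route(kind).value}")
```

Falling back to Sonnet for untagged steps mirrors the article's advice that Sonnet is the sensible default; a real router would likely also consider context size and latency budgets.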
The cost spread makes this worth doing. On input, the price ratio across the three models is 5:3:1 (Opus to Sonnet to Haiku). Put plainly, Haiku's input costs one-fifth as much as Opus's and one-third as much as Sonnet's; output prices keep the same 5:3:1 ratio ($25 / $15 / $5). In a multi-turn agent that re-sends a growing context every iteration, total input tokens grow much faster than the number of turns, so those price multiples apply to an ever larger volume. Pushing routine sub-tasks down a tier is one of the largest savings available to you.
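To see how the input multiples add up, here is a rough worked example. It assumes an agent loop that re-sends the full context on every turn, with the context growing by a fixed amount per turn. The token counts and function names are hypothetical, and the sketch ignores output tokens and prompt caching.

```python
# Input price in $ per 1M tokens, copied from the table above.
INPUT_PRICE_PER_MTOK = {"opus": 5.0, "sonnet": 3.0, "haiku": 1.0}


def cumulative_input_tokens(base_tokens: int, growth_per_turn: int, turns: int) -> int:
    """Total input tokens sent when every turn re-sends the whole context."""
    return sum(base_tokens + growth_per_turn * t for t in range(turns))


def input_cost(model: str, tokens: int) -> float:
    """Input cost in dollars for the given number of tokens."""
    return tokens / 1_000_000 * INPUT_PRICE_PER_MTOK[model]


# Hypothetical loop: 2,000-token starting context, +1,500 tokens per turn, 20 turns.
tokens = cumulative_input_tokens(base_tokens=2_000, growth_per_turn=1_500, turns=20)
print(f"total input tokens: {tokens:,}")  # total input tokens: 325,000

for model in ("opus", "sonnet", "haiku"):
    print(f"{model:>6}: ${input_cost(model, tokens):.3f}")
```

Under these assumptions a single 20-turn run sends 325,000 input tokens, which costs about $1.63 on Opus, $0.98 on Sonnet, and $0.33 on Haiku. The 5:3:1 ratio is unchanged, but it now applies to a token total far larger than the final context size.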
- Anthropic — models and pricing documentation (platform.claude.com): per-million-token input/output rates for Claude Haiku 4.5 ($1 / $5), Sonnet 4.6 ($3 / $15), and Opus 4.8 ($5 / $25).
- Anthropic — guidance on model selection and routing (matching task difficulty to model tier; routing down for sub-tasks).
Prices were verified mid-2026 against Anthropic's pricing page. Model pricing changes, so verify the current rates before relying on them. Corrections: hello@aiarch.dev.