Anthropic released Claude Sonnet 5 on June 30, 2026, and the headline is not the raw score but the value: a mid-tier model that lands within about six points of the far pricier Opus 4.8 on the hardest agentic-coding benchmark, and actually beats it on some terminal tasks, while charging an introductory $2 per million input tokens against Opus 4.8's $5. Codenamed Fennec, it became the default model for every free and paid claude.ai user on day one, a sign of how confident Anthropic is that the gap between "cheap" and "frontier" has narrowed to a rounding error for most work.

  • Sonnet 5 scores 85.2% on SWE-bench Verified and 63.2% on SWE-bench Pro, the latter up from Sonnet 4.6's 58.1% and closing much of the distance to Opus 4.8's 69.2%.
  • On Terminal-Bench 2.1 it hits 80.4%, the first time a Sonnet model has matched or beaten its Opus sibling on a major coding evaluation.
  • Introductory pricing is $2 input / $10 output per million tokens through August 31, 2026, then $3 / $15, against Opus 4.8 at $5 / $25.
  • It ships with a native 1M-token context window and is the default across Free, Pro, Max, Team, and Enterprise, plus Claude Code and the API.

[Figure: Anthropic model tiers and where Sonnet 5 lands. Price (input per 1M tokens) vs. capability: Haiku 4.5 (cheapest), Sonnet 5 ($2 per 1M input, 63.2% SWE-bench Pro), Opus 4.8 ($5 per 1M input, 69.2% SWE-bench Pro, top tier).]
Fig 1 · Anthropic's ladder used to trade price for capability in big steps. Sonnet 5 compresses that: at introductory pricing its input tokens cost 2.5x less than Opus 4.8's, yet it sits close to Opus on coding and reasoning, which is why it is now the default model.

What did Anthropic actually ship on June 30?

Sonnet 5 is the successor to Sonnet 4.6 and the most agentic model in the Sonnet line so far. Anthropic frames it as near-Opus performance for developers at a fraction of the cost, and the positioning is deliberate: rather than push the absolute ceiling, this release attacks the price of getting frontier-class results. It launched simultaneously across claude.ai (as the new default for Free and Pro), the Claude Developer Platform, and Claude Code, with a native 1M-token context window and higher rate limits. The pitch to teams is simple: keep the workflow, cut the bill.

How close does it really get to Opus 4.8?

On Anthropic's own numbers, remarkably close, and on one axis it pulls ahead. Sonnet 5 reports 85.2% on SWE-bench Verified, 63.2% on the harder SWE-bench Pro, 78.3% on SWE-bench Multilingual, and 81.2% on OSWorld-Verified computer use. On Terminal-Bench 2.1 it reaches 80.4%, a jump from Sonnet 4.6's 67.0% that puts it level with or above Opus 4.8 on that specific test, the first time a Sonnet has done that. On reasoning, Humanity's Last Exam with tools comes in at 57.4%, essentially matching Opus 4.8's 57.9%. The one genuine gap left is SWE-bench Pro, where Opus 4.8's 69.2% still leads by about six points on large multi-file diffs.

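Metric                       | Claude Sonnet 5 | Sonnet 4.6 | Opus 4.8
-----------------------------|-----------------|------------|---------
SWE-bench Verified           | 85.2%           | ~82%       | ~86%
SWE-bench Pro                | 63.2%           | 58.1%      | 69.2%
Terminal-Bench 2.1           | 80.4%           | 67.0%      | ~82.7%
Humanity's Last Exam (tools) | 57.4%           | lower      | 57.9%
Input price / 1M tokens      | $2 (intro)      | $3         | $5
Output price / 1M tokens     | $10 (intro)     | $15        | $25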

Two caveats keep this honest. Third-party harnesses report different figures: Cursor's production CursorBench lists Sonnet 5 at 61.2% against Opus 4.8 at 63.8%, so the exact gap depends on scaffold and settings. And Sonnet 5 uses an updated tokenizer, so the same text can map to 1.0 to 1.35x as many tokens (up to 35% more), which trims some of the headline price advantage on token-heavy workloads.
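
To put a number on the tokenizer caveat, here is a minimal sketch of the arithmetic. It assumes Opus 4.8 tokenizes at the baseline rate (this article does not say whether its tokenizer changed), and the helper names `effective_input_price` and `advantage_over_opus` are illustrative only, not part of any SDK.

```python
# Prices are USD per million input tokens, taken from the figures in this article.
SONNET_5_INTRO = 2.00
SONNET_5_STANDARD = 3.00
OPUS_4_8 = 5.00


def effective_input_price(price_per_million: float, token_inflation: float) -> float:
    """Price per million *baseline* tokens when the same text maps to
    `token_inflation` times as many tokens under the new tokenizer."""
    if token_inflation < 1.0:
        raise ValueError("token_inflation is expected to be >= 1.0")
    return price_per_million * token_inflation


def advantage_over_opus(sonnet_price: float, token_inflation: float) -> float:
    """How many times cheaper Sonnet 5 is than Opus 4.8 on input tokens.
    Assumption: Opus 4.8 token counts are the baseline (inflation 1.0)."""
    return OPUS_4_8 / effective_input_price(sonnet_price, token_inflation)


for inflation in (1.0, 1.35):
    intro = advantage_over_opus(SONNET_5_INTRO, inflation)
    standard = advantage_over_opus(SONNET_5_STANDARD, inflation)
    print(f"inflation {inflation:.2f}x: intro {intro:.2f}x cheaper, "
          f"standard {standard:.2f}x cheaper")
```

Under these assumptions, the worst-case 1.35x inflation shrinks the 2.5x introductory advantage to roughly 1.85x, and the post-introductory advantage from about 1.67x to roughly 1.23x.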

[Figure: SWE-bench Pro agentic coding scores (% resolved). Sonnet 4.6: 58.1, Sonnet 5: 63.2, Opus 4.8: 69.2.]
Fig 2 · benchmark. On SWE-bench Pro, Sonnet 5's 63.2% sits between last generation's 58.1% and Opus 4.8's 69.2%, while its introductory input price is 2.5x lower than Opus's. Figures: Anthropic.

Why does making it the default matter so much?

Defaults decide usage. By putting Sonnet 5 in front of every free and paid user on launch day, Anthropic ensures that the median Claude conversation now runs on a model that is close to frontier quality, which raises the floor for hundreds of millions of interactions at once. For developers on the API, the calculation is starker: agent loops fire thousands of calls, so an input-token price 2.5x lower than Opus's (at introductory rates) is the difference between a workflow that pencils out and one that does not. This is the same downward pressure the whole industry is under: capability is sliding into cheaper tiers, and Sonnet 5 is the clearest 2026 example of it.
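
A rough sketch of how that adds up over an agent run, using the per-million-token rates quoted in this article. The workload (5,000 calls, 20,000 input and 1,000 output tokens each) is invented for illustration, and the sketch ignores tokenizer differences, caching, and reasoning-effort settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Rates quoted in this article.
SONNET_5_INTRO = Pricing(2.00, 10.00)
OPUS_4_8 = Pricing(5.00, 25.00)


def run_cost(pricing: Pricing, calls: int, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of `calls` requests, each with the given token counts."""
    per_call = (input_tokens * pricing.input_per_million
                + output_tokens * pricing.output_per_million) / 1_000_000
    return calls * per_call


# Hypothetical workload, not a measured one.
CALLS, IN_TOK, OUT_TOK = 5_000, 20_000, 1_000

sonnet = run_cost(SONNET_5_INTRO, CALLS, IN_TOK, OUT_TOK)
opus = run_cost(OPUS_4_8, CALLS, IN_TOK, OUT_TOK)
print(f"Sonnet 5 (intro): ${sonnet:,.2f}")
print(f"Opus 4.8:         ${opus:,.2f}")
```

With these made-up numbers the run costs $250 on Sonnet 5 at introductory rates and $625 on Opus 4.8. The ratio stays at 2.5x whatever the input/output mix, because both rates differ by the same factor.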

Who should still reach for Opus?

Opus 4.8 remains the ceiling, and the six-point SWE-bench Pro lead is real for the hardest, largest refactors where every point of accuracy saves a human review cycle. Anthropic itself notes that at extra-high reasoning effort Sonnet 5 can cost more than Opus for similar quality, so the tiers are not a clean "always cheaper" story. The practical rule: use Sonnet 5 as the workhorse for the enormous middle of coding and knowledge tasks, and escalate to Opus for the sensitive, high-stakes diffs where the last few points matter more than the bill.
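
That rule of thumb could be sketched as a small routing function. Everything here is hypothetical: the `Task` fields, the 20-file cutoff, and the model labels, which are plain strings for this example and not API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    files_touched: int  # rough size of the expected diff
    high_stakes: bool   # e.g. security-sensitive or hard to roll back


# Plain labels for this sketch, not API model identifiers.
WORKHORSE = "Sonnet 5"
ESCALATION = "Opus 4.8"

# Hypothetical cutoff; tune it against your own review data.
LARGE_DIFF_FILES = 20


def choose_model(task: Task) -> str:
    """Default to the cheaper model; escalate for large or high-stakes diffs."""
    if task.high_stakes or task.files_touched >= LARGE_DIFF_FILES:
        return ESCALATION
    return WORKHORSE


print(choose_model(Task(files_touched=3, high_stakes=False)))
print(choose_model(Task(files_touched=45, high_stakes=False)))
print(choose_model(Task(files_touched=2, high_stakes=True)))
```

A real router would also need to account for reasoning-effort settings, since a high enough setting on the cheaper model can erase the saving.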

What to watch · 2026
  • The tokenizer tax. The new tokenizer can inflate token counts by up to 35% (1.0 to 1.35x). Measure your real cost, not the sticker price.
  • Post-intro pricing. The $2/$10 rate ends August 31 and steps to $3/$15. That still undercuts Opus, but budget for it.
  • Third-party parity. Watch whether independent harnesses like CursorBench converge on Anthropic's numbers or keep showing a wider gap.
  • The IPO backdrop. Sonnet 5 lands as Anthropic races toward a public listing. A cheaper flagship is also a revenue-and-usage story.
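
For budgeting around the price step, a minimal sketch follows. It assumes the introductory rate applies through August 31, 2026 inclusive, and the monthly volume (400M input tokens, 50M output tokens) is a made-up example; `sonnet5_rates` and `monthly_cost` are illustrative helpers.

```python
from datetime import date

# Last day of the introductory rate, per the article (assumed inclusive).
INTRO_END = date(2026, 8, 31)

INTRO_RATES = (2.00, 10.00)     # (input, output) USD per 1M tokens
STANDARD_RATES = (3.00, 15.00)


def sonnet5_rates(on: date) -> tuple[float, float]:
    """Return the (input, output) rates in effect on the given date."""
    return INTRO_RATES if on <= INTRO_END else STANDARD_RATES


def monthly_cost(on: date, input_millions: float, output_millions: float) -> float:
    """USD cost for a month's volume, priced at the rates in effect on `on`."""
    in_rate, out_rate = sonnet5_rates(on)
    return input_millions * in_rate + output_millions * out_rate


# Hypothetical monthly volume: 400M input tokens, 50M output tokens.
before = monthly_cost(date(2026, 8, 15), 400, 50)
after = monthly_cost(date(2026, 9, 15), 400, 50)
print(f"August: ${before:,.2f}  September: ${after:,.2f}  (+{after / before - 1:.0%})")
```

Because both rates rise by the same proportion, the same workload costs 50% more after the step regardless of its input/output mix.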

Our take

Sonnet 5 is the most important model release of the summer precisely because it is not a moonshot. Anthropic looked at where the money actually goes, the billions of routine coding and reasoning calls, and shipped a model that does that work at close to Opus quality for less than half the price at introductory rates, and 60% of it afterward. That is a more consequential move than another point on a leaderboard, because it changes what teams can afford to automate. The gap to Opus on the very hardest tasks is real and worth respecting, and the tokenizer change means you should verify your own numbers rather than trust the headline. But as the new default for the entire Claude userbase, Sonnet 5 quietly resets the baseline for what "good enough" costs, and every competitor now has to answer it.

Primary sources

Original analysis by GenZTech. Figures current as of July 2026. Source: anthropic.com