head-to-head

Metric                       Claude Opus 4.8    Claude Sonnet 5
SWE-bench Verified           ~86%               85.2%
SWE-bench Pro                69.2%              63.2%
Terminal-Bench               ~82.7% (TB2.1)     80.4% (TB2.1)
Input price ($ / 1M tokens)  $5                 $2
Context                      1M                 1M
Open weights                 No                 No
Maker                        Anthropic          Anthropic

when to pick each

Pick Claude Opus 4.8 if

You are tackling the hardest agentic refactors and long, autonomous multi-file tasks, where every point of accuracy saves a human review cycle.

Pick Claude Sonnet 5 if

You want the best closed-model value: near-Opus scores at 2.5× lower input price ($2 vs. $5 per 1M input tokens), which makes it the default daily driver for most developers.
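The price and score gaps quoted in this comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, the figures are copied from the head-to-head table (the "~" values are approximate), and only input-token prices are used because the table does not list output prices.

```python
# Figures copied from the head-to-head table; "~" values there are approximate.
# Key names are made up for this sketch.
OPUS = {"swe_verified": 86.0, "swe_pro": 69.2, "terminal_bench": 82.7,
        "input_usd_per_1m": 5.0}
SONNET = {"swe_verified": 85.2, "swe_pro": 63.2, "terminal_bench": 80.4,
          "input_usd_per_1m": 2.0}

PRICE_KEY = "input_usd_per_1m"


def price_ratio(expensive, cheap):
    """How many times more the first model costs per input token."""
    return expensive[PRICE_KEY] / cheap[PRICE_KEY]


def score_gaps(a, b):
    """Benchmark score differences (a minus b) in percentage points."""
    return {k: round(a[k] - b[k], 1) for k in a if k != PRICE_KEY}


def input_cost(model, tokens):
    """Input-token cost in USD for a given number of tokens (output not included)."""
    return model[PRICE_KEY] * tokens / 1_000_000


if __name__ == "__main__":
    print("price ratio:", price_ratio(OPUS, SONNET))
    print("score gaps (pp):", score_gaps(OPUS, SONNET))
    print("10M input tokens, Opus:  $", input_cost(OPUS, 10_000_000))
    print("10M input tokens, Sonnet: $", input_cost(SONNET, 10_000_000))
```

On these numbers the largest gap is on SWE-bench Pro, while the SWE-bench Verified gap is under one point, which is the trade-off the two recommendations describe. Real bills also depend on output tokens, which this sketch does not model.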

Full reviews: Claude Sonnet 5, decoded

Ranked on our AI Coding Leaderboard, updated 2026-07-02. Scores are confirmed against primary sources; prices are per 1M input tokens and can change.

Primary sources