head-to-head

MetricClaude Sonnet 5Gemini 3.1 Pro
SWE-bench Verified85.2%80.6%
SWE-bench Pro63.2%54.2%
Terminal-Bench80.4% (TB2.1)
Input $ / 1M$2
Context1M
Open weightsNoNo
MakerAnthropicGoogle DeepMind

when to pick each

Pick Claude Sonnet 5 if

The best closed-model value — near-Opus scores at ~2.5× less, and the default daily driver for most developers.

Pick Gemini 3.1 Pro if

Google's strongest coding model today, with deep Workspace/Cloud integration. (A 3.5 Pro is expected but not shipped.)

Full reviewsClaude Sonnet 5, decoded

Ranked on our AI Coding Leaderboard, updated 2026-07-02. Scores are confirmed against primary sources; prices are per 1M input tokens and can change.

Primary sources