head-to-head

Metric | Claude Fable 5 | DeepSeek V4 Pro
SWE-bench Verified | 95.0% | 80.6%
SWE-bench Pro | 80.3% | 55.4%
Terminal-Bench | 67.9% (TB2.0)
Input $ / 1M tokens | $10 | $0.435
Context | 1M | 1M
Open weights | No | Yes
Maker | Anthropic | DeepSeek
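
To make the price gap concrete, here is a minimal Python sketch that turns the two input prices listed above into a cost for a given token volume. The 50M-token workload and the `input_cost` helper are illustrative assumptions, not figures or code from either vendor, and output-token pricing is left out.

```python
# Input prices in USD per 1M tokens, as listed in the comparison above.
INPUT_PRICE_PER_1M = {
    "Claude Fable 5": 10.0,
    "DeepSeek V4 Pro": 0.435,
}


def input_cost(model: str, input_tokens: int) -> float:
    """Return the input-token cost in USD for one model (hypothetical helper)."""
    return INPUT_PRICE_PER_1M[model] * input_tokens / 1_000_000


# Hypothetical workload: 50M input tokens (an assumed figure for illustration).
tokens = 50_000_000
for model in INPUT_PRICE_PER_1M:
    print(f"{model}: ${input_cost(model, tokens):,.2f}")

# Ratio of the two listed input prices.
ratio = INPUT_PRICE_PER_1M["Claude Fable 5"] / INPUT_PRICE_PER_1M["DeepSeek V4 Pro"]
print(f"Input price ratio: {ratio:.1f}x")
```

Because prices can change, treat the dictionary as a snapshot and update it before relying on the result.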

when to pick each

Pick Claude Fable 5 if

Mythos-class flagship for long-horizon agentic runs: the model to reach for when a task spans hours and hundreds of tool calls and has to actually finish.

Pick DeepSeek V4 Pro if

The cheapest frontier-class coder: the top open-weights score at roughly 11× lower cost than Opus. Best pick when cost or self-hosting is the deciding factor.

Full reviews: Claude Fable 5, decoded

Ranked on our AI Coding Leaderboard, updated 2026-07-03. Scores are confirmed against primary sources; prices are per 1M input tokens and can change.

Primary sources
  • Anthropic: GENZ TECH, "Claude Fable 5 returns". SWE-bench Verified 95.0% (vals.ai independent eval) is the highest confirmed score of any model. SWE-bench Pro 80.3% uses Anthropic's own scaffolding and is contested. Restored Jul 1, 2026 after a 20-day export-control suspension. Pricing: $10/$50 per 1M tokens.
  • DeepSeek: "DeepSeek V4: specs & benchmarks". Independent tracker (llm-stats, June 2026); tied with Gemini 3.1 Pro on Verified, ahead on Pro.
  • Benchmark: SWE-bench, the real-GitHub-issue benchmark