head-to-head
| Metric | Claude Fable 5 | Claude Opus 4.8 |
|---|---|---|
| SWE-bench Verified | 95.0% | ~86% |
| SWE-bench Pro | 80.3% | 69.2% |
| Terminal-Bench | — | ~82.7% (TB2.1) |
| Input price (USD per 1M tokens) | $10 | $5 |
| Context window (tokens) | 1M | 1M |
| Open weights | No | No |
| Maker | Anthropic | Anthropic |
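
To make the price row concrete, here is a minimal sketch of the input-side cost arithmetic. It uses only the per-1M input prices from the table above. The workload (200 tool calls at 50,000 input tokens each) and the helper name `input_cost_usd` are hypothetical, chosen for illustration. Output-token pricing is ignored.

```python
# Input prices in USD per 1M tokens, copied from the table above; they can change.
INPUT_PRICE_PER_1M = {
    "Claude Fable 5": 10.0,
    "Claude Opus 4.8": 5.0,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Return the input-side cost in USD for a given number of input tokens."""
    return INPUT_PRICE_PER_1M[model] * input_tokens / 1_000_000


# Hypothetical workload (an assumption, not a measured figure):
# 200 tool calls, each sending 50,000 input tokens.
total_tokens = 200 * 50_000

for model in INPUT_PRICE_PER_1M:
    print(f"{model}: ${input_cost_usd(model, total_tokens):.2f} input cost")
```

Under these assumptions the input-side bill for Claude Fable 5 is twice that of Claude Opus 4.8 for the same token volume, so the choice rests on whether the benchmark gap is worth the doubled rate for a given task.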
when to pick each
Mythos-class flagship for long-horizon agentic runs: the model to reach for when a task spans hours and hundreds of tool calls and has to actually finish.
The hardest agentic refactors and long, autonomous multi-file tasks where every point of accuracy saves a human review cycle.
Full reviews: Claude Fable 5, decoded
Ranked on our AI Coding Leaderboard, updated 2026-07-03. Scores are confirmed against primary sources; prices are per 1M input tokens and can change.
- Anthropic: GENZ TECH, "Claude Fable 5 returns". SWE-bench Verified 95.0% (vals.ai independent eval) is the highest confirmed score of any model. SWE-bench Pro 80.3% uses Anthropic's own scaffolding and is contested. Restored Jul 1, 2026 after a 20-day export-control suspension. Pricing $10/$50 per 1M.
- Anthropic: Anthropic, Claude Opus 4.8. Scores are Anthropic-reported; independent evals (vals.ai) track within ~1 point.
- Benchmark: SWE-bench, the real-GitHub-issue benchmark