Head-to-head

Metric | Claude Fable 5 | GPT-5.5
SWE-bench Verified | 95.0% | 82.6%
SWE-bench Pro | 80.3% | 58.6%
Terminal-Bench | 82.7% (TB2.0)
Input $ / 1M | $10
Context | 1M
Open weights | No | No
Maker | Anthropic | OpenAI

When to pick each

Pick Claude Fable 5 if

You need a Mythos-class flagship for long-horizon agentic runs: the model to reach for when a task spans hours and hundreds of tool calls and has to actually finish.

Pick GPT-5.5 if

You want OpenAI's strongest agentic coder, with the deepest tooling and ecosystem breadth of the closed labs.

Full reviews: Claude Fable 5, decoded

Ranked on our AI Coding Leaderboard, updated 2026-07-03. Scores are confirmed against primary sources; prices are per 1M input tokens and can change.
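To make the per-1M-token pricing concrete, here is a minimal sketch of the arithmetic for a single run. It assumes the "$10/$50 per 1M" figure listed for Claude Fable 5 means $10 per 1M input tokens and $50 per 1M output tokens; the function name and the token counts are hypothetical, chosen only for illustration.

```python
def run_cost_usd(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the dollar cost of one run from per-1M-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical long agentic run: 2M input tokens, 200k output tokens.
# Prices are the assumed input/output split of "$10/$50 per 1M".
cost = run_cost_usd(2_000_000, 200_000,
                    input_price_per_m=10.0, output_price_per_m=50.0)
print(f"Estimated cost: ${cost:.2f}")
```

Because prices can change, treat the two price arguments as inputs to refresh rather than constants.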

Primary sources
  • Anthropic: GENZ TECH, "Claude Fable 5 returns". SWE-bench Verified 95.0% (vals.ai independent eval) is the highest confirmed score of any model. SWE-bench Pro 80.3% uses Anthropic's own scaffolding and is contested. Restored Jul 1, 2026 after a 20-day export-control suspension. Pricing $10/$50 per 1M.
  • OpenAI: vals.ai, "SWE-bench Verified (independent)". The Verified score comes from the vals.ai independent eval; the Pro score is OpenAI-reported (rivals flag possible memorization on Pro).
  • Benchmark: SWE-bench, the real-GitHub-issue benchmark