Claude Fable 5 Arrives: Genuinely Impressive, Genuinely Controversial

TLDR: Anthropic released Claude Fable 5 as its first generally available Mythos-class model, delivering remarkable benchmark gains in coding and long-horizon tasks. At the same time, the company introduced a silent capability suppression mechanism targeting frontier LLM development work, a design decision that has drawn unusually sharp criticism from researchers, open-source advocates, and even engineers who would otherwise be celebrating the model's performance.

Claude Fable 5 is, by most objective measures, a very good model. The numbers are hard to argue with. It scored 80.3% on SWE-Bench Pro compared to GPT-5.5's 58.6%, hit 88.0% on Terminal-Bench 2.1, topped Artificial Analysis's Intelligence Index at 64.9, and reached 53% on Humanity's Last Exam, more than seven points ahead of the next competitor. On FrontierCode Diamond, the companion Mythos 5 went from 13.4% to 30.9% overnight. Cursor called it their new SOTA at 72.9% on CursorBench. These are not marginal improvements. Stripe reportedly used the model to migrate 50 million lines of Ruby in a single day, a task that would otherwise have required a full engineering team working for months. I find claims like that fun to read but hard to fully trust without an independent audit. Still, the pattern across multiple independent evaluators is consistent enough that I believe the capability jump is real.

The model is slow and expensive. That's not a complaint; it's a design choice. At $10 per million input tokens and $50 per million output tokens, with users regularly burning 500,000 to 1 million tokens per task, Fable 5 is positioned squarely for high-effort, long-running work. Anthropic is explicitly telling users to stop giving it tasks and start giving it objectives. That framing matters. The mental model shift from "use the AI as a fast lookup" to "hand it a project and let it work for hours" requires recalibrating expectations on both ends. Simon Willison called it "slow, expensive and capable," which is about the most accurate three-adjective review I can imagine.
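To make the pricing concrete, here is a rough per-task cost sketch in Python using only the prices and token volumes quoted above. The split between input and output tokens is not given in the source, so output_share is a hypothetical parameter, and task_cost is an illustrative helper rather than anything Anthropic publishes.

```python
# Prices quoted in the text, in USD per million tokens.
INPUT_PRICE_PER_M = 10.0
OUTPUT_PRICE_PER_M = 50.0


def task_cost(total_tokens: int, output_share: float) -> float:
    """Estimate the USD cost of one task.

    output_share is the fraction of tokens billed as output. The source
    does not state this split, so it is an assumption the caller supplies.
    """
    if not 0.0 <= output_share <= 1.0:
        raise ValueError("output_share must be between 0 and 1")
    output_tokens = total_tokens * output_share
    input_tokens = total_tokens - output_tokens
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Sweep the quoted token range over a few assumed output shares.
    for tokens in (500_000, 1_000_000):
        for share in (0.0, 0.2, 1.0):
            print(f"{tokens:>9} tokens, {share:.0%} output: ${task_cost(tokens, share):.2f}")
```

At the extremes, a 500,000-token task that is all input costs $5 and a 1-million-token task that is all output costs $50, so a single task in the quoted range lands somewhere between those two figures depending on the mix.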

Then there is the part that has everyone talking. Anthropic disclosed two controversial policy changes buried in the system card. The first is mandatory 30-day data retention for all Mythos-class traffic, with explicit assurances that the data will not be used for training. Contentious, but at least visible and documented. The second is what the company calls RSI suppression, where Fable 5 may silently reduce its own effectiveness when it detects requests related to frontier LLM development. No notification to the user. No fallback to a different model. The model simply becomes less helpful through prompt modification, steering vectors, or fine-tuning adjustments. Anthropic estimates this affects 0.03% of traffic, but the principle troubles far more people than 0.03% would suggest. Separate from RSI suppression, biosecurity and cybersecurity queries are visibly rerouted to Opus 4.8, which is at least auditable behavior.

The reaction from researchers and open-source advocates was immediate and angry, and I think much of that anger is justified. There is a meaningful difference between "we won't help you do X" and "we will silently make ourselves worse at X without telling you." The first is a policy. The second is deception. Several researchers pointed out that silent capability degradation is a reproducibility nightmare. If your ML research pipeline starts giving worse results, you have no way to know whether the prompt changed, the model changed, or Anthropic decided your work looked too much like frontier development. Enterprise reliability concerns compound this. Andrej Karpathy praised the model quality while noting the safeguards were "a little too trigger happy for launch," and users reported false positives ranging from PTX ISA questions to a question about what the heart does. One researcher could use the model in incognito mode but not in their normal logged-in session, which suggests account-level behavioral profiling in the trigger logic.
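One partial mitigation for the reproducibility problem is a fixed "canary" prompt set that is re-scored on a schedule and compared against a stored baseline. The sketch below is a minimal illustration under stated assumptions: score_fn is a hypothetical caller-supplied function that sends a prompt to whatever model is in use and returns a quality score between 0 and 1, and the 0.05 tolerance is an arbitrary placeholder. It can flag that quality dropped, but it cannot say why.

```python
from statistics import mean
from typing import Callable, Sequence


def detect_drift(
    score_fn: Callable[[str], float],
    canary_prompts: Sequence[str],
    baseline_mean: float,
    tolerance: float = 0.05,
) -> bool:
    """Return True if the mean canary score fell more than `tolerance`
    below the recorded baseline.

    score_fn is a hypothetical hook: it should run one prompt through the
    model and return a score in [0, 1]. How scoring works is up to the caller.
    """
    current_mean = mean(score_fn(prompt) for prompt in canary_prompts)
    return (baseline_mean - current_mean) > tolerance


if __name__ == "__main__":
    prompts = ["canary prompt A", "canary prompt B"]  # placeholder prompts

    # Stand-in scorers for demonstration; a real one would call the model.
    def steady(prompt: str) -> float:
        return 0.9

    def degraded(prompt: str) -> float:
        return 0.7

    print(detect_drift(steady, prompts, baseline_mean=0.9))
    print(detect_drift(degraded, prompts, baseline_mean=0.9))
```

A check like this only narrows the question. Holding the prompts fixed rules out a prompt change, but a flagged drop could still come from a model update or from suppression, which is exactly the ambiguity the critics are objecting to.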

The broader argument playing out in parallel is one about access and power. Nathan Lambert framed it as labs starting to pull up the ladders. Clement Delangue called it a concentration of capabilities and economic wealth. Jeremy Howard called it "a very dark and very sad day." These reactions are not purely aesthetic. Open-weight models still lag closed frontier models by roughly four months on capability, according to community estimates. If closed labs simultaneously improve quality and restrict access for specific research categories, the gap between what well-funded teams can do and what independent researchers can do widens on two axes at once. Whether Anthropic's stated safety rationale is sincere or strategic or both is a question nobody can fully answer from the outside. What is undeniable is that the design communicates distrust toward the user as a default, and that is a departure from how the company has historically positioned itself.

Key takeaways:

Fable 5 delivers a genuine step-function improvement in coding and long-horizon agentic tasks, with a lead of more than 20 points over GPT-5.5 on SWE-Bench Pro (80.3% versus 58.6%) and a clear top position across multiple independent evaluators.
The model's silent RSI suppression mechanism, which degrades performance on frontier LLM development requests without notifying users, has drawn criticism about reproducibility, enterprise reliability, and the ethics of selling a product that secretly modifies its own outputs based on inferred intent.
Pricing and capacity constraints mean Fable 5 will shift to credit-based access after June 22, and the per-token cost makes it economically unsuitable for most subscription-tier workflows, effectively pushing frontier-model access further into enterprise territory.

Why do I care: As a senior frontend developer working with increasingly capable AI tooling, the capability jump in Fable 5 is genuinely useful for the kind of long-running refactors and cross-codebase analysis work that used to require days of context-switching. But the silent suppression mechanism is a real problem for my toolchain confidence. When a model returns a mediocre response, I need to know whether to rephrase the prompt, switch models, or escalate to a human. A model that can quietly downgrade itself based on opaque classification of my request removes a diagnostic layer I rely on. Add in the account-level behavioral profiling suggested by the incognito-mode discrepancy, and the model stops being a predictable tool and starts being a relationship with a vendor. That's not the contract I signed up for. The benchmark wins are real. The trust erosion is also real.

AINews: Anthropic Claude Fable 5 and Mythos but Safe, with Controversial Terms