Kept Handing Me to a Weaker One… Got Pulled Entirely

by lukasz | Jul 1, 2026 | Essays

A Senteri follow-up. The question hasn't changed. The menu has.


On June 10, I published a field report about Claude Fable 5: how, the moment I brought it real defensive-security questions, it quietly handed me to a weaker model, Opus 4.8, and told me so. The thesis was narrow: the most capable public model wasn't, for my work, the most capable option. Use Opus directly, I concluded, or pursue vetted access to the unfenced version, Mythos 5.

Two days later, the question stopped being mine to answer.

What happened in between

On June 12, 2026, the U.S. government — citing national security authorities under export control law — directed Anthropic to suspend all access to Fable 5 and Mythos 5 for any foreign national, anywhere, including Anthropic's own non-U.S. employees.[1] Because nationality can't be verified in real time at the API layer, Anthropic didn't segment the block. It pulled both models for every customer, worldwide, regardless of role, regardless of use case.[1]

Anthropic's account of the trigger: the government believed it had found a way to bypass Fable 5's safeguards. The company says it reviewed a narrow, non-universal technique surfacing a handful of previously known, minor vulnerabilities — the kind of finding it argues other public models can reproduce too.[1] It disputed the proportionality of the response, called it a likely misunderstanding, and said it was working to restore access. It complied anyway, because the directive didn't leave room not to.

I am not going to relitigate whose account of the jailbreak is correct. I don't have access to the classified parts of that argument, and pretending otherwise would be its own kind of dishonesty. What I can do is describe what changed for someone in my position, because that part is just bookkeeping.

Where things stand, eighteen days later

As of today, June 30:

Fable 5 remains suspended for general use — consumers, API developers, Claude Code, every international subscriber. No restoration date has been announced.

Mythos 5, the unfenced model I pointed defenders toward at the end of the last piece, got a partial reprieve on June 27. The government authorized Anthropic to redeploy it, but only to a vetted cohort of U.S. organizations that operate or defend critical infrastructure, reportedly around a hundred of them, plus relevant U.S. government agencies. For everyone else, the suspension remains in force.[2]

So the door I described at the end of the last piece — apply for Glasswing if your work justifies it — is narrower than I made it sound. It was never wide. For non-U.S. defenders specifically, it is now close to shut.

I should say plainly where I sit in this, since the rest of the piece doesn't work without it: I am, by the directive's own definition, a foreign national. I was the day I wrote the first piece, too. It just didn't matter yet, because the model was generally available to everyone. It matters now.

The question that's left

Here's what that leaves, for practically everyone outside that hundred-organization list: not Fable-versus-Opus, the question the last piece was actually about, but Opus 4.8 versus Sonnet 5 — the same question, it turns out, that a lot of people who've never thought about cyber safety classifiers are asking today, because Anthropic shipped Sonnet 5 this morning at a fraction of Opus's price.

So I went and read the Sonnet 5 launch materials with the same question I brought to Fable: is this one usable for my work.

The answer is documented, not inferred. Anthropic states plainly that Sonnet 5 wasn't deliberately trained on cybersecurity tasks, and that on evaluations built to test dangerous cyber skills — developing software exploits, specifically — it performs substantially below Opus 4.8.[3] In the one disclosed benchmark, a vulnerability-discovery test run against Firefox in partnership with Mozilla, neither Sonnet model produced a working exploit, though Sonnet 5 showed a slightly higher partial-success rate than its predecessor — a difference Anthropic attributes to general intelligence gains rather than anything resembling cyber training.[3] Sonnet 5 ships with cyber safeguards on by default, the same tier as Opus 4.7 and 4.8, because Anthropic judged its overall cyber risk to be low.[3] And in a footnote easy to miss if you're not looking for it: Anthropic recommends Opus 4.8, not Sonnet 5, for cybersecurity work that needs reduced guardrails.[3]

That's not a reroute. Nobody redirects your session. It's just true, the way a spec sheet is true — and it points the same direction the reroute used to. If the work is defensive security, reach for Opus.

What's actually different this time

The last piece was about a company's product design choice — a classifier deciding, request by request, that a topic warranted a weaker model, and saying so. That's friction with a face on it. You could see the seam, name it, route around it.

This is a different kind of instrument. A government directive doesn't discriminate by request; it discriminates by who's asking, applied at a layer, citizenship, that an API can't read. The blunt result was that everyone lost the tool at once, attackers and defenders and people writing essays about AI safety alike, for eighteen days and counting. Meanwhile, the hundred organizations judged trustworthy enough to get the unfenced version back were picked by a process the rest of us can't see and didn't apply to.

I don't think that makes the directive wrong. I think it makes it a different shape of cost than the one the last piece was measuring, and worth naming as its own thing rather than folding into the same complaint. A safety classifier asks what are you asking. An export control directive asks who are you — and right now, for a defender outside the small list of approved organizations, the honest answer to "can I use the most capable public model for security work" hasn't changed since June 10. It's just a different model doing the declining.


Observed 12–30 June 2026. Mechanism and restoration details drawn from Anthropic's public statement on the suspension, subsequent reporting on the partial Mythos 5 restoration, and Anthropic's published materials for Claude Sonnet 5.

[1] Anthropic, "Statement on the US government directive to suspend access to Fable 5 and Mythos 5," anthropic.com/news/fable-mythos-access

[2] NBC News, "U.S. government gives Anthropic green light for limited re-release of Mythos 5," citing reporting that Fable 5 remains subject to the government's restrictions, nbcnews.com

[3] Anthropic, "Introducing Claude Sonnet 5," anthropic.com/news/claude-sonnet-5
