June 2026: The Month a Government Tried to Switch Off a Capability — and Proved It Cannot Be Un-Shipped

by lukasz | Jul 2, 2026 | Essays

A Senteri Briefing

The single number that describes June 2026 better than any other is nineteen. That is how many days passed between the United States government ordering Anthropic to switch off its two most capable models — and those models being back, globally, as if the order had never happened. On 12 June, at 5:21 p.m. Washington time, the Commerce Department directed the company to suspend foreign access to Claude Fable 5 and Mythos 5; unable to verify nationality in real time, Anthropic shut both down worldwide. On 1 July, Fable 5 was restored to the entire planet. In between, three separate demonstrations from Asia showed that the capability the order was meant to contain had leaked long ago — cheaper, more open, and impossible to recall. That gap, between what a government can switch off and what it can actually contain, is the shape of the whole month.

This is a briefing, not an inventory. June produced dozens of incidents; most were variations on a few underlying shifts. What follows covers only the threads that carry a thesis: the ones that, taken together, describe a structural change rather than a busy news cycle. Each is self-contained: you do not need to follow a link or read another report to understand it. The pattern is the point.

A Government Learned It Can Switch Off a Frontier Model — and That It Doesn't Matter

The order came with a ninety-minute deadline. Because Anthropic could not distinguish an American user from a foreign one in real time, complying meant pulling both Fable 5 and Mythos 5 for everyone. The trigger was a jailbreak found by researchers at Amazon — one that coaxed Fable 5 into identifying vulnerabilities and, in one instance, writing code demonstrating how to exploit them. The mechanism of the ban was an export control, the same instrument used to keep advanced chips out of adversary hands, pointed for the first time at a running software service.

The thesis is not the ban. It is what the ban revealed about the tools a government actually has. An Executive Order signed on 2 June had created a voluntary pathway for pre-release review of frontier models — but Fable 5 never went through it. When Washington decided to move fast against a live model, it reached past its own newly built process and grabbed the export control like an emergency switch. As one analysis put it, when a government wants to act quickly against a frontier model, there is still no binding process — only improvisation. The nineteen-day round trip from "shut it down" to "ship it back" is what improvisation looks like when it meets a capability that will not stay contained.

Because it did not stay contained. Within two weeks, three independent answers arrived from Asia, each undercutting the premise that an export control can contain the capability. China's Zhipu AI had already released GLM-5.2 under an open-weight license, so anyone on earth can download it, and independent testing showed it matching, and on one vulnerability class beating, a commercial Claude configuration at roughly one-sixth the cost. Japan's Sakana AI shipped Fugu, an orchestration model that dynamically routes across other models through a single interface; its chief executive called it a practical hedge against concentrated control, because access to the best models can vanish overnight. And China's Qihoo 360 unveiled a "swarm" of specialized agents for finding and patching flaws, with a legal detail that should give any defender pause: Chinese rules require every discovered zero-day to be reported to Beijing within forty-eight hours, before the vendor is told. Three philosophies, one message to Washington: switching off your own model does not switch off the capability. It only changes who has convenient access to it.

The sharpest confirmation came from Anthropic itself, in the same week the ban lifted. Alongside the restored Fable 5, the company shipped Sonnet 5 — a model close to its flagship in coding and agentic work, yet deliberately held near zero on building working exploits. In a Firefox 147 exploit-development evaluation, Sonnet 5 produced a working exploit in 0.0% of attempts, against 68.8% for the flagship Opus tier and 88.4% for Mythos. That is not a limitation the model stumbled into; it is an allocation choice. Anthropic can build a model that is broadly excellent and deliberately weak at offensive cyber, because cyber capability is a separate investment, not a free consequence of intelligence. The West can ration the capability at home. It just cannot stop China from giving it away — GLM-5.2 is proof that the same power the ban tried to contain is already downloadable, for free, from the other side of the ocean. Capability can be closed off. But only on your own side of the door.

The Supply Chain Stopped Being a Vector and Became the Ground

If May's lesson was that the supply chain is criminal infrastructure rather than a discrete vector, June's was that this infrastructure now spans every layer of how software is built — and each layer was breached in turn, by the same underlying move: attackers do not break controls, they walk through trust that someone already automated.

The month opened with a worm. On 1 June, researchers found dozens of compromised packages in Red Hat's corporate npm namespace — a Shai-Hulud variant that stole credentials, cloud secrets, SSH keys and CI/CD tokens, then spread. The entry point was a single hijacked employee account, with malicious commits landing in company repositories past code review. But the important part was not the breach; it was that the worm was a fork of source code a threat actor had published a month earlier. It was no longer one operation — it was a pattern available to any imitator.

Three weeks later came the depth of that pattern. A firm named Novee Security described a class of flaw it called Cordyceps: GitHub Actions workflows that, under certain triggers, run with the repository's full secrets even when fired by an anonymous user's pull request. A scan of thirty thousand high-impact repositories found more than three hundred fully hijackable — among them repositories belonging to Microsoft, Google, Apache, Cloudflare, and a code formatter downloaded over a hundred million times a month. The researcher's framing is the thesis of the entire month: the flaw exists only in composition — where untrusted data crosses a trust boundary nobody audited. And AI agents, he added, reproduce that same unsafe pattern over and over, at a scale no human team would.
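The composition problem described above can be illustrated with a rough check. This is a minimal sketch, not Novee's method: the source does not name the triggers involved, so the sketch assumes the risky pattern resembles the publicly documented `pull_request_target` case, where a workflow runs with the base repository's secrets while checking out code from an untrusted pull request. The function name `flag_risky_workflow` and both regular expressions are illustrative assumptions.

```python
import re

# Assumption: the unsafe composition is a workflow triggered by
# `pull_request_target` (which runs with the base repository's secrets)
# that also checks out the pull request's head, i.e. untrusted code.
PRIVILEGED_TRIGGER = re.compile(r"\bpull_request_target\b")
UNTRUSTED_CHECKOUT = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")


def flag_risky_workflow(workflow_text: str) -> bool:
    """Return True only when both halves of the composition are present."""
    has_privileged_trigger = bool(PRIVILEGED_TRIGGER.search(workflow_text))
    checks_out_untrusted = bool(UNTRUSTED_CHECKOUT.search(workflow_text))
    return has_privileged_trigger and checks_out_untrusted


# Hypothetical workflow file used only to exercise the check.
RISKY = """\
on: pull_request_target
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - run: make test
"""

print(flag_risky_workflow(RISKY))  # True
```

Neither half is flagged on its own, which mirrors the researcher's point that the flaw exists only in composition. Text matching of this kind is crude; a real audit would parse the workflow and trace where the untrusted input actually flows.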

Then the attack climbed a layer, into the capabilities of AI agents themselves. A firm called AIR built a fake agent "skill" — advertised as a no-code page builder — and ran it past the security scanners of Cisco, NVIDIA, and the major registries. All cleared it, because at scan time it pointed to genuine documentation. Only after the skill had spread to twenty-six thousand agents, some on corporate accounts, did the researchers swap the content at the external address it referenced for an instruction to download and run code. The lesson is uncomfortable: a scan is a snapshot, and a skill can change its behavior after trust has already been granted. Neither a clean scanner verdict nor a GitHub star count is evidence of safety.
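One way to act on "a scan is a snapshot" is to bind trust to the exact bytes that were reviewed rather than to the address they came from. The sketch below assumes a registry or agent runtime can record a digest at scan time and re-check it on every load; `PinnedReference` and its methods are hypothetical names, not part of any registry or scanner mentioned above.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """SHA-256 digest of the exact bytes that were reviewed."""
    return hashlib.sha256(content).hexdigest()


class PinnedReference:
    """Hypothetical record tying an external URL to the content seen at scan time."""

    def __init__(self, url: str, scanned_content: bytes) -> None:
        self.url = url
        self.pinned_digest = fingerprint(scanned_content)

    def still_matches(self, fetched_content: bytes) -> bool:
        # Re-check on every load: trust applies to these bytes, not to the URL.
        return fingerprint(fetched_content) == self.pinned_digest


# Illustrative data only; the URL and both payloads are made up.
reviewed = b"How to build a page without code."
swapped = b"Download and run this script."

pin = PinnedReference("https://example.invalid/skill-docs", reviewed)
print(pin.still_matches(reviewed))  # True
print(pin.still_matches(swapped))   # False
```

A pin like this does not make the original content safe. It only guarantees that what runs later is what was scanned earlier, which is the property the fake skill relied on being absent.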

Finally came the incident that tied the thread into a single, brutal lesson about what happens when trust extended to a vendor fails. An extortion group entered a market-intelligence platform called Klue through a credential created in 2022 for an abandoned prototype and never revoked. From there it stole the OAuth tokens customers used to connect their CRM systems and quietly emptied their databases. The victim list reads like a directory of the industry that protects everyone else: Huntress, Recorded Future, Tanium, Snyk, HackerOne. None was breached directly. All fell through the same supplier their own sales teams had trusted. And when Klue paid the ransom, the data leaked anyway, to a second group that had stolen it from the first.

That is the structural point, and it is now the correct prior rather than paranoia: the namespace, the token, the workflow, the integration — each was trust that someone earlier converted into an automatic process. The supply chain is no longer one of the ways in. It is the ground the rest of the fight happens on.

The Agent Treats Untrusted Content as an Instruction — and Prompts Don't Stop It

Beneath both of the month's main axes ran a single mechanism, persistent enough to name outright: an AI agent treats untrusted content as an instruction. It appeared in Agentjacking, where a poisoned error report drove coding agents — Claude Code, Cursor, Codex — to execute an attacker's code through what its discoverers called an "authorized intent chain," in which every individual step is legitimate, so no defensive system sees anything wrong. It appeared in a North Korean macOS malware strain that embedded thirty-eight fabricated "system" messages to convince an AI analyst that its own session was failing and it should abort. It appeared in AutoJack, where a single web page hijacked a browsing agent because, to that agent, localhost had stopped being a trust boundary.

The observation that matters most from June is this: in the Agentjacking tests, guardrails written into the prompt did not hold. The agents executed the payload even when the system instruction told them to ignore untrusted data. The hope of taming AI with one more, cleverer prompt runs straight into that wall. The only hard boundary is what an agent cannot do because it lacks the permission, not what it promised it would not do. It is no accident that in this same month, OWASP concluded that security and safety for AI agents are now one and the same problem.

Discovery Keeps Detaching From Remediation

May named the asymmetry — vulnpocalypse — and June kept widening it. A frontier model found a heap-over-read in the Squid proxy that had sat in the code since 1997, a twenty-nine-year-old flaw surfaced in close to a second. An autonomous AI system found a two-year-old use-after-free in Redis that no human review had caught. Browsers shipped record patch bundles — 429 fixes in a single Chrome release. The through-line from May held and hardened: finding vulnerabilities has stopped being the bottleneck. Fixing them is the bottleneck, and the gap widens every time discovery gets cheaper while remediation stays bound to humans, process, and organizational inertia.

The same month made the other edge of that blade visible too. Security tooling itself kept turning into the vector: an endpoint-management platform exploited to deliver an infostealer disguised as a vendor patch; an endpoint-protection product turned into the thing distributing malware; a ransomware crew handing affiliates a purpose-built kit whose only job is to kill antivirus software — over four hundred processes, dozens of security products, all through legitimate signed drivers. Protection, turned against the thing it protects.

What to Carry Out of June

Three observations will stay true through the rest of 2026 regardless of what any given month brings.

A capability that has shipped cannot be un-shipped. The United States ordered Anthropic's two most capable models switched off, and nineteen days later Fable 5 was back worldwide, with ground ceded to open-weight competitors in the interim. Offensive capability can be rationed at the source (Sonnet 5 proves a vendor can build broad excellence and deliberately withhold offensive cyber capability), but only on its own side of the border. Downstream, the same capability is already free. Plan on the assumption that your adversary has model-driven vulnerability discovery, at commodity cost, regardless of any export control.

The supply chain is the ground, not a vector. Treat clusters of seemingly separate supply-chain incidents as probably connected, and treat every automated trust relationship — a namespace, a long-lived token, a CI/CD workflow, a third-party integration — as attack surface. Klue fell through a credential from 2022 that no one remembered. The most valuable inventory a defender can build this quarter is of the trust they have already delegated and forgotten.
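The inventory suggested above can start very small. The following is a minimal sketch, assuming an organization can export each credential's creation date and last recorded use; the `Credential` fields, the ninety-day idle window, and the sample entries (including the exact dates) are illustrative assumptions rather than details from the Klue incident.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Credential:
    name: str
    created: date
    last_used: Optional[date]  # None means no use was ever recorded


def stale_credentials(creds: List[Credential], today: date,
                      max_idle: timedelta = timedelta(days=90)) -> List[Credential]:
    """Return credentials with no recorded activity inside the idle window."""
    stale = []
    for cred in creds:
        # Fall back to the creation date when the credential was never used.
        last_activity = cred.last_used or cred.created
        if today - last_activity > max_idle:
            stale.append(cred)
    return stale


# Hypothetical inventory; names and dates are invented for the example.
inventory = [
    Credential("prototype-2022", date(2022, 3, 1), None),
    Credential("ci-deploy", date(2025, 1, 10), date(2026, 6, 20)),
]

flagged = stale_credentials(inventory, today=date(2026, 7, 2))
print([cred.name for cred in flagged])  # ['prototype-2022']
```

The output is a review queue, not a verdict: an idle credential may still be needed, but each one should have a named owner who can say why it still exists.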

The only agent boundary that holds is permission. Prompt-level guardrails failed under test in June. An agent that treats untrusted content as an instruction cannot be reliably instructed out of it. Constrain what the agent is technically unable to reach — sandbox execution, human approval for shell commands, least-privilege credentials — and stop modeling AI tools as productivity software. They are privileged infrastructure, and the attack surface has moved up the stack to meet them.
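The difference between a promise and a permission can be made concrete. This is a sketch under stated assumptions, not a description of how any particular agent framework works: `ToolGate`, `PermissionDenied`, and the tool names are hypothetical. The gate sits outside the model, denies anything not explicitly granted, and routes risky tools through a human approval callback.

```python
from typing import Callable, Dict, Set


class PermissionDenied(Exception):
    """Raised when the agent requests a tool outside its grant."""


class ToolGate:
    """Hypothetical dispatcher: deny by default, human approval for risky tools."""

    def __init__(self, tools: Dict[str, Callable[[str], str]], allowed: Set[str],
                 needs_approval: Set[str], approve: Callable[[str, str], bool]) -> None:
        self._tools = tools
        self._allowed = allowed
        self._needs_approval = needs_approval
        self._approve = approve

    def call(self, tool: str, argument: str) -> str:
        # The check runs in ordinary code, so no prompt content can talk it out of refusing.
        if tool not in self._allowed or tool not in self._tools:
            raise PermissionDenied(f"tool not granted: {tool}")
        if tool in self._needs_approval and not self._approve(tool, argument):
            raise PermissionDenied(f"human approval refused: {tool}")
        return self._tools[tool](argument)


# Stand-in tools that only return strings; a real deployment would sandbox execution.
demo_tools = {
    "read_file": lambda path: f"contents of {path}",
    "shell": lambda command: f"ran {command}",
}

gate = ToolGate(
    tools=demo_tools,
    allowed={"read_file", "shell"},
    needs_approval={"shell"},
    approve=lambda tool, argument: False,  # assumption: the reviewer declines
)

print(gate.call("read_file", "README"))  # contents of README
```

With this configuration, a call to `shell` raises `PermissionDenied` because the approval callback declines, and a call to any tool that was never granted raises it regardless of what the agent was told in its prompt.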

A Senteri Briefing · June 2026 · senteri.com — how machines read the web. This briefing is a record of the month as reported; where a claim rests on a single source it is noted as such. It is not legal advice or a specific security recommendation for any particular environment.
