A three-part series on the rise, sudden removal, and contested aftermath of Claude Fable 5, the most powerful AI model Anthropic has ever released to the public. Part 1 covers the spring build-up and the June 9 launch.

In late March 2026, a misconfigured data store at Anthropic briefly exposed a draft company blog post that mentioned an internal system code-named “Capybara.” Reporters connected it to a model the company was preparing to announce, and the markets reacted before Anthropic said anything official. Shares of major cybersecurity firms, including CrowdStrike and Palo Alto Networks, slipped on the prospect that an AI system might be about to automate a large part of the work those companies sell. Investors had guessed, correctly, that something unusual was on the way.

The April reveal

On April 7, Anthropic confirmed the reports and introduced Claude Mythos Preview. The announcement was strange in two respects: the model was remarkable at finding and exploiting flaws in software, and the company said it would not release it to the public.

Anthropic’s claim was that Mythos could work through code the way a senior security researcher does, only faster and without anyone steering it. In roughly seven weeks of testing it had surfaced more than 2,000 previously unknown vulnerabilities, and it did not stop at finding them. It wrote working exploits. To make the point concrete, the company published a few examples. One was a flaw in OpenBSD, an operating system with a reputation for near-paranoid security, that had gone unnoticed for 27 years. Another was a 16-year-old bug in FFmpeg, the media library buried inside an enormous amount of everyday software, sitting in a stretch of code that automated tools had run through millions of times without catching anything. In a third case the model found several separate weaknesses in the Linux kernel and stitched them together on its own to climb from an ordinary user account to full control of the machine. In a benchmark built around one browser engine, Anthropic said the model produced 181 distinct working exploits, and against FreeBSD’s network file system it built an exploit for a 17-year-old remote-code-execution bug that gave it root access from across a network.

Outside evaluators came to similar conclusions. The U.K. AI Security Institute, which tests frontier models for the British government, rated Mythos a clear step beyond anything it had assessed. It said Mythos was the first model to carry a simulated multi-stage attack on a corporate network all the way through, and the first to clear both of the institute’s cybersecurity test ranges. Anthropic’s own language captured the unease around a release like this. It called Mythos at once its best-aligned model and its most alignment-risky one, and used a mountaineering comparison to explain the paradox: a better guide is also a more dangerous one, because a better guide takes clients higher up the mountain.

The choice to hold the model back had few recent precedents. The last time a major lab had judged one of its own systems too dangerous for a normal release was OpenAI’s short-lived hesitation over GPT-2 in 2019, a decision that looks modest in hindsight. Instead of a public launch, Anthropic stood up Project Glasswing, a defense-only program that handed Mythos Preview to a small set of large organizations so they could find and patch flaws in important software before attackers reached them. The launch partners read like a who’s who of the technology and finance worlds: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks, with dozens more organizations added shortly after. Anthropic committed $100 million in usage credits to fund the work and donated several million dollars more to open-source security groups.

The skeptics

Not everyone treated the announcement as a crisis, and the dissent matters, because it came back with force two months later. Several respected security experts read the rollout as effective marketing draped over a real but oversold capability. Peter Swire, a Georgia Tech professor who advised the Clinton and Obama administrations, noted that security vendors have every incentive to play up worst-case outcomes, and argued the real harm to defenders was probably much smaller than the most alarming version implied. Ciaran Martin, who once ran Britain’s National Cyber Security Centre, granted that Mythos was a significant development while saying flatly that it was not the end of the world. One research group took the exact vulnerabilities Anthropic had showcased, ran them through cheaper open-source models, and got much of the same analysis back, which raised an awkward question about how large a lead the company’s headline examples actually proved.

A company moving fast

There was a geopolitical edge to all of this from the beginning. Anthropic pointed out that a Chinese firm had already built its own vulnerability-hunting agent, and the company had spent the prior year warning about precisely this kind of capability falling into the wrong hands. It had earlier disclosed what it described as a cyber-espionage operation in which state-linked Chinese hackers used its own Claude tools to target roughly 30 organizations. The fear was not only direct misuse of Mythos, but the possibility that a rival could copy or distill its abilities.

The company carrying all this was, by the spring of 2026, growing at a startling pace and attracting a matching amount of scrutiny. Its revenue had climbed from an annualized figure near $30 billion toward a run rate of about $47 billion, it had quietly filed paperwork for an initial public offering, and it was reportedly valued close to $965 billion. It had also taken to warning, repeatedly, that the field was nearing the point where systems might start improving themselves, and had urged the major labs to agree on something like a shared brake pedal.

Its position in Washington was more tangled than the partner list let on. In March the Defense Department had labeled Anthropic a supply-chain risk, a designation normally reserved for foreign adversaries, and the administration had signaled that federal agencies should stop using the company’s technology, even as some of those agencies kept testing Mythos for defensive work. Reporting at the time suggested the government was already uncomfortable with how widely Anthropic wanted to extend Glasswing. On June 2, the company broadened access anyway, bringing the program toward roughly 150 organizations across more than 15 countries, with an emphasis on operators of critical infrastructure.

Building the public version

Anthropic had always said its long-term aim was to put Mythos-level capability in the hands of ordinary users. The hard part was doing that without simply giving the world a weapon. Fable 5 was the answer.

Fable 5 was not a separate model from Mythos. It was the same system with a layer of safety classifiers added on top. When a user raised a sensitive subject (cybersecurity, biology, chemistry, or the methods used to copy a model’s abilities), Fable would decline and route the request to the previous flagship, Claude Opus 4.8, which would return a more cautious answer. Questions aimed at advancing cutting-edge AI research got similar handling. Anthropic deliberately set these filters to err toward refusal, and said they triggered in under 5 percent of sessions, meaning the great majority of conversations ran entirely on Fable itself.
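
For readers who think in code, the classify-then-route arrangement described above can be pictured roughly as follows. This is a minimal sketch and not Anthropic's implementation: the keyword lists, the model identifiers "fable-5" and "opus-4.8", and the names classify and route are all hypothetical, and a simple keyword match stands in for what would in practice be trained classifiers.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists for the categories named in the article.
# A real safety classifier would be a trained model, not a keyword match.
SENSITIVE_KEYWORDS = {
    "cybersecurity": ("exploit", "malware", "privilege escalation"),
    "biology": ("pathogen", "toxin"),
    "chemistry": ("nerve agent", "explosive"),
    "model distillation": ("distill", "extract the weights"),
}

PRIMARY_MODEL = "fable-5"    # placeholder identifier, not a real API name
FALLBACK_MODEL = "opus-4.8"  # placeholder identifier, not a real API name


@dataclass(frozen=True)
class Route:
    model: str                    # which model answers the request
    flagged_topic: Optional[str]  # the sensitive category, if any


def classify(prompt: str) -> Optional[str]:
    """Return the first sensitive category the prompt matches, else None."""
    text = prompt.lower()
    for topic, keywords in SENSITIVE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def route(prompt: str) -> Route:
    """Send flagged prompts to the fallback model and the rest to the primary."""
    topic = classify(prompt)
    if topic is None:
        return Route(PRIMARY_MODEL, None)
    return Route(FALLBACK_MODEL, topic)
```

The design choice worth noticing is that the filter sits in front of the model rather than inside it. A filter tuned to err toward refusal will also catch harmless requests that merely sound sensitive, which is the trade-off some security professionals complained about at launch.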

The company paired the filters with testing it described in unusual detail and with monitoring it had not imposed before. It said the models had been through more than 1,000 hours of red-teaming with the U.S. government, the U.K. AI Security Institute, and outside firms, along with a paid bug-bounty program, and that none of it had turned up a universal way around the guardrails, though the British institute had made partial progress on a narrower bypass within hours of starting. Anthropic was candid that no provider could promise perfect resistance to jailbreaks, which is why it added a backstop that drew immediate pushback from corporate customers. It required that all traffic to its Mythos-class models be retained for 30 days, overriding contracts that had promised some customers their data would never be kept at all. The reasoning was that some attacks only become visible across many requests over time.

June 9

Fable 5 went live on Tuesday, June 9, as the first Mythos-class model anyone could use. Anthropic called it the most capable model it had ever released publicly and pointed to strong scores across software engineering, analytical work, and tasks involving images. Earlier disclosures had put its results well ahead of the previous flagship on demanding benchmarks, including a large jump on a graduate-level math-olympiad test and a score around 94 percent on a standard software-engineering evaluation. The launch demonstrations were chosen for impact. The payments company Stripe said it had used the model to run a migration across 50 million lines of Ruby code in a single day, work it estimated would otherwise have taken a team months. The model played through a Pokémon game using only what it could see on the screen. The Wharton professor Ethan Mollick described handing it a long specification and watching it work on its own for more than nine hours to produce a usable research tool. The unrestricted version, Mythos 5, went to the vetted Glasswing partners. Anthropic priced both at $10 per million tokens of input and $50 per million of output, and gave its paying subscribers free access through June 22.
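
To put the list prices in concrete terms, here is a small worked calculation. It uses only the two per-million-token rates quoted above. The function name and the token counts are illustrative assumptions, and the sketch ignores any discounts, caching, or subscription terms that may have applied.

```python
# Published list prices from the launch, in U.S. dollars per million tokens.
INPUT_USD_PER_MILLION = 10
OUTPUT_USD_PER_MILLION = 50


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the list-price cost of one request (hypothetical helper)."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Illustrative job: 200,000 tokens read and 20,000 tokens written.
# Input costs $2.00 and output costs $1.00, for $3.00 in total.
print(request_cost_usd(200_000, 20_000))
```

Because output is priced at five times the input rate, jobs that generate a lot of text cost far more than jobs that mostly read.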

Two things from launch week would matter more than they seemed to at the time. The first was a complaint rather than praise: some security professionals found the safeguards stifling. Valentina Palmiotti, a researcher at IBM’s X-Force, told TechCrunch that the model rejected requests only loosely connected to security work. The second arrived a day after launch, when Anthropic’s chief executive, Dario Amodei, published an essay arguing that the U.S. government should have clear legal authority to block or reverse the deployment of frontier models that failed independent safety testing. Three days after the launch, the government did exactly that to Anthropic.

Next, Part 2: the letter that arrived on a Friday evening, and the decision to switch both models off everywhere at once.


Sourcing: Anthropic’s launch and Project Glasswing materials and system documentation; reporting and analysis from VentureBeat, Scientific American, Just Security, the Cloud Security Alliance, CNBC, TechCrunch, Tom’s Hardware and Fortune; and independent assessments from the U.K. AI Security Institute. Some early items, including the March data leak and the related market moves, rest on contemporaneous reporting. Capability figures reflect Anthropic’s own disclosures and third-party testing as described publicly, and at least one research group has questioned how much of a gap the showcased examples demonstrate.