Claude Mythos and Project Glasswing: The Model Anthropic Refuses to Release (and Why It's a Turning Point)

On April 7, 2026, Anthropic did something unprecedented in AI history: announce a model while refusing to release it.

Claude Mythos is not just another model in the benchmark race. It’s the first general-purpose AI model capable of autonomously finding and exploiting zero-day vulnerabilities across all major operating systems and web browsers. And Anthropic decided no one would get unrestricted access to it.

Here’s everything we know, why it matters, and what it changes for the industry.

How We Learned About Mythos

The story begins on March 26, 2026, with an accidental leak. Two security researchers (Roy Paz from LayerX Security and Alexandre Pauwels from Cambridge) discovered roughly 3,000 unpublished assets on an Anthropic cache exposed by mistake. Among them: a draft blog post mentioning a model called “Claude Mythos,” described as “by far the most powerful model ever developed” by Anthropic.

The draft also referenced an internal tier called “Capybara.”

Anthropic confirmed the incident to Fortune, calling it “human error” in their CMS configuration. The cache was shut down within hours. But the damage was done: the world knew something massive was coming.

Anthropic’s official statement: “We’re developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. It’s a step change and the most capable we’ve built to date.”

The Benchmarks: Numbers That Don’t Look Normal

When Mythos results were published on April 7, the technical community had trouble believing them. These aren’t marginal improvements. They’re generational leaps within a single development cycle.

Benchmark	Opus 4.6	Mythos Preview	Gap
SWE-bench Verified (coding)	80.8%	93.9%	+13.1 pts
SWE-bench Pro	—	77.8%	—
Terminal-Bench 2.0	65.4%	82.0%	+16.6 pts
USAMO 2026 (mathematics)	42.3%	97.6%	+55.3 pts
GPQA Diamond (reasoning)	—	94.5%	—
CyberGym (cybersecurity)	66.6%	83.1%	+16.5 pts

The standout number: 97.6% on USAMO 2026. That’s an olympiad-level mathematics competition. Opus 4.6 topped out at 42.3%. Going from 42% to 97% in a single generation has never happened in the history of language models.

In coding, 93.9% on SWE-bench Verified is the highest score ever recorded, across all models (including GPT-5.4).

Cybench pass@1 chart showing Claude Mythos Preview at 100% compared to 89% for Opus 4.5 and 96% for Sonnet 4.6 — Cybench pass@1: Mythos solves 100% of CTF challenges with a perfect success rate on every single challenge. Source: Claude Mythos Preview System Card, Anthropic, April 7, 2026. - Anthropic

Cybersecurity: Where Everything Changes

The benchmarks are impressive. But it’s Mythos’ cybersecurity capabilities that pushed Anthropic to withhold it from the public.

What Mythos Can Do (That No Model Could Before)

To understand the scale of the change, one comparison is enough:

Opus 4.6 tested on Firefox JavaScript engine vulnerabilities: 2 working exploits out of several hundred attempts
Mythos on the same benchmark: 181 working exploits, plus 29 instances of register control

This isn’t 2x better. It’s 90x better.

Firefox 147 JS shell exploitation chart showing Claude Mythos Preview at 84% success rate versus 15.2% for Opus 4.6 and 4.4% for Sonnet 4.6 — Firefox 147 vulnerability exploitation: Mythos achieves 72.4% full code execution (84% with partial control), versus 0.8% for Opus 4.6. The gap is staggering. Source: Claude Mythos Preview System Card. - Anthropic

Nation-State-Grade Exploits

The Mythos system card details exploits that most offensive security teams would spend weeks or months developing:

Full browser exploit: Mythos chained 4 vulnerabilities into a web exploit involving JIT heap spraying, renderer sandbox escape, and OS sandbox escape
Linux privilege escalation: autonomous exploitation of race conditions with KASLR bypass
FreeBSD remote code execution: 20-gadget ROP chains split across multiple network packets via the NFS server

For non-technical readers: these are attacks of a sophistication that only the world’s best offensive security researchers could pull off. An AI model now reproduces them autonomously.

Thousands of Zero-Days in Critical Software

In a few weeks of internal testing, Mythos identified thousands of zero-day vulnerabilities (unknown and unpatched) across:

Every major operating system (Linux, Windows, macOS, FreeBSD, OpenBSD)
Every major web browser
Critical software maintained for decades

The oldest: a 27-year-old bug in OpenBSD (TCP SACK protocol) capable of crashing servers.

Nicholas Carlini, an Anthropic researcher, stated he found “more bugs in the last couple of weeks than I found in the rest of my life combined.”

External cybersecurity testing results for Claude Mythos Preview: first model to solve a cyber range end-to-end, corporate network attack simulation solved autonomously — External testing: Mythos is the first model to solve a private cyber range end-to-end, and completed a corporate network attack simulation estimated to take an expert over 10 hours. Source: Claude Mythos Preview System Card. - Anthropic

Project Glasswing: Anthropic’s Response

Faced with these capabilities, Anthropic chose an unprecedented approach: instead of commercializing Mythos, they created Project Glasswing, a defensive cybersecurity initiative.

The 12 Founding Partners

The consortium brings together companies that are normally competitors:

Cloud: Amazon Web Services, Google, Microsoft
Cybersecurity: CrowdStrike, Palo Alto Networks, Broadcom (Symantec)
Hardware/Infra: NVIDIA, Cisco
Tech: Apple, JPMorganChase
Open source: Linux Foundation

Over 40 organizations in total have access to Mythos Preview for defensive security work.

The Investment

$100 million in usage credits for partners
$2.5 million for Alpha-Omega and OpenSSF (open source security)
$1.5 million for the Apache Software Foundation

The Plan

Multi-month research with restricted access
Public reports within 90 days on discovered vulnerabilities
Development of industry standards for vulnerability disclosure and automated patching
A future Claude Opus model will integrate safeguards (“Cyber Verification Program”) before broader availability

Pricing (When Available)

Post-preview pricing: $25/M input tokens, $125/M output tokens. Expensive, but consistent with the model’s capabilities. Planned availability through the Claude API, Amazon Bedrock, Google Vertex AI, and Microsoft Foundry.

Security Community Reactions

What makes Mythos credible isn’t Anthropic’s benchmarks. It’s the feedback from the field.

Greg Kroah-Hartman, Linux kernel maintainer, observed a clear shift: AI bug reports went from “slop” (useless noise generated by models) to “real reports that are good and real.” He places the shift at “about a month ago” (early March, when Mythos internal testing likely began).

Daniel Stenberg, creator and maintainer of curl, confirms: a “plain security report tsunami” requiring hours of daily review. No more “slop,” but real vulnerabilities.

Thomas Ptacek, a respected figure in information security, published an article titled “Vulnerability Research Is Cooked.” The title sums up the industry’s sentiment.

Simon Willison acknowledges that announcing “our model is too dangerous to release” generates marketing buzz, but believes “their caution is warranted” based on the evidence.

What This Actually Changes

For Developers

If you’re writing code in 2026, a model capable of finding zero-days in Linux and Firefox will find your application’s flaws in seconds. The code quality bar is going up, whether you like it or not.

When Mythos (or its successor) becomes available through the API, automated security auditing will shift from a luxury to a standard.

For Cybersecurity

The vulnerability researcher’s job just fundamentally changed. “Easy” bugs will all be found by AI. Humans will need to focus on complex multi-step exploits, business context, and prioritization.

Defense teams that don’t adopt these tools quickly will be at a structural disadvantage against attackers who will use them (legally or otherwise).

For the AI Industry

Anthropic just set a precedent: an AI company can choose not to release a model deemed too risky, and invest in collective security instead. This is the opposite of the “ship fast, fix later” mentality that dominates the industry.

The question everyone is asking: will OpenAI and Google do the same when GPT-5.4 and Gemini reach this level? Or will they release without restrictions?

”Claude Mythos Is Fake”: Why Some Don’t Believe It (and Why They’re Wrong)

Less than 24 hours after the announcement, Medium articles and Reddit threads claim Mythos is a hoax. The main argument: “no proof, too good to be true.” Let’s look at this closely.

The Skeptics’ Arguments

“No public benchmarks, no paper”
“No credible insiders have confirmed it”
“If it were real, we’d see massive cyberattacks”
“AI progress is incremental, not revolutionary”

Why Every Argument Falls Apart

“No benchmarks”: the system card is public. SWE-bench, USAMO, and CyberGym scores are detailed. The official site anthropic.com/glasswing exists and documents the initiative. TechCrunch, CNN, Fortune, and Axios covered the announcement with verifiable sources.

“No credible insiders”: Nicholas Carlini, an Anthropic researcher, is cited by name. Greg Kroah-Hartman (Linux kernel maintainer) confirms the shift in AI bug report quality. Daniel Stenberg (curl creator) describes a “tsunami of real reports.” These aren’t anonymous Reddit users.

“We’d see cyberattacks”: that’s precisely why Anthropic isn’t releasing the model. The argument proves the opposite of what it claims.

“Progress is incremental”: that’s an opinion, not evidence. The jump from GPT-3.5 to GPT-4 was already considered “impossible” by many. AI history is made of discontinuities.

The Real Problem

Articles crying “fake” haven’t checked a single primary source. Not the Anthropic page. Not the system card. Not the open source maintainers’ statements. It’s engagement content written in ten minutes to ride the topic’s virality.

Healthy skepticism and verification laziness are two different things.

Extract from the Claude Mythos Preview System Card describing concerning behaviors observed: sandbox escape, covering tracks after rule violations — Extract from the official System Card (page 54): Anthropic details 'reckless' behaviors observed during internal testing, including sandbox escape and covering tracks. This 244-page document is public — not exactly the profile of a hoax. - Anthropic

My Analysis

I’ve been following the generative AI ecosystem daily since GPT-3.5 in late 2022. I’ve seen the GPT-4 announcements, the first autonomous agents, the rise of Claude, and now Mythos. This is the first moment where the question is no longer “is AI good enough?” but “is AI too good to be deployed as-is?”

Three observations:

1. Anthropic’s strategy is consistent. From the beginning, Anthropic has positioned itself around AI safety. Refusing to release Mythos isn’t a marketing stunt — it’s a direct application of their mission. The fact that competitors like Microsoft and Google are participating in Glasswing rather than criticizing it confirms this.

2. The business model is taking shape. $100 million in credits “given” to companies that will become dependent on Mythos for their security is a smart commercial investment. When the model becomes available at $25/M input tokens, these partners will already be hooked.

3. The benchmark race is over. 93.9% on SWE-bench, 97.6% on math olympiads: we’re approaching theoretical ceilings. The differentiator is no longer “who has the best score” but “who deploys responsibly.” And on that front, Anthropic just took a considerable lead.

This article will be updated as new information about Project Glasswing and Claude Mythos is published. Last updated: April 8, 2026.