A Discord group has had access to the Mythos model for two weeks.
A small group of unauthorized users has accessed Anthropic’s Mythos AI model, a powerful cybersecurity tool the company considers potentially dangerous. According to Bloomberg, the group gained entry through a third-party contractor for Anthropic, combining the contractor’s access with common internet sleuthing tools.
The Claude Mythos Preview is a new general-purpose model capable of identifying and exploiting vulnerabilities in major operating systems and web browsers. Official access is restricted to a handful of companies under the Project Glasswing initiative, including Nvidia, Google, AWS, Apple, and Microsoft; governments have also expressed interest in the technology. Anthropic has no plans to release the model publicly, citing concerns that it could be weaponized.
Anthropic is investigating the claims of unauthorized access via a third-party vendor environment and says there is currently no evidence of impact beyond that environment. The illicit access reportedly began on April 7th, the same day Mythos was released to select companies for testing. The group remains unidentified but is part of a Discord channel focused on unreleased AI models.
The group used knowledge from a recent Mercor data breach to guess the model’s online location. It has reportedly used Mythos regularly since gaining access, as shown through screenshots and a live demonstration, though it has avoided cybersecurity tasks to evade detection by Anthropic. Other unreleased Anthropic AI models were also reportedly accessed.
