FORTUNE.COM
Claude Mythos

Leaked documents from Anthropic show new model 'poses unprecedented cybersecurity risks'

SUMMARY

Leaked internal Anthropic documents confirm that a new generation of highly capable AI models, codenamed Claude Mythos, is already in active testing with early-access customers.

The company’s own draft blog post describes the model as a step change in performance and by far the most powerful AI it has ever developed. The model is also referred to internally as Capybara, a new tier that is larger and more intelligent than the previous Opus models.

Anthropic privately assesses that Claude Mythos poses unprecedented cybersecurity risks. The document states the model is currently far ahead of any other AI in cyber capabilities and presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.

Because of these risks, Anthropic is adopting extra caution before a wider release. The plan focuses on sharing the system first with select organizations so that cyber defenders can strengthen their codebases against the coming wave of AI-driven attacks.

The sensitive materials were left in an unsecured, publicly accessible data cache containing nearly 3,000 unpublished assets. The exposure resulted from a configuration error in the company’s content management system.

After being notified, Anthropic immediately secured the data store.

This leak reveals the company’s internal acknowledgment that its latest frontier model could enable large-scale cyberattacks far beyond previous systems.

