Anthropic Says Claude Outperforms Human Teams in Some Cybersecurity Tasks

In one example, the model cracked a malware task in 38 minutes, while human experts might take an hour.

Anthropic has published new research highlighting how its latest model, Claude Sonnet 4.5, is being optimised for cybersecurity tasks—moving AI from lab novelty to real-world cyber defense.

The company says Sonnet 4.5 already matches or surpasses its predecessor, Opus 4.1, in finding vulnerabilities, patching code, and analysing system security.

"Claude now outperforms human teams in some cybersecurity competitions, and helps teams discover and fix code vulnerabilities," the startup said.

Benchmark evaluations show Claude Sonnet 4.5 delivering strong results on challenges such as Cybench and CyberGym, uncovering both known and novel vulnerabilities. In one example, the model cracked a malware task in 38 minutes, while human experts might take an hour.

In recent years, Anthropic researchers have tracked adversaries' growing use of frontier AI models. In response, the company invested in redirecting those capabilities toward defense: Claude has competed in cybersecurity competitions, outperformed human teams, and helped uncover flaws in Anthropic's own systems before deployment.

Anthropic also cites its Safeguards team’s discovery of misuse cases, including “vibe hacking”—where attackers used Claude to scale a data extortion scheme without needing a large team—and espionage-style exploitation attempts targeting critical infrastructure.

Anthropic emphasises that while progress is promising, Sonnet 4.5’s defense capabilities are still emerging. Future work includes enhancing patch generation, resilience, and integrations with Security Operations infrastructure.