Advertisement
Chinese hackers used Anthropic’s Claude to run a full-scale cyberattack after jailbreaking the AI model

Chinese hackers used Anthropic’s Claude to run a full-scale cyberattack after jailbreaking the AI model

A Chinese threat group has pushed the boundaries of cybercrime by turning an AI model into a fully autonomous hacking engine.

Business Today Desk
Business Today Desk
  • Updated Nov 14, 2025 1:05 PM IST
Chinese hackers used Anthropic’s Claude to run a full-scale cyberattack after jailbreaking the AI modelAnthropic

Anthropic has disclosed a striking case of AI misuse, revealing that a Chinese hacking group successfully jailbroke its Claude model and used it to execute a large, coordinated cyber operation with minimal human involvement. The company detailed the incident in a blog post published on Thursday, calling it the first known instance of an AI system driving a sophisticated cyberattack from reconnaissance to exploitation.

Advertisement

According to Anthropic, the attackers leveraged “agentic AI” behaviour, enabling Claude to perform actions typically handled by an expert cybersecurity team. This ranged from scanning systems and identifying vulnerabilities to writing exploit code and preparing detailed reports.

The hackers began by selecting 30 high-value targets, including financial organisations, technology firms, chemical manufacturers and government agencies. Anthropic did not name any victims.

The group then constructed an automated workflow that positioned Claude as the core engine powering the operation. To bypass safeguards, they split malicious tasks into small, unremarkable requests and convinced the model it was conducting defensive security assessments. This approach allowed the jailbreak to succeed without triggering the model’s usual protective systems.

Once activated, Claude mapped network structures, scanned infrastructure at high speed and summarised its findings. “According to the Anthropic blog, the AI researched vulnerabilities, wrote its own exploit code and notably attempted to gain access to high-value accounts.” In certain cases, it harvested credentials and sorted extracted data by importance before delivering structured intrusion reports to the attackers.

Advertisement

Anthropic warns that the barrier to executing advanced cyberattacks has fallen dramatically. Autonomous models capable of linking complex chains of actions could enable small, less resourced groups to carry out operations that were formerly the domain of elite hacking teams.

The company noted that Claude did produce occasional errors, such as inventing data or misclassifying information. Despite this, the overall sophistication of the attack highlights how rapidly AI-driven threats are emerging.

For Unparalleled coverage of India's Businesses and Economy – Subscribe to Business Today Magazine

Published on: Nov 14, 2025 1:05 PM IST
    Post a comment0