Chinese AI Firms Used 16 Million Claude Queries to Replicate Model Capabilities
Large-Scale Attacks on Anthropic’s AI Model
A recent investigation by Anthropic, a leading AI research organization, has uncovered evidence of large-scale attacks by three Chinese AI firms aimed at extracting capabilities from its large language model, Claude.
Industrial-Scale Campaigns
The companies in question, DeepSeek, Moonshot AI, and MiniMax, allegedly used over 24,000 fraudulent accounts to generate more than 16 million queries, violating Anthropic’s terms of service and regional access restrictions.
The attacks, which Anthropic describes as “industrial-scale campaigns,” employed a technique known as distillation, where a less capable model is trained on the outputs generated by a more advanced AI system.
While distillation is a legitimate method for companies to develop smaller, more affordable versions of their own models, it is considered illicit when competitors use it to acquire another AI company’s capabilities in a fraction of the usual time and at a fraction of the cost.
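Conceptually, the process is simple. The sketch below is a minimal Python illustration, not any lab’s actual pipeline: the `query_teacher` stand-in, the toy prompt list, and the choice of GPT-2 as the student are all assumptions made for the example.

```python
# Minimal sketch of distillation via supervised fine-tuning on teacher
# outputs. query_teacher is a hypothetical stand-in for API calls to a
# stronger "teacher" model; at attack scale it would be millions of calls.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def query_teacher(prompt: str) -> str:
    # Placeholder for a call to the frontier model being distilled.
    return "The derivative of x^2 is 2x."

tokenizer = AutoTokenizer.from_pretrained("gpt2")
student = AutoModelForCausalLM.from_pretrained("gpt2")
student.train()
optimizer = torch.optim.AdamW(student.parameters(), lr=5e-5)

prompts = ["Explain how to differentiate x^2."]  # at scale: millions of prompts
for prompt in prompts:
    answer = query_teacher(prompt)
    # Train the student to reproduce the teacher's answer given the prompt.
    ids = tokenizer(prompt + " " + answer, return_tensors="pt").input_ids
    loss = student(input_ids=ids, labels=ids).loss  # causal LM cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Each teacher response becomes a training target, which is why the campaigns described above required millions of queries rather than a handful.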
National Security Risks
Anthropic warns that models built through illicit distillation lack necessary safeguards, posing significant national security risks.
These unprotected capabilities can be weaponized by foreign AI companies to facilitate malicious activities, including cyber-related operations, disinformation campaigns, and mass surveillance.
Authoritarian governments can also leverage these capabilities to support military, intelligence, and surveillance systems.
Attribution of Attacks
The three distillation attacks were attributed to specific AI labs based on request metadata, IP address correlation, and infrastructure indicators.
- DeepSeek targeted Claude’s reasoning capabilities and rubric-based grading tasks, and sought to generate censorship-safe alternatives to politically sensitive queries.
- Moonshot AI focused on Claude’s agentic reasoning and tool use, coding capabilities, computer-use agent development, and computer vision.
- MiniMax targeted Claude’s agentic coding and tool use capabilities.
Countermeasures
Anthropic notes that the attacks relied on commercial proxy services that resell access to Claude and other frontier AI models at scale.
These services rely on “hydra cluster” architectures: massive networks of fraudulent accounts used to distribute traffic across the targeted providers’ APIs. That access is then used to submit large volumes of carefully crafted prompts designed to elicit specific capabilities from the model, and the resulting outputs are used to train competing models.
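The traffic-distribution pattern itself is not exotic. Below is a purely conceptual sketch; the `account_keys` pool and `send_query` helper are hypothetical stand-ins, not any real service’s interface.

```python
# Conceptual sketch of the "hydra cluster" pattern: traffic is spread
# across a large pool of accounts so that no single account's volume
# stands out. Purely illustrative; not any real service's code.
from itertools import cycle

account_keys = [f"sk-fake-{i:05d}" for i in range(24_000)]  # fraudulent accounts
key_pool = cycle(account_keys)

def send_query(api_key: str, prompt: str) -> str:
    # Placeholder for a request made under one of the rotating accounts.
    return f"[response to {prompt!r} via {api_key}]"

extraction_prompts = [f"prompt {i}" for i in range(10)]  # at scale: millions
responses = [send_query(next(key_pool), p) for p in extraction_prompts]
# Each account carries only a tiny slice of the total traffic, which is
# what makes the aggregate campaign hard to spot account by account.
```

This per-account dilution is precisely what the countermeasures described next are designed to see through.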
To counter the threat, Anthropic has developed several classifiers and behavioral fingerprinting systems that identify API traffic patterns consistent with distillation attacks.
The company has also strengthened verification for educational accounts, security research programs, and startup organizations, and implemented enhanced safeguards to reduce the efficacy of model outputs for illicit distillation.
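As an illustration of what such detection can look like, here is a deliberately simple sketch. The features, thresholds, and the `flag_accounts` helper are assumptions for the example; Anthropic has not published the internals of its classifiers.

```python
# Toy behavioral-fingerprinting check over API traffic: flag accounts
# that combine very high query volume with very low prompt diversity,
# a crude proxy for templated bulk extraction. Illustrative only.
def flag_accounts(traffic: dict[str, list[str]],
                  min_queries: int = 1_000,
                  max_variety: float = 0.1) -> list[str]:
    flagged = []
    for account, prompts in traffic.items():
        # Strip digits so "task 1", "task 2", ... collapse into one
        # template, then measure unique templates per prompt sent.
        templates = {"".join(ch for ch in p if not ch.isdigit()) for p in prompts}
        variety = len(templates) / max(len(prompts), 1)
        if len(prompts) >= min_queries and variety <= max_variety:
            flagged.append(account)
    return flagged

# An account sending thousands of near-identical rubric prompts is flagged.
suspicious = {"acct-1": [f"Grade answer {i} against this rubric." for i in range(5_000)]}
print(flag_accounts(suspicious))  # -> ['acct-1']
```

Production systems would combine many more signals, such as request metadata and IP correlation, but the principle is the same: distillation traffic looks statistically different from organic use.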
Related Incidents
The disclosure comes weeks after Google’s Threat Intelligence Group (GTIG) reported identifying and disrupting distillation and model-extraction attacks aimed at Gemini’s reasoning capabilities, involving more than 100,000 prompts.
Google noted that model extraction and distillation attacks do not typically pose a risk to average users, but rather to model developers and service providers.
