How Attackers Can Exploit AI Vision Models with Subtle Image Manipulations
Critical Vulnerabilities in AI Vision Models
Researchers have found that attackers can manipulate vision-language models (VLMs) with image changes that are imperceptible to humans, allowing the attacks to evade traditional security measures.
- The vulnerability lets attackers embed malicious instructions in images, which the AI model then follows
- Specially crafted visual inputs can bypass traditional security measures
- On their own, small fonts, heavy blurring, and rotation reduce attack success rates, because the model struggles to read the embedded text
- Machine learning optimization can counteract those distortions and restore the image's readability for AI models
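To make the distortions above concrete, here is a minimal sketch of two of them, blurring and rotation, applied to a tiny grayscale image stored as nested lists. The function names, the mean (box) filter, and the fixed 90-degree rotation are illustrative assumptions; the source does not say which blur kernel or rotation angles the researchers used.

```python
def box_blur(image, radius):
    """Blur a grayscale image (rows of ints in 0..255) with a square mean filter.

    Assumption: a simple box filter stands in for the "heavy blurring"
    mentioned in the article; the actual filter is not specified there.
    """
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            total, count = 0, 0
            # Average over the neighbourhood, clamped at the image borders.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            row.append(total // count)
        blurred.append(row)
    return blurred


def rotate_90(image):
    """Rotate a grayscale image 90 degrees clockwise (illustrative angle)."""
    return [list(row) for row in zip(*image[::-1])]


# A single bright pixel loses most of its contrast after blurring,
# which is the kind of readability loss the article describes.
sharp = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
blurred = box_blur(sharp, radius=1)
rotated = rotate_90([[1, 2], [3, 4]])
```

In this toy case the brightest value in the blurred image is far below 255, so fine detail such as small text would be harder to recover without further processing.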
Vulnerability Details
The researchers applied bounded pixel-level perturbations to images that were already failing as attacks due to poor readability or safety refusals by the target model.
- Readability recovery occurs when an image becomes legible to the AI model despite being unclear to humans
- Refusal reduction happens when the AI model chooses to comply with the embedded instruction despite previously refusing to do so
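The idea of a bounded pixel-level perturbation can be sketched as follows. This is only an illustration under stated assumptions: it uses a per-pixel (L-infinity style) limit named `epsilon`, which is a hypothetical parameter, since the source says only that the perturbations were bounded. The optimization that would choose the perturbation is not shown.

```python
def apply_bounded_perturbation(image, delta, epsilon):
    """Add a perturbation to a grayscale image while enforcing a per-pixel bound.

    image:   rows of ints in 0..255
    delta:   proposed per-pixel changes, same shape as image
    epsilon: maximum absolute change allowed per pixel (assumed bound type)
    """
    perturbed = []
    for image_row, delta_row in zip(image, delta):
        row = []
        for pixel, change in zip(image_row, delta_row):
            # Limit the change to the range [-epsilon, +epsilon].
            bounded = max(-epsilon, min(epsilon, change))
            # Keep the result a valid 8-bit pixel value.
            row.append(max(0, min(255, pixel + bounded)))
        perturbed.append(row)
    return perturbed


image = [[10, 250], [128, 0]]
delta = [[50, 20], [-3, -40]]  # hypothetical values, not from the research
result = apply_bounded_perturbation(image, delta, epsilon=8)
```

Because every pixel moves by at most `epsilon`, the perturbed image stays visually close to the original, which is why such changes can go unnoticed by a human reviewer.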
Testing and Results
The researchers tested their findings on several popular VLMs, including GPT-4o and Claude.
- Claude showed the largest overall gain in attack success after optimization on heavily blurred images, jumping from 0% to 28%
- GPT-4o’s safety filter caught most of the newly readable content, limiting the overall attack gains
Related News
The U.S. government has launched a new initiative called “CI Fortify” to prepare critical infrastructure for geopolitical cyber conflict.
- The program aims to help organizations strengthen their cybersecurity posture and prepare for potential attacks
Microsoft has warned of a sophisticated phishing campaign targeting U.S. organizations.
- The campaign uses highly targeted emails that appear to come from trusted sources, making it difficult for recipients to distinguish between legitimate and malicious messages
A critical remote code execution vulnerability in Android was patched.
- The vulnerability allowed attackers to execute arbitrary code on vulnerable devices, potentially leading to unauthorized access to sensitive data
A breach was reported in the source code repository of cybersecurity firm Trellix.
- The breach exposed sensitive information and potentially compromised the security of customers
The vendor of Daemon Tools says a supply chain attack on its software distribution network has been contained.
- The attack highlighted the risk of supply chain vulnerabilities in the technology industry
The latest edition of the daily briefing newsletter features expert insights on the current state of cybersecurity.
- Discussions include the role of agents in securing networks and the importance of visibility in preventing attacks
