Preventing Malicious AI Agents through Secure Memory Management
Agent Memory Guard: Preventing AI Agents from Being Weaponized through Their Own Memory
The growing reliance on artificial intelligence (AI) agents has introduced a new attack surface: many agents retain memory across sessions in the form of conversation history, vector stores, and scratchpads.
This persistence lets an attacker plant malicious text in a memory field, where it can override the agent's instructions, exfiltrate user data, or steer future tool calls.
Because the memory is retained, the effects of such an attack also persist across sessions.
Addressing the Vulnerability
Researchers have developed Agent Memory Guard, an open-source runtime defense layer designed to sit between an AI agent and its memory store.
This layer screens every read and write operation through a pipeline of detectors and a YAML policy.
The primary objective of Agent Memory Guard is to prevent AI agents from being weaponized through their own memory.
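To make the idea of a layer that screens every read and write more concrete, here is a minimal sketch of such a wrapper. It is not the Agent Memory Guard API: the names `GuardedMemory`, `Finding`, `injection_markers`, and the detector name `prompt_injection` are hypothetical, the marker phrases are illustrative, and the policy is a plain dictionary standing in for the YAML file.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical names throughout; this only illustrates the screening idea.


@dataclass
class Finding:
    detector: str  # name of the detector that fired
    detail: str    # what it matched


Detector = Callable[[str, str], Optional[Finding]]

# Higher number means stricter action; the strictest one wins.
SEVERITY = {"allow": 0, "redact": 1, "quarantine": 2, "block": 3}


def injection_markers(key: str, value: str) -> Optional[Finding]:
    """Flag a couple of illustrative prompt-injection phrases."""
    lowered = value.lower()
    for marker in ("ignore previous instructions", "disregard the system prompt"):
        if marker in lowered:
            return Finding("prompt_injection", marker)
    return None


class GuardedMemory:
    """Sits between the agent and a dict-backed store, screening both directions."""

    def __init__(self, detectors: List[Detector], policy: Dict[str, str]):
        self._store: Dict[str, str] = {}
        self._detectors = list(detectors)
        self._policy = dict(policy)
        self.quarantined: Dict[str, str] = {}

    def _decide(self, key: str, value: str) -> str:
        action = "allow"
        for detect in self._detectors:
            finding = detect(key, value)
            if finding is None:
                continue
            # Findings without a policy entry fail closed.
            candidate = self._policy.get(finding.detector, "block")
            if SEVERITY[candidate] > SEVERITY[action]:
                action = candidate
        return action

    def write(self, key: str, value: str) -> str:
        action = self._decide(key, value)
        if action == "allow":
            self._store[key] = value
        elif action == "redact":
            self._store[key] = "[REDACTED]"
        elif action == "quarantine":
            self.quarantined[key] = value
        # "block" drops the write entirely.
        return action

    def read(self, key: str) -> Optional[str]:
        value = self._store.get(key)
        if value is None:
            return None
        # Re-screen on the way out to catch out-of-band changes to the store.
        return value if self._decide(key, value) == "allow" else None
```

Screening reads as well as writes matters in this sketch because a value can enter the store through a path that never passed through `write`.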
Agent Memory Guard Project
The Agent Memory Guard project is the OWASP reference implementation for ASI06, Memory Poisoning, which ranks among the OWASP Top 10 for Agentic Applications.
The guard is built from the following components:
- SHA-256 baselines, which flag out-of-band tampering with immutable keys.
- Built-in detectors that identify prompt injection markers, secret and PII leakage, protected-key modifications, and size anomalies.
- A YAML policy that maps each finding to an action: allow, redact, quarantine, or block.
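The SHA-256 baseline item in the list above can be illustrated with a short sketch. The function names (`snapshot`, `tampered_keys`, `action_for`), the finding name `baseline_mismatch`, and the dictionary-shaped policy are assumptions made for this example; the project's actual YAML schema is not shown in the source.

```python
import hashlib
from typing import Dict, List


def digest(value: str) -> str:
    """SHA-256 hex digest of a memory value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def snapshot(store: Dict[str, str], immutable_keys: List[str]) -> Dict[str, str]:
    """Record a baseline digest for every key that must never change."""
    return {key: digest(store[key]) for key in immutable_keys}


def tampered_keys(store: Dict[str, str], baseline: Dict[str, str]) -> List[str]:
    """Return immutable keys whose current value no longer matches the baseline."""
    changed = []
    for key, expected in baseline.items():
        current = store.get(key)
        # A deleted key counts as tampering too.
        if current is None or digest(current) != expected:
            changed.append(key)
    return changed


# Hypothetical policy: finding name -> one of allow, redact, quarantine, block.
POLICY = {"baseline_mismatch": "block", "pii_leak": "redact"}


def action_for(finding: str, policy: Dict[str, str]) -> str:
    """Look up the action for a finding, failing closed when none is configured."""
    return policy.get(finding, "block")
```

Because the baseline is computed from the stored bytes rather than from the write path, a check like this can notice changes made directly to the backing store, which is what "out-of-band" tampering refers to.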
Results
The benchmark tests the effectiveness of Agent Memory Guard, running 55 test cases through five detectors.
Recall is 92.5%, precision is 100%, and the false positive rate is zero, with a median latency of 59 microseconds.
The results indicate that prompt injection and protected-key tampering both scored 100%, while sensitive data leakage reached 83% and size anomaly reached 80%.
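For readers who want to see how recall, precision, and false positive rate relate to one another, the sketch below computes them from confusion-matrix counts. The source does not publish those counts, so the numbers in the usage note are a hypothetical split of 55 cases chosen only because it reproduces the reported aggregate figures; they should not be read as the benchmark's real breakdown.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute standard detection metrics from confusion-matrix counts."""
    return {
        # Share of malicious cases that were caught.
        "recall": tp / (tp + fn),
        # Share of flagged cases that were actually malicious.
        "precision": tp / (tp + fp),
        # Share of benign cases that were wrongly flagged.
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical counts (not from the source): 40 malicious and 15 benign cases.
example = detection_metrics(tp=37, fp=0, fn=3, tn=15)
```

With these assumed counts, recall is 37/40, precision is 37/37, and the false positive rate is 0/15, which matches the 92.5%, 100%, and zero quoted above.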
On evasion, the current rule-based detectors are intended as a first layer in a defense-in-depth design: teams facing more demanding threat models can add further detection on top of the open-source layer.
Protected-key checks operate on the key path, which makes them difficult for attackers to bypass, whereas sensitive-data matching is more exposed to evasion techniques such as base64 encoding, character splitting, and homoglyphs.
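A small demonstration shows why pattern-based sensitive-data matching is exposed to encoding and splitting. The secret format (`sk-` followed by 16 alphanumeric characters) and all function names are invented for this example and are not the project's rules. The hardened variant is one possible mitigation, not something the source describes, and it does not handle homoglyphs, which would need a confusable-character mapping.

```python
import base64
import binascii
import re
from typing import List

# Hypothetical secret format used only for this demonstration.
SECRET_RE = re.compile(r"sk-[A-Za-z0-9]{16}")
# Runs of base64-alphabet characters long enough to be worth decoding.
B64_TOKEN_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def naive_match(text: str) -> bool:
    """Rule-based check applied to the text exactly as written."""
    return SECRET_RE.search(text) is not None


def candidate_views(text: str) -> List[str]:
    """Produce alternative readings of the text before matching."""
    views = [text]
    # Undo simple character splitting (whitespace and zero-width spaces).
    views.append(re.sub(r"[\s\u200b]+", "", text))
    # Try to decode anything that looks like base64.
    for token in B64_TOKEN_RE.findall(text):
        try:
            decoded = base64.b64decode(token, validate=True)
            views.append(decoded.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue
    return views


def hardened_match(text: str) -> bool:
    """Apply the same rule to every candidate view."""
    return any(naive_match(view) for view in candidate_views(text))
```

Collapsing whitespace can create matches across word boundaries, so a pre-pass like this trades some false positive risk for better recall against simple evasion.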
Future Developments
Planned work includes adaptive evasion testing via AgentThreatBench, which will add an evasion-aware payload set built with knowledge of the published rules.
Version 0.3.0 will add a plugin interface for custom detectors, so that teams can keep their own detection rules out of the open YAML policy, and version 0.4.0 will introduce machine learning-based anomaly detection on semantic features.
GitHub Copilot was used for boilerplate and scaffolding during development, which is considered standard practice; the critical intellectual contributions lie in identifying the attack surface, designing the defense, and validating it against a curated adversarial corpus.
Availability
The Agent Memory Guard project is freely available on GitHub for developers who want to harden the memory layer of their AI-powered applications.