Memory Poisoning: Plant a Note That Re-Fires in a Fresh Session (Persistence)
Attack the long-term memory of a real tool-using ReAct agent. As the low-privilege ticket-ingest account, plant a single benign-looking routing note into DV-ToolAgent's shared, un-namespaced memory store through an ingested ticket. In a brand-new session for a different legitimate user, the agent recalls your note and redirects a GLOBEX invoice to your payee, persistence across the session boundary that single-turn filters never see. Use MINJA-style progressive shortening so the stored record reads as a mundane preference, measure attack-success-rate against benign controls, then harden memory with per-user namespacing and a data-only quarantine and prove the poison dead while legitimate recall still works.
Hands-on labs require Pro · $29.99/mo · cancel anytime
What you'll learn
- 1Recon: read the memory pathDV-ToolAgent has a long-term memory. It stores durable notes (routing rules,
- 2Plant: persist a poisoned routing note (session A)In session A you act as ticket-bot, ingesting an inbound ticket. Your goal is to make
- 3Persist: fire the planted note in a fresh sessionThis is the headline. Your poisoned note now lives in long-term memory. In a
- 4Stealth: make the planted note look benign (MINJA)A defender skimming the memory bank should not spot an obvious imperative. This is the
- 5Measure: persistence rate and benign cleanlinessMemory persistence is largely structural: the recall plus concatenation does most of the
- 6Harden 1: quarantine recalled memory as dataYou proved a planted note re-fires for a different user in a fresh session because recalled
- 7Harden 2: per-user namespacing and provenanceThe quarantine stops a recalled note from being executed, but it does not stop a
- 8Verify and resist: prove the poison dead, real memory intactOne non-fire is not proof, and a fix that kills useful memory is not a fix. With both
Prerequisites
- Comfortable reading Python
- Know what a SQL UPDATE is
- No ML background required
Exam domains covered
Skills & technologies you'll practice
This advanced-level ai/ml lab gives you real-world reps across:
What you'll do in this lab
This is a hands-on offensive-security lab on agent memory poisoning. You attack DV-ToolAgent, a real ReAct tool-using assistant whose long-term memory recalls durable notes into every new session. Acting as ticket-bot, the low-privilege ticket-ingest account, you plant a single routing note into the shared, un-namespaced memory store through an ingested support ticket. The note ties GLOBEX invoices to your attacker payee. Then, in a brand-new session with cleared context and a different legitimate user, a normal GLOBEX invoice question recalls your note and the agent redirects payment to your payee. Persistence across the session boundary is the win, and it is the property single-turn prompt filters cannot catch (OWASP LLM06 Excessive Agency, Agentic ASI06 Memory and Context Poisoning).
You make the attack realistic with MINJA-style progressive shortening: once the agent treats a stored GLOBEX-payee note as a routing rule to apply, you drop the overt instruction language so the residual record reads as a mundane billing preference while it still fires. You measure attack-success-rate over a paced battery against benign controls (a non-GLOBEX invoice), demonstrating AgentPoison's point that a single planted entry yields high targeted ASR with clean benign behavior. Finally you harden the memory layer: per-user namespacing and provenance so a low-privilege caller's note never reaches another user, plus a data-only quarantine so recalled notes are never executed as instructions. You re-run the plant to prove it is dead while legitimate recall still works.