Build a RAG Firewall: Reject Poisoned Ingestion and Enforce Tenant Isolation
Hosted · ide
Beta

Build a RAG Firewall: Reject Poisoned Ingestion and Enforce Tenant Isolation

Defend the same multi-tenant RAG assistant the offensive labs attack, in small sequential steps. Stand up the pipeline and trace one benign request, then reproduce two handed-to-you exploits one at a time: a poisoned document that wins retrieval and steers the answer, and a caller-controlled tenant scope that reads another tenant's confidential contract. Watch a naive deny-list get bypassed by a fresh payload, then build the durable control one mechanism per step: a server-side tenant predicate the caller cannot widen, then an ingestion screen that rejects directive-shaped documents before indexing. Verify both exploits are blocked with benign traffic intact, then prove fresh, paraphrased, and renamed variants are all blocked on a real Milvus + NVIDIA embeddings stack.

80 min8 steps4 domainsAdvanced

Hands-on labs require Pro · $29.99/mo · cancel anytime

Map the attack surface
Query
Retriever
LLM
Poisoned doc
retrieved chunk
Answer
0%
Attack-success rate
Attacks blocked · benign answers pass
graded on real output, not the model's talk

What you'll learn

  1. 1
    Stand up DV-RAG and trace one benign request
    You are the defender on DV-RAG-Support, ACME Cloud's multi-tenant
  2. 2
    Reproduce attack A: poisoned ingestion steers the answer
    You have two working exploits in hand. Reproduce them one at a time so you know
  3. 3
    Reproduce attack B: a widened tenant scope reads another tenant
    Now reproduce the second exploit, an isolation failure that does not need any
  4. 4
    Watch a naive deny-list get bypassed by a fresh payload
    After the incident, the team shipped the obvious fix and called it done. This step
  5. 5
    Control 1: a server-side tenant predicate the caller cannot widen
    Time to build the durable control. It has two mechanisms, one per attack surface,
  6. 6
    Control 2: an ingestion screen that rejects poison before indexing
    The tenant predicate from Step 5 is carried forward in dvrag.py. Now build the
  7. 7
    Verify: both exploits blocked, benign traffic intact
    Both mechanisms are now in place: the ingestion screen in firewall.py and the
  8. 8
    Resist bypass: fresh, paraphrased, and renamed attacks all blocked
    A control that only stops the one payload you tested is the deny-list mistake all

Prerequisites

  • Comfortable reading and editing Python
  • Know what a markdown image and an HTTP GET are
  • Familiarity with retrieval poisoning and broken access control helps but is not required

Exam domains covered

Defensive AI SecurityLLM Application SecurityRetrieval-Augmented GenerationMulti-Tenant Isolation

Skills & technologies you'll practice

This advanced-level ai/ml lab gives you real-world reps across:

RAGRAG FirewallIngestion ValidationPrompt Injection DefenseMulti-Tenant IsolationBroken Access ControlOWASP LLM01OWASP LLM02Defensive AI SecurityAI Red Team

What you'll do in this lab

This is a hands-on defensive-security lab on hardening a real Retrieval-Augmented Generation pipeline: a Milvus vector store, NVIDIA nv-embedqa-e5-v5 embeddings, chunking, and tenant-filtered top-k retrieval feeding a live LLM. You are handed two working exploits against DV-RAG-Support and your job is to build a RAG firewall that stops them. First you reproduce the baseline so you understand exactly what fires: a planted document that wins retrieval and steers the answer, and a widened tenant scope that reads another tenant's confidential contract.

Then you watch the obvious fix fail. A deny-list of the strings from the incident report is bypassed by a fresh canary and a hostname sink, the same way a blocklisted tenant value is bypassed by a metadata-filter-injection string. With that lesson in hand you build the durable control: an ingestion firewall that screens documents for the injection pattern (a self-declared answer policy, an imperative aimed at the model, a planted output sink) and rejects them before they reach the index, plus a retrieval predicate that derives the tenant scope server-side from the authenticated session and validates every tenant value against an allow-list. You finish by proving fresh and reworded exploits are both blocked while legitimate same-tenant answers still work.

Frequently asked questions

What will I actually build?

Two enforced controls on a real RAG service. An ingestion firewall (firewall.py) that rejects directive-shaped documents at index build time, so a poisoned document never enters the vector store, and a server-side tenant predicate in retrieve() that derives the scope from the authenticated session and validates tenant values against an allow-list, so a caller cannot widen access to another tenant.

Why is a deny-list not enough?

A deny-list of known-bad strings remembers one example payload while ignoring the injection pattern itself. The lab proves this: a fresh canary and a hostname sink sail past the literal deny-list, and a metadata-filter-injection value slips past a blocklisted tenant name. You replace both with controls that screen for the structure of the attack across payloads.

How does the lab confirm my control holds without breaking the assistant?

The grader plants a fresh poison every run, including a paraphrased variant with a different wording, and replays every cross-tenant scope payload, so you cannot pass by deleting an artifact or by tuning to one phrasing. It also checks that legitimate documents still index and that a benign same-tenant question still returns a real answer, so an over-blocking fix fails too.