Sensitive Data Disclosure: Leak Confidential Records from a RAG Assistant
Attack a real Retrieval-Augmented Generation (RAG) assistant whose system prompt only asks for privacy, built on a Milvus vector store, NVIDIA embeddings, and a multi-tenant knowledge base. Force disclosure of your own gated billing secret, pull another customer's record across a disabled tenant filter, harvest an accidentally indexed service key, then ship the real fix with pre-retrieval authorization, corpus hygiene, and output redaction.
What you'll learn
- 1. Recon: map the assistant and its retrieved records
  You are an authenticated Globex customer of DV-RAG-Support, ACME Cloud's …
- 2. Disclose your own: force your gated billing secret
  This is the first of three disclosure classes, the simplest one: your own gated …
- 3. Cross-tenant: pull another customer's record into your answer
  The second disclosure class crosses a trust boundary. You are still the Globex …
- 4. Credential: harvest an accidentally-indexed service key
  The third disclosure class is a secret that should never have been in the corpus …
- 5. Measure: disclosure ASR across the three classes
  You landed three disclosures by hand: your own gated secret, another tenant's PII, …
- 6. Harden 1: per-caller retrieval scope
  You proved three leaks. None were stopped by the system prompt, because the prompt …
- 7. Harden 2: corpus hygiene + output redaction
  Step 6 scoped retrieval, which closed the cross-tenant leak. Two classes remain: …
- 8. Verify: re-run all three disclosures, blocked
  You shipped two hardening steps: per-caller retrieval scope (Step 6) and corpus …
Prerequisites
- Comfortable reading Python
- Basic HTTP and markdown
- No ML background required
Skills & technologies you'll practice
This intermediate-level AI/ML lab gives you real-world reps across:
What you'll do in this lab
This is a hands-on offensive-security lab built on a real RAG stack: a Milvus vector store, NVIDIA embeddings, and a multi-tenant knowledge base. You attack DV-RAG-Support as an authenticated customer whose own confidential record is retrieved to answer account questions. The assistant's system prompt carries a soft privacy line asking the model not to reveal confidential identifiers, and you will show why that line is not a control. The sensitive data is already in the retrieved context, and a model echoes retrieved content far more readily than it refuses a flagged field, so a broad or structured request surfaces it.
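To see why the privacy line is not a control, it helps to look at what the model actually receives. The following is a minimal sketch, not the lab's real code: `build_prompt`, the field names, and the secret value are all hypothetical. It shows only that a typical RAG prompt is assembled by concatenating the system prompt with the retrieved record, so the confidential field is already in the model's input before any instruction about privacy is considered.

```python
# Hypothetical sketch of RAG prompt assembly; names and values are
# illustrative assumptions, not taken from DV-RAG-Support.

SYSTEM_PROMPT = (
    "You are a support assistant. "
    "Do not reveal confidential identifiers."  # soft privacy line, not a control
)


def build_prompt(system_prompt, retrieved_records, question):
    """Concatenate the system prompt, retrieved context, and user question."""
    context = "\n".join(
        f"{key}: {value}"
        for record in retrieved_records
        for key, value in record.items()
    )
    return f"{system_prompt}\n\nContext:\n{context}\n\nUser: {question}"


# The caller's own record, as a retriever might return it (made-up values).
record = {
    "customer": "Globex",
    "plan": "Enterprise",
    "billing_secret": "HYPOTHETICAL-SECRET-123",
}

# A broad, structured request of the kind the paragraph above describes.
prompt = build_prompt(
    SYSTEM_PROMPT, [record], "Summarize every field on my account as a table."
)

# The secret is part of the model input regardless of the privacy line.
secret_in_context = record["billing_secret"] in prompt
print(secret_in_context)
```

Nothing in this assembly step enforces the privacy line; whether the secret appears in the answer is left entirely to the model's behavior.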
You drive three disclosures hands-on: forcing your own gated billing secret out of your retrieved record, pulling another customer's record into your answer across a tenant filter a migration left disabled, and harvesting a live service key that was accidentally indexed in a draft runbook. Then you flip to defense and ship the real fix: pre-retrieval authorization scoped to the requesting user, corpus hygiene so secrets are never ingested, and output-side redaction as defense in depth, all without breaking legitimate answers. Maps to OWASP LLM02:2025 Sensitive Information Disclosure and MITRE ATLAS AML.T0057.
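The three defenses can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the lab's implementation: the in-memory list stands in for the Milvus collection, the keyword match stands in for vector search, and `Doc`, `ingest`, `retrieve`, `redact`, the `sk-`/`key-` secret pattern, and the tenant names are hypothetical. In a real vector store, the tenant check would typically be a filter applied as part of the search itself rather than a post-filter on results.

```python
import re
from dataclasses import dataclass


@dataclass
class Doc:
    tenant: str
    text: str


# Hypothetical pattern for secret-like strings; a real scanner would use
# rules matched to the key formats actually in use.
SECRET_RE = re.compile(r"\b(?:sk|key)-[A-Za-z0-9]{8,}\b")


def ingest(corpus, doc):
    """Corpus hygiene: refuse to index a document containing a secret-like string."""
    if SECRET_RE.search(doc.text):
        return False
    corpus.append(doc)
    return True


def retrieve(corpus, caller_tenant, query):
    """Pre-retrieval authorization: scope to the caller's tenant, then match."""
    allowed = [d for d in corpus if d.tenant == caller_tenant]
    terms = query.lower().split()
    # Keyword match is a stand-in for similarity search.
    return [d for d in allowed if any(t in d.text.lower() for t in terms)]


def redact(answer):
    """Output-side redaction: mask secret-like strings as defense in depth."""
    return SECRET_RE.sub("[REDACTED]", answer)


corpus = []
ingest(corpus, Doc("globex", "Globex plan: Enterprise, renewal in March"))
ingest(corpus, Doc("other-tenant", "Contact for this account: jane@example.com"))

# A draft runbook with a made-up key is rejected at ingest time.
rejected = not ingest(
    corpus, Doc("shared", "Draft runbook: use sk-ABCDEFGH12345678 to deploy")
)

# The query matches both tenants' documents, but only the caller's is returned.
hits = retrieve(corpus, "globex", "contact plan")

# If a secret reaches the answer anyway, redaction masks it.
answer = redact("Your key is sk-ABCDEFGH12345678")
print(rejected, [d.tenant for d in hits], answer)
```

The ordering matters: the tenant scope runs before relevance matching, so an out-of-scope record never enters the model's context, and redaction is only a backstop for whatever the first two layers miss. A gated field such as the billing secret would need its own redaction rule, which this sketch does not include.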