Hands-on labs for LLMs, RAG & agents. Real GPUs.
Fine-tune LLMs with LoRA, ship RAG pipelines on NVIDIA NIM, build agentic systems, and profile CUDA kernels — on real GPU sandboxes and hosted environments. Auto-graded against live output. No simulators.
Start here
Build a ReAct Agent with NVIDIA NIM
Build a complete reasoning + acting agent from scratch using LangChain, LangGraph, and NeMo Agent Toolkit — the three pillars of the NCP-AAI exam.
Model Context Protocol (MCP): Build a Tool Server
Build a Model Context Protocol server that exposes your company's tools and data — then connect a LangChain agent to it. Learn how MCP decouples tools from agents, when to use MCP vs Anthropic Skills vs native @tool, and why MCP is the emerging standard for AI tool interop.
Build a RAG Pipeline with NVIDIA NIM
Build a complete Retrieval-Augmented Generation pipeline — from document chunking to vector search to an agent that answers questions from your knowledge base.
Agentic AI
Agent-to-Agent (A2A) Communication
Build two independent agents that talk to each other via the A2A protocol — each owned by a different team, running in its own process, discovered through a standardized AgentCard. Learn how A2A differs from multi-agent orchestration and when each architecture fits.
Agent Memory & Persistence
Build a sales intelligence assistant that remembers — short-term conversation state with LangGraph checkpointer, long-term facts in Milvus, and reflection loops that auto-extract knowledge. Learn the memory architecture every production agent needs.
Agent Patterns: ReAct vs Tool Calling vs Plan-and-Execute
Build the same SaaS customer support agent three different ways — ReAct, direct tool calling, and plan-and-execute — then compare them on speed, reasoning quality, and reliability to learn when to use each pattern in production.
Model Context Protocol (MCP): Build a Tool Server
Build a Model Context Protocol server that exposes your company's tools and data — then connect a LangChain agent to it. Learn how MCP decouples tools from agents, when to use MCP vs Anthropic Skills vs native @tool, and why MCP is the emerging standard for AI tool interop.
Multi-Agent Orchestration with LangGraph
Build a supervisor agent that routes queries to specialist agents — a core architecture pattern tested on the NCP-AAI exam.
Build a RAG Pipeline with NVIDIA NIM
Build a complete Retrieval-Augmented Generation pipeline — from document chunking to vector search to an agent that answers questions from your knowledge base.
Build a ReAct Agent with NVIDIA NIM
Build a complete reasoning + acting agent from scratch using LangChain, LangGraph, and NeMo Agent Toolkit — the three pillars of the NCP-AAI exam.
Safety & Guardrails for AI Agents
Build a guarded IT support agent that blocks jailbreaks, refuses off-topic questions, and safely handles IT queries — using keyword checks, LLM-based validation, and NeMo Guardrails.
Evaluate an Agent with LLM-as-Judge
Build an eval harness that scores agent responses automatically — correctness via a reference-based judge, plus an accuracy metric and A/B comparison. Same pattern used by NeMo Evaluator for production agent evaluation.
Model Routing & Cost Cascade with NIM
Save 60–80% on inference by cascading queries through cheap → mid → expensive NIM models. Measure real costs via NIM's usage.cost field and compare against an always-large baseline.
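The cascade idea can be sketched in a few lines: try the cheap model first and escalate only when a confidence check fails. A minimal sketch — model names, per-request prices, and the hedging-phrase heuristic are illustrative placeholders, not NIM's actual catalog or API:

```python
# Cost cascade sketch: cheap -> mid -> expensive, escalating on low confidence.
# Tier names and prices are made up for illustration.
TIERS = [
    ("small-8b",   0.0002),   # hypothetical $ per request
    ("mid-70b",    0.0020),
    ("large-405b", 0.0120),
]

def confident(answer: str) -> bool:
    """Toy heuristic: escalate on empty output or hedging phrases."""
    hedges = ("i'm not sure", "i don't know", "cannot determine")
    return bool(answer) and not any(h in answer.lower() for h in hedges)

def cascade(query, call_model):
    """call_model(model_name, query) -> answer string. Returns (answer, model, cost)."""
    spent = 0.0
    for name, price in TIERS:
        spent += price
        answer = call_model(name, query)
        if confident(answer):
            return answer, name, spent
    return answer, name, spent  # last tier's answer, even if still unsure

# Fake backend: only the large model "knows" the hard answer.
def fake_call(model, query):
    return "42" if model == "large-405b" or "easy" in query else "I'm not sure"

print(cascade("easy question", fake_call))   # answered by small-8b
print(cascade("hard question", fake_call))   # escalates to large-405b
```

The real lab replaces `fake_call` with NIM requests and replaces the toy heuristic with measured cost and quality signals.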
Structured Output & Function Calling with NIM
Get reliable machine-parseable data out of an LLM. Compare prompt-only JSON extraction against the function-calling API, chain two tools, and measure the reliability gap on a real extraction task.
Visual Q&A with NVIDIA VLMs
Send images to a Vision-Language Model via NIM, answer questions about them, extract structured fields from a receipt-style image, and compare two VLMs on the same task — all through the OpenAI-compatible chat endpoint.
Multimodal RAG with NeMo Retriever
Build an image-query RAG system: embed a catalog with NeMo Retriever, translate an uploaded image into a retrieval query via a VLM, and ground the VLM's final answer in the retrieved passages.
LLM serving & inference
Deploy & Serve LLMs in Production
Go from slow single-request inference to production-ready LLM serving with vLLM. Benchmark throughput, tune settings, and learn when to use vLLM vs Triton vs TGI.
Deploy & Serve LLMs in Production (Jupyter)
Go from slow single-request inference to production-ready LLM serving with vLLM. Benchmark throughput, tune settings, and learn when to use vLLM vs Triton vs TGI.
Inference Serving Patterns: Dynamic Batching, Throughput, and the Triton Mental Model
Build a mini-Triton inference server in ~30 lines of Python: a dynamic batcher with max_batch_size and max_queue_delay knobs, load-tested against a naive baseline, swept for the throughput-latency tradeoff, and bridged to a real Triton config.pbtxt.
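The core mechanism is small enough to show: a batch launches when either `max_batch_size` is reached or `max_queue_delay` expires. A toy sketch in that spirit — the knob names mirror Triton's `config.pbtxt` fields, but everything else is a simplified stand-in:

```python
# Toy dynamic batcher: requests queue up, and a batch fires when either
# max_batch_size is hit or max_queue_delay expires (Triton's two core knobs).
import queue
import threading
import time

class DynamicBatcher:
    def __init__(self, model_fn, max_batch_size=8, max_queue_delay=0.005):
        self.model_fn = model_fn            # batched model: list[in] -> list[out]
        self.max_batch_size = max_batch_size
        self.max_queue_delay = max_queue_delay
        self.q = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def infer(self, x):
        done = threading.Event()
        slot = {"done": done}
        self.q.put((x, slot))
        done.wait()
        return slot["result"]

    def _loop(self):
        while True:
            x, slot = self.q.get()          # block until the first request arrives
            batch, slots = [x], [slot]
            deadline = time.monotonic() + self.max_queue_delay
            while len(batch) < self.max_batch_size:
                timeout = deadline - time.monotonic()
                if timeout <= 0:
                    break                   # delay expired: launch a partial batch
                try:
                    x, slot = self.q.get(timeout=timeout)
                except queue.Empty:
                    break
                batch.append(x)
                slots.append(slot)
            for out, s in zip(self.model_fn(batch), slots):
                s["result"] = out
                s["done"].set()

batcher = DynamicBatcher(lambda xs: [v * 2 for v in xs], max_batch_size=4)
print(batcher.infer(21))   # -> 42
```

Raising `max_queue_delay` trades latency for larger (more GPU-efficient) batches, which is exactly the throughput-latency sweep the lab runs.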
Batch Size & Precision Sweep: Finding Your Sweet Spot
Sweep batch sizes and numerical precisions (fp32, fp16, bf16) on a real model to find the throughput/VRAM knee, then ship a production recommendation with SKU-aware precision picks and an accuracy gate.
vLLM Production Serving: PagedAttention, Continuous Batching, Prefix Caching
Stand up vLLM and measure the three features that make it the de facto inference server: PagedAttention's KV-cache capacity, continuous batching throughput, and prefix caching speedups. Then write the production spec — server args, Kubernetes deployment, monitoring, autoscaling.
Fine-tuning & alignment
Fine-Tune an LLM with LoRA and QLoRA
Fine-tune Meta Llama 3 8B on a custom instruction dataset using LoRA and QLoRA. Learn parameter-efficient fine-tuning from data preparation through evaluation — one of the most in-demand AI skills.
Fine-Tune an LLM with LoRA and QLoRA (Jupyter)
Fine-tune Meta Llama 3 8B on a custom instruction dataset using LoRA and QLoRA. Learn parameter-efficient fine-tuning from data preparation through evaluation — one of the most in-demand AI skills.
Quantize & Optimize LLMs with bitsandbytes
Load a model in fp16, INT8, and NF4, then benchmark the three precisions on VRAM, latency, and output quality. See where quantization wins and where it costs you.
RLHF & DPO Alignment
Run real Direct Preference Optimization on a small language model with TRL's DPOTrainer. Capture a baseline, build a preference dataset, train, and measurably shift the model's behavior in four steps.
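The objective DPOTrainer optimizes is compact enough to compute by hand: the loss rewards the policy for preferring the chosen completion more than the reference model does. A sketch with made-up log-probabilities — in TRL these come from the policy and reference models:

```python
# DPO loss from per-sequence log-probs:
#   L = -log sigmoid(beta * ((log pi(y_w) - log pi_ref(y_w))
#                          - (log pi(y_l) - log pi_ref(y_l))))
# where y_w / y_l are the chosen / rejected completions.
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    chosen_margin = policy_chosen - ref_chosen        # policy's preference shift
    rejected_margin = policy_rejected - ref_rejected  # on each completion vs ref
    logits = beta * (chosen_margin - rejected_margin)
    return -math.log(1 / (1 + math.exp(-logits)))     # -log sigmoid(logits)

# Policy shifted toward the chosen answer -> loss below log(2).
print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
# Policy drifted toward the rejected answer -> loss above log(2).
print(dpo_loss(-14.0, -10.0, -12.0, -12.0))
```

At initialization the policy equals the reference, both margins are zero, and the loss sits at exactly log(2) — the baseline the lab's training run pushes down from.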
Stable Diffusion + LoRA
Load Stable Diffusion, attach LoRA adapters to the U-Net's attention layers, run a tiny overfit training loop, and generate with the adapted weights to prove that a few million trainable parameters actually move pixels.
RAG & retrieval
Advanced RAG: Hybrid Search + Cross-Encoder Reranking
Build a production-shape retrieval stack — dense bi-encoder plus from-scratch BM25, fused with Reciprocal Rank Fusion, then re-ordered by a BAAI cross-encoder. The exact architecture behind modern enterprise RAG.
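Reciprocal Rank Fusion itself is a one-liner per document: each doc scores the sum over result lists of 1 / (k + rank), with k = 60 as the conventional constant from the original RRF paper. A sketch with illustrative doc IDs:

```python
# Reciprocal Rank Fusion: merge ranked lists from dense and BM25 retrieval.
def rrf(rankings, k=60):
    """rankings: list of ranked doc-id lists (best first) -> fused ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d7"]   # bi-encoder order
bm25  = ["d1", "d9", "d3"]   # lexical order
print(rrf([dense, bm25]))    # d1 and d3 rise: both retrievers agree on them
```

Because RRF uses only ranks, not raw scores, it needs no score normalization between the dense and lexical retrievers — the reason it's the default fusion step in hybrid search.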
Retrieval-Augmented Generation (RAG) Pipeline with Local Models
Build an end-to-end RAG pipeline on a single GPU: BGE embeddings, L2-normalized vector retrieval by dot product, and a local generator that answers with and without retrieved context so you can see exactly what retrieval changes.
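The retrieval step rests on one identity: for L2-normalized vectors, dot product equals cosine similarity, so normalizing once at index time makes scoring a plain dot product. A sketch with toy vectors standing in for real BGE embeddings:

```python
# L2-normalized retrieval: normalize at index time, score by dot product.
import math

def l2_normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k(query, index, k=2):
    q = l2_normalize(query)
    scored = [(dot(q, doc_vec), doc_id) for doc_id, doc_vec in index]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]

# Toy 3-d "embeddings"; doc IDs are illustrative.
index = [(doc_id, l2_normalize(vec)) for doc_id, vec in [
    ("gpu-faq",   [0.9, 0.1, 0.0]),
    ("hr-policy", [0.0, 0.2, 0.9]),
    ("cuda-tips", [0.8, 0.3, 0.1]),
]]
print(top_k([1.0, 0.2, 0.0], index))   # the two GPU-related docs rank first
```

The lab swaps the toy vectors for 1024-d BGE embeddings and the list scan for a batched matrix multiply, but the scoring math is identical.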
Training & pretraining
Continued Pre-Training: Adapt a Pretrained LM to a New Domain
Take GPT-2 and domain-adapt it to Python code in 150 steps, measuring both the gain on code and the cost in catastrophic forgetting on English. The exact recipe behind Code Llama, BloombergGPT, and every domain-specialized LLM of the last three years.
Train a Small Language Model from Scratch
Train a real GPT-style language model from zero on TinyStories: tokenize, wire up the optimizer and LR schedule, run the training loop with validation perplexity, and generate coherent text from your own weights. End-to-end pretraining in minutes on one GPU.
Transformer Architecture Deep Dive
Build every piece of a decoder-only transformer by hand — scaled dot-product attention, multi-head attention, the full block with residuals and LayerNorm, then assemble a tiny GPT and train it. No shortcuts, no pre-built attention modules.
CUDA & kernel optimization
CUDA Programming Fundamentals
Write four real CUDA C++ kernels and run them from PyTorch: vector add, 2D matrix add, tiled matmul with shared memory, and a custom autograd op.
GPU Sharing: Streams, MPS, MIG, and the Real Cost of Contention
Measure four ways to share a single GPU — CUDA streams, multi-process time-slicing, MPS, and MIG — and write the production artifacts (start scripts, k8s device-plugin ConfigMaps, MIG geometries) that turn 15%-utilized fleets into 80%-utilized ones.
Nsight Systems Profiling: Finding the Bottleneck That Costs You 40% of Your GPU
Run the full profile-then-fix loop with NVIDIA Nsight Systems — instrument a training loop with NVTX ranges, capture a .nsys-rep, parse the NVTX summary to pinpoint the bottleneck, then apply a targeted fix and measure the speedup.
Profiling & performance
Profile PyTorch Training with the Built-in Profiler
Instrument a training loop with torch.profiler, read the op-level table, inspect the Chrome/Perfetto timeline, and decide when to reach for Nsight Systems instead.
GPU Cost & Efficiency Audit
Build a four-stage cost-audit pipeline — measure, classify, price, recommend — that turns raw NVML samples into dollar-denominated waste and specific remediation actions. The skeleton behind every enterprise GPU cost product.
Multimodal
Vision-Language Models: Captioning and Visual QA
Load Qwen2-VL, caption a real image, run a battery of visual question-answering prompts, and dissect the architecture — vision encoder, projector, language model — to see exactly how pixels become tokens the LLM can reason over.
Data & pipelines
NVIDIA DALI: GPU-Accelerated Data Pipelines
Move image decoding, resizing, and augmentation from CPU to GPU with NVIDIA DALI, and benchmark it against a standard PyTorch DataLoader. The input-pipeline fix that unlocks real multi-GPU throughput.
Data Preparation for LLM Training
Build a real pretraining/instruction data pipeline: load a raw corpus, apply quality filters, deduplicate, train a BPE tokenizer, and batch-validate on GPU. This is the unglamorous work that actually decides how good your model will be.
Synthetic Data Generation for Model Training
Build a Self-Instruct style synthetic dataset end-to-end: seed instructions, LLM-driven generation, robust parsing, quality filtering, and dedup + diversity scoring. The same pipeline that produced Alpaca, WizardLM, and most modern instruction-tuning corpora.
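The dedup stage can be sketched with token-set Jaccard similarity — the Self-Instruct paper filters on ROUGE-L, and Jaccard is a simpler stand-in for the same idea; the threshold and example instructions here are illustrative:

```python
# Near-duplicate filtering of generated instructions by token-set overlap.
def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def dedup(instructions, threshold=0.7):
    kept = []
    for cand in instructions:
        # Keep a candidate only if it isn't too similar to anything kept so far.
        if all(jaccard(cand, k) < threshold for k in kept):
            kept.append(cand)
    return kept

generated = [
    "Write a poem about the ocean",
    "Write a poem about the ocean waves",   # near-duplicate, dropped
    "Summarize this meeting transcript",
]
print(dedup(generated))
```

Greedy filtering against the kept set is the same shape Self-Instruct uses; the lab layers quality filters and diversity scoring on top.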
GPU infrastructure
GPU Container Lifecycle: Build, Test, Ship, Rollback
Walk through the full lifecycle of a production GPU container — multi-stage Dockerfile, self-hosted GPU CI, a fail-fast smoke test, and a Kubernetes Deployment with readiness probes gated on real GPU compute. The pipeline that stops bad images before users see a 500.
GPU Health Checks + Auto-Remediation
Build a production-grade GPU watchdog: multi-dimensional NVML health probe, rogue-process detection, auto-remediation that kills the offender and verifies recovery, then wire it up with Prometheus alerts and Kubernetes liveness probes.
MLflow Experiment Tracking: From Single Run to Team Workflow
Wire the four load-bearing pieces of MLflow into a real training loop — tracked runs with params and metrics, a registered model with stage transitions, a multi-run sweep + search, and a production spec (server, k8s Job, tags, autolog).
NVIDIA GPU Operator on k3s: Single-Node Kubernetes for GPU Workloads
Bring up a lightweight single-node Kubernetes cluster with the NVIDIA GPU Operator — k3s install, containerd wiring, Helm values, workload manifests with RBAC and ResourceQuota, plus a full runbook (validation plan, troubleshooting matrix, day-2 ops).
Monitoring & ops
GPU Observability: From nvidia-smi to a Production Monitoring Stack
Go from a raw NVML snapshot to a real monitoring pipeline: capture live GPU telemetry during a workload, diagnose a dataloader bottleneck from the utilization trace, and expose everything as a Prometheus /metrics endpoint.
Accelerated data science
GPU-Accelerated Data Science with RAPIDS
Rewrite a pandas + sklearn data-science pipeline on GPU using cuDF and cuML, benchmark each stage against the CPU baseline, and run an end-to-end filter → feature-engineer → predict pipeline that never leaves the GPU.
More labs
Evaluation & Benchmarking LLMs
Four evaluation lenses in one lab: compute real perplexity, expose BLEU's blindness to paraphrase, run side-by-side model comparisons, and build an LLM-as-judge harness with position-bias detection.
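The perplexity lens reduces to one formula: PPL = exp(−mean log p(token)). A sketch with made-up per-token log-probabilities — in the lab these come from a real model's output distribution over the evaluation text:

```python
# Perplexity from per-token log-probabilities: exp of the mean negative log-prob.
import math

def perplexity(token_logprobs):
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

confident_model = [-0.1, -0.2, -0.1, -0.3]   # high probability on each token
uncertain_model = [-2.0, -3.0, -2.5, -2.2]
print(perplexity(confident_model))   # close to 1: model barely surprised
print(perplexity(uncertain_model))   # much higher: text looks unlikely to it
```

A model that assigns probability 1 to every token scores exactly 1; higher perplexity means the evaluation text looked less likely to the model.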
Reproducible Training: The Flags, The Cost, The Artifacts
Measure the non-determinism noise floor in default PyTorch, flip every determinism flag until same-seed runs match bit-for-bit, quantify the perf cost, and capture a content-addressable training config that makes a run reproducible forever.
GPU Environment Smoke Test
Validate the GPU lab environment: terminal, file operations, PyTorch, CUDA, and model loading.
Every lab runs on real AI infrastructure.
No video simulators, no canned outputs. Spin up a real GPU, or hook into our hosted stack — either way, you're graded on the metrics you actually produce.