Welcome to NCA-AIIO Labs — Schedule Your First GPU Pod
Hosted
Beta

Smoke-test your NCA-AIIO lab environment: inspect your isolated Kubernetes cluster, schedule a Pod that requests an NVIDIA GPU, and observe realistic nvidia-smi output. The first 5-minute lab to verify everything works end-to-end.

5 min · 1 step · 2 domains · Beginner · nca-aiio

What you'll learn

  1. Inspect your cluster, schedule a GPU pod
    When you started this lab, the platform provisioned a fresh virtual Kubernetes cluster (vCluster) just for you. You have full cluster-admin inside it — you can create namespaces, ClusterRoles, ResourceQuotas, the works. The cluster sees one Node (the host machine) advertising fake NVIDIA GPUs.

Prerequisites

  • Basic kubectl familiarity (kubectl get, kubectl apply)
  • Ability to read a Pod YAML manifest

Exam domains covered

AI Infrastructure & Operations · GPU Acceleration & Distributed Training

Skills & technologies you'll practice

This beginner-level AI/ML lab gives you real-world reps across:

Kubernetes · kubectl · GPU Scheduling · RuntimeClass · NCA-AIIO · Smoke Test

What you'll do in this NCA-AIIO welcome lab

Every NCA-AIIO scenario starts the same way: run kubectl get nodes, confirm GPUs are advertised, schedule a workload with an nvidia.com/gpu: 1 resource request, and watch the scheduler bind it. This 5-minute smoke test builds that muscle memory and proves your isolated lab environment is functioning. You'll write a minimal Pod manifest, apply it via kubectl, wait for it to run, and read the nvidia-smi output the GPU runtime emits. The cluster is real (your own per-session vCluster), and the GPU resource model is the same Kubernetes API contract you'll see in production NVIDIA infrastructure work.

The substance is small but foundational: runtimeClassName: nvidia to opt into the GPU runtime, resources.limits.nvidia.com/gpu: 1 to request one device, and a container image that has nvidia-smi available. You'll learn to read the nvidia-smi output for the fields an NCA-AIIO question expects you to extract (driver version, CUDA version, Memory-Usage, GPU-Util, the Processes table), and you'll see over-requests fail: the API server rejects a Pod that would push your namespace past its ResourceQuota, and the scheduler leaves a Pod Pending when it asks for more GPUs than the Node has free. Future labs build directly on this: ResourceQuota authoring, RBAC scoping, GPU Operator chain debugging, MIG profile selection.
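A minimal sketch of the Pod described above, for orientation only. The Pod name, file name, and container image are assumptions rather than values taken from the lab; the lab may prescribe a different image that ships nvidia-smi.

```yaml
# gpu-smoke-test.yaml (hypothetical file name)
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test          # hypothetical Pod name
spec:
  runtimeClassName: nvidia      # opt into the GPU runtime
  restartPolicy: Never          # run nvidia-smi once, then stop
  containers:
    - name: smi
      # Assumption: any image with nvidia-smi available works here.
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1     # request exactly one GPU device
# Apply the manifest and read the output:
#   kubectl apply -f gpu-smoke-test.yaml
#   kubectl logs gpu-smoke-test
```

Because nvidia.com/gpu is an extended resource, it is set under limits; Kubernetes treats the request as equal to the limit when only the limit is given.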

Frequently asked questions

Is the GPU output real or simulated?

Simulated. The lab platform uses Run:AI's open-source fake-gpu-operator to advertise nvidia.com/gpu resources without consuming real GPU hardware — that's how we keep per-session cost near zero while letting you exercise the real Kubernetes scheduling, RBAC, ResourceQuota, and GPU Operator workflow. The nvidia-smi output is generated by a fake binary that returns realistic data (driver version, memory, utilization). For NCA-AIIO content this is the right trade-off: the cert tests your ability to author manifests, debug the GPU Operator chain, and reason about scheduling — none of which require actual CUDA execution.

Why do I need runtimeClassName: nvidia in the Pod spec?

RuntimeClass is the Kubernetes API for selecting a non-default container runtime. nvidia is the conventional name for the RuntimeClass backed by the NVIDIA Container Toolkit runtime, which mounts the GPU driver libraries and nvidia-smi into the container. Without runtimeClassName: nvidia, the Pod gets the cluster's default runtime (typically runc) and the container has no view of the GPU, even if the request says nvidia.com/gpu: 1. NCA-AIIO Domain 3 (Workloads) specifically tests RuntimeClass knowledge.
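For reference, the object that runtimeClassName: nvidia points at typically looks like the sketch below. This assumes the node's container runtime has a handler registered under the name nvidia; the lab cluster presumably ships this object already, so you would inspect it rather than create it.

```yaml
# A RuntimeClass maps a name used in Pod specs to a runtime handler
# configured on the node. The handler value is an assumption here: it
# must match whatever name the node's container runtime config uses.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia        # the value Pods put in spec.runtimeClassName
handler: nvidia       # the runtime handler name on the node
# Inspect what the cluster already has:
#   kubectl get runtimeclass
```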

What does resources.limits.nvidia.com/gpu: 1 actually do?

It tells the scheduler: 'this Pod needs one device of type nvidia.com/gpu, please find a Node that has at least one available'. The Node advertises capacity via the device plugin (which is what fake-gpu-operator simulates). If you request more GPUs than any Node has free, the Pod stays Pending with a FailedScheduling event such as 0/1 nodes are available: 1 Insufficient nvidia.com/gpu. Your namespace's ResourceQuota also caps the total: a Pod that would exceed the quota is rejected by the API server at creation time, before scheduling is attempted.
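A ResourceQuota that caps GPU requests in a namespace can be sketched as follows. The quota name, namespace, and limit of 2 are illustrative assumptions, not the values the lab actually uses.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: gpu-quota               # hypothetical name
  namespace: default            # hypothetical namespace
spec:
  hard:
    # Extended resources are quota'd with the "requests." prefix only.
    requests.nvidia.com/gpu: "2"  # illustrative cap on total GPUs requested
# Compare current usage against the cap:
#   kubectl describe resourcequota gpu-quota
```

With a quota like this in place, creating Pods whose combined nvidia.com/gpu requests exceed 2 fails at kubectl apply time with a quota error, which is a different failure mode from a Pod that is admitted but stays Pending.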