PriorityClass & Preemption — Who Survives the GPU Squeeze
When GPU capacity is full and a critical training job lands, who wins? This lab builds the mental model behind PriorityClass and preemption — the only mechanism Kubernetes gives you for resolving GPU contention with intent rather than first-come-first-served. Includes the `preemptionPolicy: Never` escape hatch most teams misuse.
What you'll learn
- 1. The contention problem: what happens when GPUs run out
  You're sharing a 4-GPU cluster with three teams. By 2pm, all 4 GPUs are running long batch jobs. A critical inference deployment ships at 3pm and needs 1 GPU. *What happens?*
- 2. Reading PriorityClasses: value, preemptionPolicy, globalDefault
  A PriorityClass is a tiny cluster-scoped object, and three fields do all the work. Understanding what each one controls is the difference between writing a correct policy and writing one that breaks the wrong way.
- 3. Preemption in action: watch the scheduler evict a low-pri pod
  The cluster is already full. The lab platform pre-deployed 4 low-priority batch pods (`batch-1` through `batch-4`), each holding 1 GPU. The node's 4 GPUs are 100% allocated. Run `kubectl get pods` to see them.
- 4. `preemptionPolicy: Never` (high priority, polite waiting)
  Sometimes a workload is genuinely high-priority but you *don't* want it to evict running pods. Common cases:
- 5. Triage Day: three pods broken in priority-related ways
  The cluster is full again: 4 low-priority `batch-*` pods are using all 4 GPUs. Three high-priority pods have just been deployed, but each has a different priority misconfiguration, so all three are Pending instead of Running.
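As a minimal sketch of the three fields named in step 2, the manifest below defines a single PriorityClass. The name `gpu-batch-low`, the value `1000`, and the description are illustrative assumptions, not values taken from the lab.

```yaml
# Hypothetical PriorityClass; the name, value, and description are illustrative only.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-batch-low                     # cluster-scoped, so there is no namespace field
value: 1000                               # integer rank; pods with higher values are scheduled first
preemptionPolicy: PreemptLowerPriority    # the default; the only other option is Never
globalDefault: false                      # if true, pods that name no PriorityClass get this one
description: "Low-priority batch work that may be evicted under GPU pressure."
```

Only one PriorityClass in a cluster can set `globalDefault: true`, so in a sketch like this it is usually left `false` unless the class is meant to be the cluster-wide fallback.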
Prerequisites
- Completed `nca-aiio-resource-requests-limits` (or comfortable with QoS classes, ResourceQuota, and pod admission)
- Familiar with `kubectl get pods -w`, `kubectl describe pod`, and reading scheduler events
Skills & technologies you'll practice
This is an intermediate-level AI/ML lab.
What you'll learn
PriorityClass and preemption are how production teams resolve GPU contention at the Kubernetes scheduler level. Without them, the only outcome of a full cluster is a Pending pod queue ordered by submission time; there's no notion of "this training job matters more than that throwaway notebook." This lab teaches the complete model: how Kubernetes uses an integer priority value to rank pods, how the scheduler decides which low-priority pods to evict to make room for a high-priority pod, why `preemptionPolicy: Never` is what you use when you want priority ranking without disruption, and the failure modes when a pod references a PriorityClass that doesn't exist.
By the end you'll have authored three PriorityClasses (low, medium, high), watched the scheduler preempt a low-priority pod when GPU capacity is exhausted, and diagnosed three pods broken in priority-related ways. This is the daily work of a platform engineer running a multi-tenant GPU cluster, and it's a major topic in the NCA-AIIO exam's AI Infrastructure & Operations domain.
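The three PriorityClasses and a GPU pod that uses one of them might look like the sketch below. The class names, the priority values, the pod name, and the image are placeholders chosen for illustration, not the lab's actual manifests; the `nvidia.com/gpu` resource name assumes the NVIDIA device plugin is installed on the node.

```yaml
# Three hypothetical PriorityClasses; only the relative order of the values matters.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-low
value: 1000
description: "Throwaway notebooks and batch jobs."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-medium
value: 10000
description: "Regular training jobs."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-high
value: 100000
description: "Critical training and inference."
---
# A pod that opts into the highest class and asks for one GPU.
apiVersion: v1
kind: Pod
metadata:
  name: critical-inference                # placeholder pod name
spec:
  priorityClassName: gpu-high             # must match an existing PriorityClass
  containers:
    - name: inference
      image: registry.example.com/inference:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1               # assumes the NVIDIA device plugin exposes this resource
```

If every GPU is held by pods in `gpu-low`, a pod like this one is a candidate to trigger preemption, because `preemptionPolicy` is omitted and therefore defaults to `PreemptLowerPriority`.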
Frequently asked questions
What's the difference between priority and preemption?
What is `preemptionPolicy: Never` and when should I use it?
A pod with `preemptionPolicy: Never` and a high priority value is placed ahead of lower-priority pods in the scheduling queue, but it will not evict any running lower-priority pod to make room. If the cluster is full, it just waits. Use this for high-priority workloads that can tolerate waiting (long batch training that can run overnight) when disrupting lower-priority work would be worse than the wait. Most teams set it incorrectly: they want disruption-free ordering but accidentally turn off the entire preemption mechanism.
How does the scheduler pick which lower-priority pod to evict?
`system-cluster-critical` (2000000000) and `system-node-critical` (2000001000) are essentially never preempted. Those values sit above the 1000000000 cap on user-defined priorities and protect things like node-critical daemons such as kube-proxy, control-plane components, and DNS.
Why would my high-priority GPU pod stay Pending even with preemption enabled?
One cause is that `preemptionPolicy: Never` is set on your PriorityClass: you got the priority ordering but explicitly turned off eviction. Diagnose with `kubectl describe pod` and look at the scheduler events for the explanation.
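For comparison, a non-preempting class differs from a preempting one by a single field. The sketch below is illustrative only: the name `gpu-high-no-preempt` and the value are assumptions, not objects defined by the lab.

```yaml
# Hypothetical high-priority class that queues ahead of others but never evicts.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-high-no-preempt
value: 100000                    # still ranks above lower-valued classes in the scheduling queue
preemptionPolicy: Never          # pods in this class wait for capacity instead of evicting others
globalDefault: false
description: "High-priority work that can tolerate waiting for a free GPU."
```

A pod that references this class through `priorityClassName` stays Pending on a full cluster until a GPU frees up on its own, which is the intended behavior for this policy and the accidental one when the field was set by mistake.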