Kubernetes Resource Requests & Limits — Who Gets What, and Who Survives
Master the most consequential six lines in any Kubernetes manifest: requests, limits, and how they decide scheduling, throttling, eviction, and survival under pressure. Includes the CFS throttling controversy and what 2026 production teams actually do with CPU limits.
What you'll learn
- 1. Admission: the rules before scheduling. Imagine you and four teammates share a single Kubernetes cluster with, say, 16 CPU cores, 64 GiB of RAM, and two GPUs in total. Everyone deploys whatever they want. Whose pods get the resources? Whose get killed when memory runs out?
- 2. Requests: what the scheduler reserves. You've now seen that LimitRange and ResourceQuota act as gatekeepers at the namespace boundary. Once a pod passes those, the scheduler takes over. For resource accounting, the scheduler looks at exactly one thing: requests, not limits and not actual usage.
- 3. Limits and the CPU controversy. Requests are about the *future*: what will be reserved for you. Limits are about the *present*: what the kubelet (the agent on the node) will let your container actually consume right now.
- 4. QoS classes: who survives node pressure. Kubernetes classifies every pod into one of three QoS (Quality of Service) classes the moment it's admitted. The classification is purely a function of how requests and limits are set.
- 5. GPU resources: the rules that surprise everyone. Now we zoom into nvidia.com/gpu. It's not measured in millicores or mebibytes; it's a count of whole devices. And it has rules that surprise everyone the first time.
- 6. Putting it all together: a production-shape GPU pod. You've now seen every concept the lab covers. Combine them into a single production-shape pod manifest, the kind you'd actually submit for a real GPU workload.
- 7. Triage: fix what's broken in your cluster. You've now learned the mental model. Time to use it. Three pods have just been deployed into your cluster, and each one is broken in a different way that the previous six steps taught you to recognize. Your job: find what's wrong with each, and fix it so all three are Running.
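As a rough illustration of the QoS step in the list above, the classification can be sketched as a small Python function. This is a simplified model, not the kubelet's implementation: the names `qos_class` and `effective_requests` are hypothetical, each container is represented by a dict shaped like its `resources` field, and quantities are compared as plain strings (so `"500m"` and `"0.5"` would wrongly count as different). Only `cpu` and `memory` take part in the classification, which is why a pod that sets nothing but `nvidia.com/gpu` still comes out as BestEffort.

```python
# Simplified sketch of Kubernetes QoS classification (assumption: string
# comparison of quantities; the real implementation parses them).
QOS_RESOURCES = ("cpu", "memory")  # extended resources such as GPUs are ignored


def effective_requests(resources):
    """Return requests, defaulting each one to its limit when only the limit is set."""
    requests = dict(resources.get("requests", {}))
    for name, value in resources.get("limits", {}).items():
        requests.setdefault(name, value)
    return requests


def qos_class(containers):
    """Classify a pod given a list of per-container `resources` dicts."""
    any_set = False          # does any container set a cpu/memory request or limit?
    all_guaranteed = True    # does every container have request == limit for both?
    for resources in containers:
        limits = resources.get("limits", {})
        requests = effective_requests(resources)
        for name in QOS_RESOURCES:
            if name in limits or name in requests:
                any_set = True
            if name not in limits or requests.get(name) != limits[name]:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


if __name__ == "__main__":
    print(qos_class([{}]))
    print(qos_class([{"requests": {"cpu": "500m", "memory": "1Gi"}}]))
    print(qos_class([{"limits": {"cpu": "1", "memory": "1Gi"}}]))
    print(qos_class([{"limits": {"nvidia.com/gpu": "1"}}]))
```

The third example sets limits only; because requests default to limits, the pod still lands in the Guaranteed class.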
Prerequisites
- Completed the welcome smoke-test lab
- Comfortable reading and editing YAML
- Familiar with kubectl get, apply, describe, logs
This is an intermediate-level AI/ML lab.
What you'll learn
Resource requests and limits are the most consequential six lines in any Kubernetes manifest, but most engineers learn them by copy-pasting examples until something works. This lab teaches the why behind each line, the precise difference between request and limit, why CPU and memory enforcement are fundamentally different, and what actually happens during a node pressure event. You'll watch the kernel OOM-kill a container that exceeds its memory limit, watch CFS bandwidth control throttle a CPU-bound process even when the node is idle, and see Kubernetes reject pods at admission for exceeding the namespace ResourceQuota.
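The throttling-on-an-idle-node effect mentioned above comes down to simple arithmetic, sketched here in Python. A CPU limit is translated into a CFS quota of CPU time per period (100 ms is the default period). The function names are hypothetical, and the model is deliberately idealized: it assumes the work starts at a period boundary, that threads run perfectly in parallel, and that nothing else competes for the CPU. Real kernels account for time in slices and may allow bursting, so treat the numbers as illustrative only.

```python
CFS_PERIOD_US = 100_000  # default CFS period: 100 ms


def cfs_quota_us(cpu_limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time (microseconds) the container may use per period under its limit."""
    return cpu_limit_millicores * period_us // 1000


def wall_time_us(cpu_work_us, cpu_limit_millicores, threads=1, period_us=CFS_PERIOD_US):
    """Idealized wall-clock time to finish `cpu_work_us` of CPU time on an idle node."""
    if cpu_limit_millicores <= 0 or threads <= 0:
        raise ValueError("limit and threads must be positive")
    quota = cfs_quota_us(cpu_limit_millicores, period_us)
    elapsed = 0
    remaining = cpu_work_us
    while True:
        # Per period the container can burn at most its quota, and at most
        # what its threads can physically consume.
        budget = min(quota, threads * period_us)
        if remaining <= budget:
            # Finishes inside this period; ceiling division over the threads.
            return elapsed + -(-remaining // threads)
        # Quota exhausted: throttled until the next period starts.
        remaining -= budget
        elapsed += period_us


if __name__ == "__main__":
    print(cfs_quota_us(500))                      # quota for a 500m limit
    print(wall_time_us(80_000, 500))              # one thread, 80 ms of work
    print(wall_time_us(80_000, 500, threads=4))   # four threads, same work
```

In this model a request needing 80 ms of CPU under a 500m limit takes 130 ms with one thread, and 107.5 ms with four threads, even though four unthrottled threads would finish in 20 ms.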
The lab also covers what most courses skip: the real-world controversy around CPU limits. Many production teams omit CPU limits on latency-sensitive workloads because CFS bandwidth control throttles a container once it has used up its quota for the current period, even when the node is otherwise idle. You'll see the throttling happen, then learn the mental model: requests only for latency-critical services, requests and limits for batch and untrusted workloads, and memory limits always, because memory has no equivalent of throttling (a container that exceeds its memory limit is OOM-killed, not slowed down). By the end, you'll be able to read any pod manifest and immediately spot which failure modes its author has and hasn't defended against.
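One way to make that rule of thumb concrete is a tiny checker, sketched below in Python. It is not part of the lab or of any Kubernetes tool: `check_resources` and its warning codes are hypothetical names, and the input is a dict shaped like a container's `resources` field plus a flag saying whether the workload is latency-critical.

```python
def check_resources(resources, latency_critical):
    """Return warning codes for one container under the rule of thumb above."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    warnings = []

    # Memory cannot be throttled, so a memory limit is always expected.
    if "memory" not in limits:
        warnings.append("no-memory-limit")

    # Without a CPU request (or a limit that defaults into one),
    # the scheduler reserves no CPU for the container.
    if "cpu" not in requests and "cpu" not in limits:
        warnings.append("no-cpu-request")

    if latency_critical and "cpu" in limits:
        # A CPU limit exposes the service to CFS throttling.
        warnings.append("cpu-limit-on-latency-critical")
    if not latency_critical and "cpu" not in limits:
        # Batch or untrusted work should be capped.
        warnings.append("no-cpu-limit-on-batch")

    return warnings


if __name__ == "__main__":
    api = {"requests": {"cpu": "500m", "memory": "1Gi"}, "limits": {"memory": "1Gi"}}
    print(check_resources(api, latency_critical=True))
    print(check_resources({}, latency_critical=False))
```

The first example (CPU request, no CPU limit, memory request equal to memory limit) is the shape the rule of thumb recommends for a latency-critical service, so it produces no warnings.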