GPU Environment Smoke Test

Validate the GPU lab environment: terminal, file operations, PyTorch, CUDA, and model loading.

10 min · 4 steps · Beginner

What you'll learn

  1. Verify GPU Access
  2. Create a Tensor on GPU
  3. Load a Model with Transformers
  4. Train a Tiny Model (multi-file import)

What this GPU smoke-test lab verifies

Across four quick steps you'll verify that everything a Preporato GPU lab depends on is actually wired up correctly:

  1. Step 1 calls torch.cuda.is_available(), reads the GPU name with torch.cuda.get_device_name(0), and prints total VRAM in gigabytes.
  2. Step 2 allocates two 1000x1000 tensors directly on CUDA, times a matmul with torch.cuda.Event start/end records, repeats the same matmul on CPU, and confirms the GPU wins.
  3. Step 3 loads GPT-2 through AutoModelForCausalLM.from_pretrained('gpt2'), moves it to CUDA, runs a 50-token generate, and reports the VRAM in use.
  4. Step 4 imports a helper module from the workspace (proving multi-file Python imports work) and trains a tiny XOR network to >90% accuracy with BCE loss and Adam.
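The Step 1 check can be sketched in a few lines. This is a minimal sketch, not the lab's exact cell: `gpu_report` is a name invented here, but the calls (`torch.cuda.is_available()`, `torch.cuda.get_device_name(0)`, total VRAM in GB) are the ones the walkthrough names.

```python
import torch

def gpu_report():
    """Summarize the visible GPU, or report that CUDA is absent."""
    if not torch.cuda.is_available():
        return {"cuda_available": False}
    props = torch.cuda.get_device_properties(0)
    return {
        "cuda_available": True,
        "name": torch.cuda.get_device_name(0),
        "vram_gb": round(props.total_memory / 1024**3, 1),  # bytes -> GiB
    }

print(gpu_report())
```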

This is a pre-flight check, not a teaching lab. The point is to confirm in under 10 minutes that your browser terminal works, file operations in the workspace work, PyTorch sees the GPU, VRAM is at least 8 GB, and the Hugging Face cache can pull a model. If any step fails, you know the environment is broken before you sink an hour into a real lab and hit the same issue deep inside it. Total walkthrough: roughly 10 minutes, no prerequisites beyond basic Python.
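The Hugging Face cache check in Step 3 can be sketched as below. This is a hedged illustration built from the API calls the walkthrough names, not the lab's actual cell; the prompt string and variable names are assumptions, and the CPU fallback is added so the sketch runs anywhere.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

# Generate 50 new tokens from a short prompt (prompt text is an assumption)
inputs = tok("The GPU smoke test", return_tensors="pt").to(device)
out = model.generate(
    **inputs, max_new_tokens=50, do_sample=False, pad_token_id=tok.eos_token_id
)
text = tok.decode(out[0], skip_special_tokens=True)
print(text)

if device == "cuda":
    # How much VRAM the loaded model and its buffers currently occupy
    print(f"{torch.cuda.memory_allocated() / 1024**3:.2f} GB VRAM in use")
```

The first run pulls GPT-2 into the local Hugging Face cache, which is most of this step's wall-clock time; subsequent runs load from disk.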

Frequently asked questions

What should I do if Step 1 reports cuda_available = False?

That means the kernel isn't seeing a GPU — almost always a session-level environment issue rather than anything you did wrong. Restart the Jupyter kernel from the top menu and re-run the cell. If it still fails, end the lab session and start a fresh one; the runtime will provision a new GPU pod. If the problem persists across multiple fresh sessions, open a support ticket — that's an infrastructure issue, not a user issue.

Why does the lab check that GPU is faster than CPU in Step 2?

As a sanity check that the computation actually executed on the device you think it did. A common silent bug is a tensor that ends up on CPU because of an implicit .to('cpu') somewhere; the program runs fine but you lose the GPU speedup without noticing. The matmul timing comparison makes that visible: if the GPU isn't clearly faster than the CPU on a 1000x1000 matmul, something is off with how the tensors were dispatched.
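The timing comparison can be sketched like this. `matmul_time_ms` is a hypothetical helper, not the lab's code: the GPU branch uses the `torch.cuda.Event` records the walkthrough mentions, while the CPU branch falls back to `time.perf_counter` since CUDA events only time GPU work.

```python
import time
import torch

def matmul_time_ms(device: str, n: int = 1000) -> float:
    """Time one n x n matmul on the given device, in milliseconds."""
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    a @ b  # warm-up so one-time init doesn't pollute the measurement
    if device == "cuda":
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        a @ b
        end.record()
        torch.cuda.synchronize()  # wait for the kernel before reading timers
        return start.elapsed_time(end)
    t0 = time.perf_counter()
    a @ b
    return (time.perf_counter() - t0) * 1000

cpu_ms = matmul_time_ms("cpu")
if torch.cuda.is_available():
    gpu_ms = matmul_time_ms("cuda")
    assert gpu_ms < cpu_ms, "GPU matmul should beat CPU; check tensor placement"
```

The `synchronize()` call matters: CUDA kernels launch asynchronously, so reading the events before the kernel finishes would report garbage.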

Why is VRAM required to be at least 8 GB?

Because every real lab in the Preporato catalog — data prep with BPE training, Qwen2-VL captioning, vLLM serving, fine-tuning with LoRA — assumes at minimum the memory headroom a modern datacenter or midrange workstation GPU provides. 8 GB is the floor that lets GPT-2-small, TinyLlama, BGE, and a forward pass of most vision models coexist with activations and KV cache. If you see less than that, the session spun up on unexpected hardware and the heavier labs will OOM.
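The floor can be checked in a couple of lines. `MIN_VRAM_GB` and `bytes_to_gb` are names invented for this sketch; the property read (`torch.cuda.get_device_properties(0).total_memory`) is the standard PyTorch way to get total VRAM in bytes.

```python
import torch

MIN_VRAM_GB = 8.0  # floor assumed by the heavier labs in the catalog

def bytes_to_gb(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / 1024**3

if torch.cuda.is_available():
    total = torch.cuda.get_device_properties(0).total_memory
    gb = bytes_to_gb(total)
    assert gb >= MIN_VRAM_GB, f"only {gb:.1f} GB VRAM; heavier labs may OOM"
```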

What is data.py in Step 4 and why does the lab care about importing it?

It's a tiny helper shipped in the workspace alongside the notebook, exposing a get_xor_data('cuda') function that returns XOR inputs and labels on GPU. The point of the step isn't the XOR task — it's to verify that multi-file Python imports work inside the Jupyter kernel. Several real labs split utility code into .py files and import it from the notebook, so if this step fails, the real labs will fail on their first import cell.
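A plausible shape for that helper, assuming only the `get_xor_data(device)` signature described above (the lab text doesn't show the file's actual contents):

```python
# data.py -- hypothetical sketch of the workspace helper
import torch

def get_xor_data(device: str = "cpu"):
    """Return the four XOR input pairs and their float labels on `device`."""
    X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]], device=device)
    y = torch.tensor([[0.], [1.], [1.], [0.]], device=device)  # float for BCE
    return X, y
```

From the notebook, `from data import get_xor_data` followed by `X, y = get_xor_data('cuda')` is the whole test: if that import cell fails, multi-file imports are broken in the kernel and the real labs will hit the same wall.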

How long does this lab take end-to-end?

About 10 minutes if the environment is healthy, mostly spent on the GPT-2 download in Step 3 (cached after the first run) and the XOR training loop in Step 4. If any step takes significantly longer than a couple of minutes on its own, the session is throttled or the model cache isn't populated — retry in a fresh session.