GPU-Accelerated Data Science with RAPIDS
GPU sandbox · jupyter
Beta

Rewrite a pandas + sklearn data-science pipeline on GPU using cuDF and cuML, benchmark each stage against the CPU baseline, and run an end-to-end filter -> feature-engineer -> predict pipeline that never leaves the GPU.

40 min · 4 steps · 2 domains · Intermediate · ncp-adsnca-genl

What you'll learn

  1. cuDF vs pandas on a realistic groupby
  2. cuML KMeans clustering
  3. cuML RandomForest vs sklearn RandomForest
  4. End-to-end GPU pipeline

Prerequisites

  • Comfortable with pandas DataFrames
  • Familiarity with scikit-learn model training basics
  • Understanding of groupby / feature engineering

Exam domains covered

GPU Acceleration & Distributed Training · Data Engineering & Preparation

Skills & technologies you'll practice

This intermediate-level GPU lab gives you real-world reps across:

RAPIDS · cuDF · cuML · pandas · sklearn · KMeans · RandomForest · GPU Data Science

What you'll build in this RAPIDS / cuDF / cuML lab

RAPIDS is how data scientists stop throwing work over the wall to ML engineers: cuDF and cuML keep the pandas + sklearn APIs you already know but run them on the GPU, which means the same feature engineering that ends up feeding a PyTorch model can live on the device the model will train on, with no round-trip. In about 40 minutes on a real NVIDIA GPU we provision, you'll port a pandas groupby and a sklearn RandomForest workflow to cuDF and cuML, measure the speedup yourself at each stage (typically 10-100x on a million-row groupby, narrower on classification), and walk away with a working mental model of when GPU data science actually pays off, plus the specific anti-pattern (.to_pandas() mid-pipeline) that silently kills the win.

The substance is the cost model of GPU data science. You'll build a 1M+-row synthetic e-commerce events table, run the identical groupby-aggregate in pandas and cuDF with the same API surface, then fit cuml.KMeans on features materialised directly from the cuDF DataFrame (no pandas detour), and train a cuml.RandomForestClassifier alongside sklearn on a 200k-row subset so you can compare both wall-clock times and accuracies, which should agree within the expected ~15-point bound (same algorithm, but cuML uses histogram-based GPU split-finding rather than sklearn's exact splits).

The deliberately tricky step is the end-to-end pipeline. A ~3% refund rate means a default-threshold RandomForest often predicts all zeros and still scores ~97%: that's the accuracy-optimal constant predictor given the imbalance, not a broken model, and recognising it is the exact gap between an ML engineer who reads metrics and one who fixes pipelines (threshold tuning, class weights, PR-AUC or recall@k as the primary metric). The grader enforces isinstance(result_df, cudf.DataFrame) at the end specifically to catch the .to_pandas() anti-pattern that shows up in almost every RAPIDS pipeline review: PCIe round-trips dwarf compute, and a single silent conversion can eat a 50x speedup without changing any numbers.
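A minimal sketch of the Step 1 comparison. The schema and helper names (`make_events`, `bench_groupby`) are illustrative, not the lab's actual code, and the cuDF branch runs only where RAPIDS is installed; everywhere else it falls back to the pandas timing alone:

```python
import time

import numpy as np
import pandas as pd

try:
    import cudf  # RAPIDS GPU DataFrame; needs an NVIDIA GPU + CUDA
except ImportError:
    cudf = None

def make_events(n=1_000_000, seed=0):
    """Synthetic e-commerce events table (illustrative schema)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "user_id": rng.integers(0, 50_000, n),
        "category": rng.integers(0, 20, n),
        "price": rng.uniform(1, 500, n).round(2),
    })

def bench_groupby(df):
    """The identical call works on a pandas or a cuDF DataFrame."""
    t0 = time.perf_counter()
    out = df.groupby("category").agg({"price": ["mean", "sum"],
                                      "user_id": "nunique"})
    return out, time.perf_counter() - t0

events = make_events()
cpu_out, cpu_s = bench_groupby(events)
print(f"pandas: {cpu_s:.3f}s over {len(cpu_out)} groups")

if cudf is not None:
    gdf = cudf.from_pandas(events)           # one host->device copy, up front
    _ = bench_groupby(gdf)                   # warm-up: first call pays alloc/JIT cost
    gpu_out, gpu_s = bench_groupby(gdf)
    print(f"cuDF: {gpu_s:.3f}s -> {cpu_s / gpu_s:.1f}x speedup")
```

Note the warm-up call: timing the very first cuDF operation mixes one-off initialisation cost into the measurement and understates the steady-state speedup.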

Prerequisites: comfort with pandas DataFrames, sklearn basics, and groupby-style feature engineering. The sandbox has cuDF, cuML, pandas, sklearn, and cupy preinstalled, with CUDA versions matched to RAPIDS. If you've searched for "pandas to cuDF migration", "cuML vs sklearn RandomForest", "RAPIDS GPU speedup benchmark", "when to use cuDF vs Polars", or "GPU data science pipeline", the lab answers each with real timings and the specific failure modes you'll hit. The same pipeline code scales to tens of millions of rows on a single A100/H100 before you'd need to reach for Dask-cuDF.

Frequently asked questions

How much speedup should I actually expect from cuDF vs pandas?

Workload-dependent, but on a groupby over a million-row table you typically see 10-100x. The gap widens with larger tables and aggregations that benefit from GPU parallelism (joins, wide groupbys, string operations) and narrows on small data where PCIe transfer cost dominates or on operations cuDF hasn't specialized. The lab prints your measured speedup at the end of Step 1 — treat it as a datapoint, not a guarantee, and always benchmark on your own workload before claiming a production win.

Can I swap cuDF for Polars — they're both faster than pandas?

Different tools for different problems. Polars is a CPU-multithreaded DataFrame library with a lazy query planner — brilliant for wrangling data that fits in RAM on a beefy CPU box. cuDF is a GPU DataFrame that compiles a subset of pandas operations onto CUDA — the advantage is decisive when your data lives next to a GPU model and you want zero-copy hand-off to cuML, cuGraph, or PyTorch. If your pipeline ends at a CSV, Polars often wins. If it ends at a GPU model, cuDF keeps the data on the device where it belongs.

Why does the lab care so much about avoiding .to_pandas() mid-pipeline?

Because every round-trip is a PCIe copy — gigabytes moving from GPU VRAM to system RAM and back. It's the single most common way RAPIDS pipelines silently lose their speedup. Step 4's grader checks isinstance(result_df, cudf.DataFrame) specifically because a pipeline that runs filter on GPU → .to_pandas() for feature engineering → back to GPU for prediction will produce correct numbers and terrible throughput. Once the data is on the device, keep it there until the very end.
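A sketch of the pattern Step 4 grades (the function, columns, and threshold here are hypothetical, not the lab's actual pipeline). The import alias lets the same structure run against pandas where cuDF is absent; the point is that nothing inside the pipeline ever calls .to_pandas():

```python
try:
    import cudf as xdf   # on the lab's GPU pod, everything below stays in VRAM
except ImportError:
    import pandas as xdf  # CPU fallback so the structure runs anywhere

def pipeline(events):
    """filter -> feature-engineer -> model-ready features, with no .to_pandas()."""
    recent = events[events["price"] > 10]         # filter on device
    feats = recent.groupby("user_id").agg(        # feature-engineer on device
        n_events=("price", "count"),
        total_spend=("price", "sum"),
    ).reset_index()
    feats["avg_spend"] = feats["total_spend"] / feats["n_events"]
    return feats                                  # hand straight to cuML, still on device

events = xdf.DataFrame({"user_id": [1, 1, 2, 2],
                        "price": [5.0, 20.0, 30.0, 45.0]})
result_df = pipeline(events)
assert isinstance(result_df, xdf.DataFrame)       # the check Step 4's grader makes
print(result_df)
```

If you genuinely need a CPU-side copy (say, for matplotlib), take it once at the very end, after every filter, join, and predict has run on the device.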

Why does cuML RandomForest's accuracy differ slightly from sklearn's?

Same algorithm, different implementation — cuML builds trees with a histogram-based GPU algorithm that's numerically close to but not identical to sklearn's exact split-finding. On clean synthetic data they match within a percent or two; on messy real data they can diverge more. The 15-point bound in Step 3 is a sanity check that you're not comparing apples to oranges (wrong features, wrong train/test split) rather than an expectation of exact parity.
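The shape of that Step 3 comparison, sketched with synthetic data (the hyperparameters are illustrative; the lab's dataset and settings differ). The cuML branch is guarded so the sklearn half runs on any machine:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
X = X.astype(np.float32)   # cuML prefers float32 input
y = y.astype(np.int32)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

cpu = RandomForestClassifier(n_estimators=100, max_depth=16, random_state=0)
cpu.fit(Xtr, ytr)
cpu_acc = accuracy_score(yte, cpu.predict(Xte))
print(f"sklearn accuracy: {cpu_acc:.3f}")

try:
    from cuml.ensemble import RandomForestClassifier as cuRF
    gpu = cuRF(n_estimators=100, max_depth=16, n_bins=128)
    gpu.fit(Xtr, ytr)
    gpu_acc = accuracy_score(yte, gpu.predict(Xte))
    # Histogram-based GPU splits: close to sklearn's exact splits, never bit-identical.
    print(f"cuML accuracy: {gpu_acc:.3f}, gap: {abs(cpu_acc - gpu_acc):.3f}")
except ImportError:
    pass  # no cuML/GPU in this environment
```

On clean data like this the gap is typically a point or two; a gap approaching the 15-point bound usually means the two models were fed different features or splits.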

The model predicted all zeros on the refund data — is it broken?

Probably not. With a ~3% positive rate and the default 0.5 threshold, predicting the majority class is the Bayes-optimal constant predictor for maximizing accuracy — the model hasn't failed, the metric has. Fix it at the pipeline level: tune the decision threshold, use class_weight='balanced', oversample the minority class with SMOTE, or switch the primary metric to PR-AUC or recall@k. The reflection step pushes you to name the right fix rather than staring at the accuracy number.
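The effect is easy to reproduce without any model. Below, the `scores` array is a synthetic stand-in for RandomForest refund probabilities (the distributions are invented for illustration): a constant-zero predictor already scores ~97% accuracy, while lowering the decision threshold trades precision for the recall that actually matters:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
y = (rng.random(n) < 0.03).astype(int)   # ~3% refund rate

# Stand-in for model probabilities: positives score higher, but noisily.
scores = np.clip(0.04 + 0.40 * y + rng.normal(0, 0.08, n), 0, 1)

acc_constant = (y == 0).mean()           # the "predict all zeros" baseline
print(f"constant-zero accuracy: {acc_constant:.3f}")

def recall_precision(thresh):
    pred = scores >= thresh
    tp = (pred & (y == 1)).sum()
    recall = tp / max(y.sum(), 1)
    precision = tp / max(pred.sum(), 1)
    return recall, precision

for t in (0.5, 0.2, 0.1):
    r, p = recall_precision(t)
    print(f"threshold {t:.1f}: recall={r:.2f} precision={p:.2f}")
```

At the default 0.5 threshold recall is dismal even though accuracy looks great; at a lower threshold recall jumps while precision drops, which is exactly the trade-off that threshold tuning, class weighting, or a PR-AUC objective makes explicit instead of hiding behind the accuracy number.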

Do I need a specific GPU to run this lab?

No — the lab provisions a real NVIDIA GPU pod on demand with RAPIDS preinstalled against a compatible CUDA version. You only need a browser. For your own machine, RAPIDS documents the supported SKUs and CUDA matrix — most Pascal-or-newer consumer and datacenter cards work, and the RAPIDS install instructions pin the right conda/pip channel for your driver version.