You work for the UN. Your brief: predict GDP per capita. So you pull everything you can lay hands on — rule of law, life expectancy, but also UNESCO sites, vowel counts, the Scrabble score of each country’s name. 493 columns in total. For this tutorial we’ve quietly tagged every column as causal, spurious, or incidental. In real life you wouldn’t know. Five chapters. One investigation. The data fights back.
Everything on this site is computed live in your browser from
gdp_spurious_regression_dataset.csv (254 rows × 495 columns) and
codebook.csv, both lifted directly from the
eth-bmai-fs26/coding-exercises Colab notebooks. No pre-baked charts,
no cherry-picked numbers. Pick a feature, toggle a scale, and watch the story rearrange
itself.
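
Why should a table this wide worry you? Here is a minimal sketch, using synthetic noise rather than the real dataset. It assumes 254 rows and 493 candidate features to match the numbers above. Every feature and the target are independent random draws, so any correlation it finds is spurious by construction. The helper names `pearson` and `max_abs_noise_correlation` are hypothetical and not taken from the Colab notebooks.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def max_abs_noise_correlation(n_rows=254, n_features=493, seed=0):
    """Largest |r| between a random target and purely random features.

    Assumption: standard-normal noise stands in for both the target and
    the features; nothing here reads the real CSV.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0.0, 1.0) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_rows)]
        best = max(best, abs(pearson(feature, target)))
    return best


best = max_abs_noise_correlation()
print(f"strongest correlation among pure-noise features: {best:.2f}")
```

With this many columns and this few rows, the best-looking noise feature usually lands well above zero, which is why a high correlation alone cannot tell you whether a column is causal, spurious, or incidental.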