01 · The Bait — what predicts GDP?

Correlations with \(\log(\text{GDP})\), stratified by role

All 493 features plotted on the same \(|r|\)-axis, one horizontal band per role. The causal and incidental swarms bleed into each other; the spurious swarm reaches surprisingly far to the right — its strongest feature outranks roughly half of the causal ones. Toggle below to switch back to the ranked-bar views if you want the old behaviour.
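For readers who want to reproduce the banded view offline, here is a minimal sketch of the computation behind it: one Pearson \(|r|\) per feature against \(\log(\text{GDP})\), grouped by role. The function name `abs_pearson_by_role`, the array layout, and the tiny synthetic table are all assumptions made for illustration; none of them come from the page's own code or data.

```python
import numpy as np

def abs_pearson_by_role(X, log_gdp, roles):
    """Return {role: sorted |r| values} for the columns of X against log_gdp.

    X is (n_rows, n_features); roles holds one label per column.
    Constant columns are not handled (they would divide by zero).
    """
    Xc = X - X.mean(axis=0)            # centre every feature column
    yc = log_gdp - log_gdp.mean()      # centre the target
    # Pearson r for all columns at once: dot product over product of norms.
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    abs_r = np.abs(r)
    roles = np.asarray(roles)
    return {str(role): np.sort(abs_r[roles == role]) for role in np.unique(roles)}

# Hypothetical stand-in for the real table: 60 rows, 4 features.
rng = np.random.default_rng(0)
n = 60
log_gdp = rng.normal(size=n)
X = np.column_stack([
    log_gdp + rng.normal(scale=1.0, size=n),  # causal-like: signal plus noise
    log_gdp + rng.normal(scale=1.5, size=n),  # incidental-like: weaker signal
    rng.normal(size=n),                       # spurious: pure noise
    rng.normal(size=n),                       # spurious: pure noise
])
roles = ["causal", "incidental", "spurious", "spurious"]

bands = abs_pearson_by_role(X, log_gdp, roles)
for role, values in bands.items():
    print(role, np.round(values, 2))
```

Each entry of `bands` corresponds to one horizontal band of the plot; the overlap between bands is what the swarm view makes visible.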

Scatter explorer

Pick a feature and see it plotted against GDP per capita. Points are coloured by role, so you can watch a convincing-looking line emerge from a spurious column and a messy cloud appear around a causal one. Toggle log scales when the distribution is skewed.


Takeaway

Correlation never told you what you thought it was telling you. Incidental columns match causal ones shoulder for shoulder. The strongest of 333 spurious features still lands at \(|r| \approx 0.47\) — not for any reason, but because hundreds of random shots at a target will put a few near the bullseye. Next chapter: you stop looking and start fitting. It goes poorly.
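The "hundreds of random shots" point can be checked with a small simulation. It is only a sketch: it draws 333 pure-noise features, correlates each with an unrelated target, and records the largest \(|r|\). The row count `N_ROWS` is a hypothetical placeholder, since the text above does not state how many observations the dataset has, and Gaussian noise is an assumption about what "spurious" means here.

```python
import numpy as np

def max_abs_r_of_noise(n_rows, n_features, rng):
    """Largest |Pearson r| between a random target and n_features noise columns."""
    y = rng.normal(size=n_rows)
    X = rng.normal(size=(n_rows, n_features))
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return float(np.abs(r).max())

N_ROWS = 40        # hypothetical: the real number of observations is not given
N_SPURIOUS = 333   # from the text: number of spurious features
N_TRIALS = 200     # repeat to see how stable the maximum is

rng = np.random.default_rng(0)
draws = np.array([max_abs_r_of_noise(N_ROWS, N_SPURIOUS, rng)
                  for _ in range(N_TRIALS)])

print("median of max |r| across trials:", round(float(np.median(draws)), 2))
print("range across trials:", round(float(draws.min()), 2),
      "to", round(float(draws.max()), 2))
```

The largest correlation grows with the number of noise features tried and shrinks as rows are added, so changing `N_ROWS` moves the result substantially. The qualitative point survives either way: none of these columns carries any signal, yet the best of them never comes out near zero.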
