Probability Deep Dive · 8 min read

Standard Deviation: The Most Misunderstood Number in Statistics

Standard deviation is not just a formula; it is a measure of how far a typical observation sits from your average. Understanding it changes how you read every statistic you encounter.

The Probability Lab Team

July 6, 2025

The mean is the most natural summary of a dataset. But the mean alone is almost always misleading. Two datasets can have identical means and wildly different behavior. This is where standard deviation earns its place as the second most important number in any statistical analysis.

What variance measures, and why we take the square root

Variance is the average squared distance from the mean. We square the distances for two reasons: to make all deviations positive, and to penalize large deviations more heavily than small ones. A deviation of 10 contributes 100 to the variance; a deviation of 1 contributes only 1. Squaring amplifies outliers.
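As a small illustration, assuming two made-up datasets (`steady` and `volatile` are hypothetical names, not data from this article), the sketch below computes the population variance directly from the definition. Both datasets share a mean of 50, yet their variances differ by a factor of 400.

```python
def population_variance(values):
    """Average squared distance from the mean (divides by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


# Hypothetical datasets chosen so that both means equal 50.
steady = [48, 49, 50, 51, 52]
volatile = [10, 30, 50, 70, 90]

mean_steady = sum(steady) / len(steady)
mean_volatile = sum(volatile) / len(volatile)
var_steady = population_variance(steady)
var_volatile = population_variance(volatile)

print(mean_steady, var_steady)      # same mean, small variance
print(mean_volatile, var_volatile)  # same mean, large variance
```

The deviations of 40 in `volatile` each contribute 1600 to the sum, which is why squaring makes the spread of that dataset dominate so strongly.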

Variance and Standard Deviation

Population variance:  σ² = Σ(xᵢ − μ)² / N
Sample variance:       s² = Σ(xᵢ − x̄)² / (n − 1)
Standard deviation:    σ  = √σ²

Why n−1 for samples? Bessel's correction: the sample mean x̄ is itself
estimated from the data, which uses up one degree of freedom. Deviations
from x̄ are, on average, smaller than deviations from the true mean μ,
so dividing by n would systematically underestimate the population variance.
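One way to check Bessel's correction without relying on random simulation is to enumerate every possible sample. The sketch below assumes a hypothetical population (the six faces of a fair die) and samples of size 2 drawn with replacement; it uses Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n − 1.

```python
import itertools
import math
import statistics

# Hypothetical population: the faces of a fair die.
population = [1, 2, 3, 4, 5, 6]
true_variance = statistics.pvariance(population)  # divides by N

n = 2  # sample size (an assumption for this illustration)
samples = list(itertools.product(population, repeat=n))  # all 36 samples

# Average each estimator over every possible sample.
divide_by_n = [statistics.pvariance(s) for s in samples]
divide_by_n_minus_1 = [statistics.variance(s) for s in samples]

mean_biased = statistics.fmean(divide_by_n)
mean_unbiased = statistics.fmean(divide_by_n_minus_1)

print(f"true variance:        {true_variance:.4f}")
print(f"average using n:      {mean_biased:.4f}")
print(f"average using n - 1:  {mean_unbiased:.4f}")
```

Averaged over all samples, the n − 1 version recovers the population variance, while the n version falls short by the factor (n − 1)/n.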

Taking the square root returns variance to the original unit of measurement. If you measured heights in centimetres, variance is in cm², a unit with no direct interpretation for heights. Standard deviation is in centimetres, directly interpretable.

The 68-95-99.7 rule

For any normally distributed variable, the standard deviation determines the probability of observations falling within certain ranges:

Range	% of observations	Interpretation
μ ± 1σ	68.27%	About two-thirds of all data
μ ± 2σ	95.45%	Close to the conventional 95% band (exactly 95% at ±1.96σ)
μ ± 3σ	99.73%	Events outside this are rare (≈1 in 370)
μ ± 6σ	99.9999998%	The target of Six Sigma manufacturing
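The percentages in this table can be reproduced from the normal distribution itself. For a normal variable, the fraction of observations within k standard deviations of the mean is erf(k / √2). The sketch below evaluates this with `math.erf` from the standard library; `fraction_within` is a name chosen for this illustration.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3, 6):
    print(f"within ±{k}σ: {fraction_within(k):.7%}")

# How rare is an observation outside 3 sigma? Roughly 1 in this many:
one_in = 1 / (1 - fraction_within(3))
print(f"outside ±3σ: about 1 in {one_in:.0f}")
```

These figures hold only when the variable really is normally distributed; the function says nothing about other shapes.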

When standard deviation misleads you

Standard deviation is most informative when the data is roughly symmetric and has no extreme outliers. When these conditions break down, σ becomes a poor summary. A distribution with fat tails (financial returns, earthquake magnitudes, internet traffic) has a standard deviation that can dramatically understate real risk.

Long-Term Capital Management used standard deviation to model bond portfolio risk. In 1998, a "10σ event" occurred, one the model predicted would happen once in the history of the universe. The fund lost $4.6 billion in less than four months and required a rescue by private banks, organized by the Federal Reserve Bank of New York.

The lesson is not that standard deviation is wrong; it is that it was applied to a distribution where its assumptions did not hold. Knowing when σ is the right tool requires understanding both what it measures and what it ignores.
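To get a feel for how implausible a 10σ move is when the normal assumption is taken at face value, the tail probability can be computed directly. This is only a sketch under stated assumptions: it uses a plain standard normal (not any model LTCM actually ran), counts moves in either direction, and assumes one observation per trading day with 252 trading days per year.

```python
import math


def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))


p_10_sigma = two_sided_tail(10)

# Assumption for illustration: one independent observation per trading day.
TRADING_DAYS_PER_YEAR = 252
expected_wait_years = 1 / p_10_sigma / TRADING_DAYS_PER_YEAR

print(f"P(|Z| > 10) = {p_10_sigma:.3e}")
print(f"expected wait = {expected_wait_years:.3e} years")
```

When an event this improbable under the model actually happens, the more reasonable conclusion is that the model's tails were too thin, not that the world was unlucky.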

Coefficient of variation: comparing across scales

Coefficient of Variation

CV = (σ / μ) × 100%

Example: Two manufacturing processes, both targeting a diameter of 10mm.
Process A: σ = 0.1mm  →  CV = 1%
Process B: σ = 0.5mm  →  CV = 5%

Process B has five times the relative variability of Process A.
The CV is scale-free, which makes it useful for comparing measurements
on very different scales. It is only meaningful when the mean is not zero.
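The calculation above can be sketched in a few lines. The function name `coefficient_of_variation` and the millimetre-to-inch conversion are assumptions added for illustration; the process figures are the ones from the example.

```python
import math


def coefficient_of_variation(sigma, mu):
    """Return the CV as a percentage: (sigma / mu) * 100."""
    if mu == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sigma / abs(mu) * 100


cv_a = coefficient_of_variation(0.1, 10.0)  # Process A, in mm
cv_b = coefficient_of_variation(0.5, 10.0)  # Process B, in mm

# Scale-free check: convert Process A from millimetres to inches.
MM_PER_INCH = 25.4
cv_a_inches = coefficient_of_variation(0.1 / MM_PER_INCH, 10.0 / MM_PER_INCH)

print(f"Process A: {cv_a:.1f}%  Process B: {cv_b:.1f}%")
print(f"Process A in inches: {cv_a_inches:.1f}%")
```

Because both σ and μ are rescaled by the same factor, changing units leaves the CV unchanged, which is what "scale-free" means here.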

Our Roulette Simulator lets you observe standard deviation live. Track the frequency of a single number over many spins. You will see the count oscillate around the expected value of spins/37 (European wheel). The standard deviation of the count grows with the number of spins, but it shrinks relative to the mean as the spin count grows. The standard deviation of your observed count is doing exactly what the binomial formula √(np(1−p)) predicts, with p = 1/37.
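The same behaviour can be sketched offline. The code below is not the Roulette Simulator itself; it is a minimal stand-in that assumes a fair European wheel with 37 pockets, an arbitrary target number (17), and a fixed random seed so the run is repeatable. The function names are hypothetical.

```python
import math
import random

P_SINGLE = 1 / 37  # chance of one specific number on a European wheel


def expected_count(spins):
    """Expected number of hits on a single number."""
    return spins * P_SINGLE


def count_sd(spins):
    """Binomial standard deviation of the hit count: sqrt(n p (1 - p))."""
    return math.sqrt(spins * P_SINGLE * (1 - P_SINGLE))


def relative_sd(spins):
    """Standard deviation of the count divided by its mean."""
    return count_sd(spins) / expected_count(spins)


def simulate_count(spins, target=17, seed=0):
    """Count how often `target` comes up in `spins` simulated spins."""
    rng = random.Random(seed)
    return sum(1 for _ in range(spins) if rng.randrange(37) == target)


for spins in (370, 3_700, 37_000):
    observed = simulate_count(spins)
    print(
        f"{spins:>6} spins: observed {observed}, "
        f"expected {expected_count(spins):.0f} ± {count_sd(spins):.1f} "
        f"(relative SD {relative_sd(spins):.1%})"
    )
```

The absolute standard deviation grows like √n while the mean grows like n, so the relative spread falls like 1/√n. That is why the observed frequency looks steadier the longer you watch.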
