Probability Deep Dive · 8 min read

Standard Deviation: The Most Misunderstood Number in Statistics

Standard deviation is not just a formula — it is a measure of how wrong your average is. Understanding it changes how you read every statistic you encounter.

The Probability Lab Team
July 6, 2025

The mean is the most natural summary of a dataset. But the mean alone is almost always misleading. Two datasets can have identical means and wildly different behavior. This is where standard deviation earns its place as the second most important number in any statistical analysis.
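
To make the "identical means, different behavior" point concrete, here is a minimal sketch using Python's standard statistics module. The two datasets are invented purely for illustration.

```python
import statistics

# Two made-up datasets with the same mean but very different spread.
steady = [48, 49, 50, 51, 52]
erratic = [10, 30, 50, 70, 90]

mean_steady = statistics.mean(steady)
mean_erratic = statistics.mean(erratic)

# pstdev treats each list as a whole population (it divides by N).
sd_steady = statistics.pstdev(steady)
sd_erratic = statistics.pstdev(erratic)

print(f"steady : mean = {mean_steady}, sd = {sd_steady:.2f}")
print(f"erratic: mean = {mean_erratic}, sd = {sd_erratic:.2f}")
```

Both lists average 50, yet the second one's standard deviation is twenty times larger. The mean alone cannot tell them apart.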

What variance measures — and why we take the square root

Variance is the average squared distance from the mean. We square the distances for two reasons: to make all deviations positive, and to penalize large deviations more heavily than small ones. A deviation of 10 contributes 100 to the variance; a deviation of 1 contributes only 1. Squaring amplifies outliers.

Variance and Standard Deviation
Population variance:  σ² = Σ(xᵢ − μ)² / N
Sample variance:       s² = Σ(xᵢ − x̄)² / (n − 1)
Standard deviation:    σ  = √σ²   (for a sample: s = √s²)

Why n − 1 for samples? Bessel's correction: the sample mean x̄ is itself
estimated from the data, which uses up one degree of freedom. The data sit
closer to their own mean x̄ than to the true mean μ, so dividing by n
would systematically underestimate the population variance.
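
Bessel's correction is easy to check by simulation. The sketch below is only an illustration under assumed values (a normal population with σ = 10, samples of size 5, a fixed random seed). It averages both estimators over many samples and compares them with the true variance.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_SIGMA = 10.0                # assumed population standard deviation
TRUE_VARIANCE = TRUE_SIGMA ** 2  # 100
SAMPLE_SIZE = 5
TRIALS = 20_000

divide_by_n = []
divide_by_n_minus_1 = []
for _ in range(TRIALS):
    sample = [random.gauss(0.0, TRUE_SIGMA) for _ in range(SAMPLE_SIZE)]
    x_bar = statistics.fmean(sample)
    sum_sq = sum((x - x_bar) ** 2 for x in sample)
    divide_by_n.append(sum_sq / SAMPLE_SIZE)
    divide_by_n_minus_1.append(sum_sq / (SAMPLE_SIZE - 1))

avg_biased = statistics.fmean(divide_by_n)
avg_unbiased = statistics.fmean(divide_by_n_minus_1)

print(f"true variance:           {TRUE_VARIANCE:.1f}")
print(f"average of Σ/n:          {avg_biased:.1f}")
print(f"average of Σ/(n − 1):    {avg_unbiased:.1f}")
```

With n = 5 the uncorrected estimator should land near (n − 1)/n = 80% of the true variance, while the corrected one should land near 100.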

Taking the square root returns variance to the original unit of measurement. If you measured heights in centimetres, variance is in cm², a unit that is hard to read as a spread of heights. Standard deviation is back in centimetres and directly interpretable.

The 68-95-99.7 rule

For any normally distributed variable, the standard deviation determines the probability of observations falling within certain ranges:

Range      % of observations   Interpretation
μ ± 1σ     68.27%              About two-thirds of all data
μ ± 2σ     95.45%              The "standard" confidence band
μ ± 3σ     99.73%              Events outside this are rare (≈1 in 370)
μ ± 6σ     99.9999998%         The target of Six Sigma manufacturing
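
These percentages can be reproduced from the normal distribution itself. For a normal variable, the probability of landing within k standard deviations of the mean is erf(k/√2). A short sketch using only Python's math module (the helper names are ours, chosen for illustration):

```python
import math

def prob_within(k: float) -> float:
    """P(|X - mu| < k * sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))

def prob_outside(k: float) -> float:
    """P(|X - mu| > k * sigma); erfc avoids precision loss in tiny tails."""
    return math.erfc(k / math.sqrt(2))

for k in (1, 2, 3, 6):
    inside = prob_within(k)
    one_in = 1 / prob_outside(k)
    print(f"within {k} sigma: {inside:.9%}   (about 1 in {one_in:,.0f} outside)")
```

The k = 3 row gives the "1 in 370" figure quoted in the table.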

When standard deviation misleads you

Standard deviation summarizes spread well only when the data are roughly symmetric and free of extreme outliers. When those conditions break down, σ becomes a poor summary. For a distribution with fat tails (financial returns, earthquake magnitudes, internet traffic), a normal model built on σ dramatically understates real risk.

Long-Term Capital Management used standard deviation to model bond portfolio risk. In 1998, a "10σ event" occurred, one that a normal model says should essentially never happen, even over the history of the universe. The fund lost $4.6 billion in under four months and had to be rescued by a consortium of banks in a recapitalization organized by the Federal Reserve Bank of New York.

The lesson is not that standard deviation is wrong — it is that it was applied to a distribution where its assumptions did not hold. Knowing when σ is the right tool requires understanding both what it measures and what it ignores.
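
To see how much the distributional assumption matters, the sketch below compares the chance of a k-σ move under a normal model with the same chance under a Laplace distribution with the same σ. The Laplace distribution is only a convenient heavier-tailed stand-in with a closed-form tail, not a model of LTCM's portfolio. The "one independent draw per day" framing is likewise an assumption made for illustration.

```python
import math

AGE_OF_UNIVERSE_DAYS = 13.8e9 * 365.25  # roughly 13.8 billion years

def normal_tail(k: float) -> float:
    """P(|X - mu| > k * sigma) under a normal distribution."""
    return math.erfc(k / math.sqrt(2))

def laplace_tail(k: float) -> float:
    """Same probability under a Laplace distribution.

    A Laplace with scale b has sigma = b * sqrt(2), so
    P(|X - mu| > k * sigma) = exp(-k * sigma / b) = exp(-k * sqrt(2)).
    """
    return math.exp(-k * math.sqrt(2))

for k in (3, 5, 10):
    p_norm = normal_tail(k)
    p_lap = laplace_tail(k)
    print(f"{k:>2} sigma: normal {p_norm:.2e}, Laplace {p_lap:.2e}")

# Expected wait between 10-sigma days, assuming one independent draw per day.
wait_normal = 1 / normal_tail(10)
wait_laplace = 1 / laplace_tail(10)
print(f"normal wait : {wait_normal:.2e} days "
      f"({wait_normal / AGE_OF_UNIVERSE_DAYS:.1e} ages of the universe)")
print(f"Laplace wait: {wait_laplace:.2e} days")
```

Same σ, same threshold, and the two models disagree by many orders of magnitude about how often the event should occur.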

Coefficient of variation: comparing across scales

Coefficient of Variation
CV = (σ / μ) × 100%

Example: Two manufacturing processes, both targeting a diameter of 10mm.
Process A: σ = 0.1mm  →  CV = 1%
Process B: σ = 0.5mm  →  CV = 5%

Process B has five times the relative variability of Process A.
The CV is unitless, which makes it useful for comparing very different
measurements, provided the mean is well away from zero.
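
The two-process example works out as follows in code. This is a minimal sketch; the function name is ours, and the guard against a zero mean is a design choice, since the ratio is undefined there.

```python
import math

def coefficient_of_variation(sigma: float, mu: float) -> float:
    """Return the CV as a percentage: (sigma / mu) * 100."""
    if mu == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return sigma / mu * 100

TARGET_DIAMETER_MM = 10.0

cv_a = coefficient_of_variation(0.1, TARGET_DIAMETER_MM)  # Process A
cv_b = coefficient_of_variation(0.5, TARGET_DIAMETER_MM)  # Process B

print(f"Process A: CV = {cv_a:.1f}%")
print(f"Process B: CV = {cv_b:.1f}%")
print(f"B is {cv_b / cv_a:.1f}x as variable as A, relative to the mean")
```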

Our Roulette Simulator lets you observe standard deviation live. Track the frequency of a single number over many spins. You will see the count oscillate around the expected value of spins/37 (European wheel). The standard deviation of the count grows only like the square root of the spin count, so the spread shrinks relative to the mean as the spins accumulate. The standard deviation of your observed frequency is doing exactly what the formula predicts.
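
If you want to try the same experiment offline, here is a rough sketch in Python. It is not the simulator's code. It assumes a fair 37-pocket wheel, an arbitrarily chosen tracked number, and a fixed seed, and it compares each simulated count with the binomial mean n·p and standard deviation √(n·p·(1 − p)).

```python
import math
import random

random.seed(1)    # fixed seed so the run is repeatable

POCKETS = 37      # European wheel: pockets 0 through 36
TARGET = 17       # arbitrary number to track
P_HIT = 1 / POCKETS

def theory(spins: int) -> tuple[float, float]:
    """Binomial mean and standard deviation of the hit count."""
    mean = spins * P_HIT
    sd = math.sqrt(spins * P_HIT * (1 - P_HIT))
    return mean, sd

for spins in (370, 3_700, 37_000, 370_000):
    hits = sum(1 for _ in range(spins) if random.randrange(POCKETS) == TARGET)
    mean, sd = theory(spins)
    print(f"{spins:>7} spins: hits = {hits:>5}, expected = {mean:>7.0f}, "
          f"sd = {sd:6.2f}, sd/mean = {sd / mean:.2%}")
```

Each tenfold increase in spins multiplies the standard deviation by about √10 ≈ 3.16 but divides the relative spread sd/mean by the same factor.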