📐 Measures of Dispersion
Introduction
Measures of dispersion complement measures of central tendency by describing how widely the data spread around the mean or median. Two samples can share the same mean while one has tightly concentrated values and the other widely spread ones; measures of dispersion capture this difference.
Why Measure Dispersion?
In lotteries and data analysis, knowing only the mean is not enough. For example, the mean of the drawn numbers can be 25 in two games, yet in the first the numbers range from 1 to 50 (dispersed) and in the second from 20 to 30 (concentrated). Variance, standard deviation, range, and interquartile range quantify this variability.
Variance
Variance is the mean of the squared deviations from the mean. It measures the "spread" of the data: the higher the variance, the more dispersed the values are.
Formulas
Population: σ² = Σ(xᵢ − μ)² / N
Sample: s² = Σ(xᵢ − x̄)² / (n − 1)
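Here μ is the population mean, x̄ the sample mean, N the population size, and n the sample size. The sample formula divides by (n − 1) rather than n because the deviations are measured from x̄, which is itself computed from the same data; dividing by n would underestimate the population variance on average, and (n − 1) makes the estimator unbiased.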
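The two formulas can be sketched in a few lines of Python. The list `draw` is made-up illustrative data, not a real lottery result, and the variable names are chosen here for the example only. The standard library functions `statistics.pvariance` and `statistics.variance` are used as a cross-check.

```python
import statistics

# Hypothetical draw of six numbers (illustrative data, not a real result).
draw = [4, 11, 23, 35, 42, 59]

n = len(draw)
mean = sum(draw) / n

# Sum of squared deviations from the mean, shared by both formulas.
squared_deviations = sum((x - mean) ** 2 for x in draw)

population_variance = squared_deviations / n        # divides by N
sample_variance = squared_deviations / (n - 1)      # divides by n - 1

print(population_variance)
print(sample_variance)  # → 418.0

# Cross-check against the standard library.
print(statistics.pvariance(draw), statistics.variance(draw))
```

Because the only difference is the divisor, the sample variance is always slightly larger than the population variance computed from the same values, and the gap shrinks as n grows.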
Standard Deviation
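The standard deviation is the square root of the variance. It is expressed in the same unit as the data (e.g., numbers from 1 to 60), which makes interpretation easier: roughly, "values typically deviate from the mean by about X units". Strictly speaking, it is the root of the mean squared deviation, not the mean of the absolute deviations, so large deviations weigh more heavily.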
Population: σ = √σ² | Sample: s = √s²
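As a small illustration of the relationship between the two quantities, the sketch below takes the square root of each variance and compares it with the standard library's `statistics.pstdev` and `statistics.stdev`. The data are again hypothetical.

```python
import math
import statistics

# Hypothetical draw of six numbers (illustrative data, not a real result).
draw = [4, 11, 23, 35, 42, 59]

# Standard deviation as the square root of the corresponding variance.
population_sd = math.sqrt(statistics.pvariance(draw))
sample_sd = math.sqrt(statistics.variance(draw))

# The library's direct functions should agree up to floating-point rounding.
print(round(population_sd, 3), round(statistics.pstdev(draw), 3))
print(round(sample_sd, 3), round(statistics.stdev(draw), 3))
```

Unlike the variance, which here is in "squared numbers", both results are on the same scale as the drawn numbers themselves.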
Range and Interquartile Range
The range is the difference between the largest and smallest value (max − min). It is simple but sensitive to extreme values. The interquartile range (IQR) is the difference between the third and first quartile (Q3 − Q1) and is more stable, as it ignores the smallest 25% and largest 25% of values.
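The difference in stability can be seen with a minimal sketch. The helper names `value_range` and `iqr` and both data sets are assumptions made for this example. Quartiles are computed with `statistics.quantiles` (Python 3.8+) using its default method; other quartile conventions exist and can give slightly different values.

```python
import statistics

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

def iqr(values):
    """Interquartile range: Q3 - Q1, using the library's default quartile method."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Two hypothetical samples that differ only in their largest value.
concentrated = [20, 22, 24, 25, 26, 28, 30]
with_outlier = [20, 22, 24, 25, 26, 28, 60]

print(value_range(concentrated), iqr(concentrated))  # → 10 6.0
print(value_range(with_outlier), iqr(with_outlier))  # → 40 6.0
```

A single extreme value quadruples the range, while the IQR is unchanged because it depends only on the middle half of the data.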
💡 In practice
In lottery analysis, standard deviation and IQR help show whether draws are unusually "regular" or exhibit the variability one would expect. Combining central tendency and dispersion gives a more complete view of the data.