📐 Measures of Dispersion

📊 Descriptive Statistics · ⏱️ 12 min read · 📅 Last updated: 01/14/2025

Introduction

Measures of dispersion complement measures of central tendency by describing how widely the data spread around a central value such as the mean or median. Two samples can share the same mean while one has tightly concentrated values and the other widely scattered ones; measures of dispersion capture this difference.

Why Measure Dispersion?

In lotteries and data analysis, knowing only the mean is not enough. For example, the mean of the drawn numbers can be 25 in two games, yet in the first the numbers range from 1 to 50 (dispersed) and in the second from 20 to 30 (concentrated). Variance, standard deviation, range, and interquartile range quantify this variability.
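The example above can be reproduced with a short sketch. The two draws below are hypothetical, invented only for illustration, and chosen so that both have a mean of 25; only the spread differs.

```python
from statistics import mean

# Hypothetical draws, invented for illustration; both have a mean of 25.
dispersed_draw = [1, 12, 25, 37, 50]
concentrated_draw = [20, 23, 25, 27, 30]

for name, draw in [("dispersed", dispersed_draw), ("concentrated", concentrated_draw)]:
    spread = max(draw) - min(draw)  # range: largest minus smallest value
    print(f"{name}: mean={mean(draw)}, min={min(draw)}, max={max(draw)}, range={spread}")
```

The means are identical, so the mean alone cannot tell the two draws apart; the range (49 versus 10) does.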

Variance

Variance is the mean of the squared deviations from the mean. It measures the "spread" of the data: the higher the variance, the more dispersed the values are. Because the deviations are squared, variance is expressed in squared units of the data.

Formulas

Population: σ² = Σ(xᵢ − μ)² / N

Sample: s² = Σ(xᵢ − x̄)² / (n − 1)

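Here μ is the population mean, x̄ the sample mean, N the population size, and n the sample size. The sample formula divides by (n − 1) rather than n (Bessel's correction) so that s² is an unbiased estimator of the population variance σ².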
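As a minimal sketch, both formulas can be written directly in Python and checked against the standard library's `statistics.pvariance` and `statistics.variance`. The function names `population_variance` and `sample_variance` and the data are assumptions made for illustration.

```python
import statistics

def population_variance(values):
    """σ² = Σ(xᵢ − μ)² / N, treating `values` as the whole population."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """s² = Σ(xᵢ − x̄)² / (n − 1), treating `values` as a sample (needs n ≥ 2)."""
    if len(values) < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Hypothetical data, invented for illustration.
data = [4, 8, 15, 16, 23, 42]

print(population_variance(data), statistics.pvariance(data))
print(sample_variance(data), statistics.variance(data))
```

For the same data the sample variance is always the larger of the two, by a factor of n / (n − 1); the gap shrinks as n grows.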

Standard Deviation

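The standard deviation is the square root of the variance. It is expressed in the same unit as the data (e.g., numbers from 1 to 60), which makes it easier to interpret: roughly, it is the typical distance of a value from the mean. Strictly speaking, it is the root-mean-square deviation, not the arithmetic average of the absolute deviations.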

Population: σ = √σ² | Sample: s = √s²
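The relationship between variance and standard deviation can be sketched with Python's `statistics` module. The data are hypothetical, and the mean absolute deviation is computed only for comparison, to show that it is a different quantity from the standard deviation.

```python
import math
import statistics

# Hypothetical values, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = math.sqrt(statistics.pvariance(data))  # σ = √σ²
sample_sd = math.sqrt(statistics.variance(data))       # s = √s²

# statistics.pstdev and statistics.stdev compute the same quantities directly.
print(population_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))

# For comparison: the plain average of absolute deviations from the mean.
mean_value = statistics.mean(data)
mean_abs_deviation = sum(abs(x - mean_value) for x in data) / len(data)
print(mean_abs_deviation)
```

For this data the population standard deviation is 2.0 while the mean absolute deviation is 1.5: squaring gives larger deviations more weight.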

Range and Interquartile Range

The range is the difference between the largest and smallest values (max − min). It is simple to compute but sensitive to extreme values, since a single outlier changes it. The interquartile range (IQR) is the difference between the third and first quartiles (Q3 − Q1); it is more robust because it covers only the middle 50% of the data and ignores the smallest 25% and largest 25% of values.
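The following sketch illustrates that difference in sensitivity. It assumes the default quartile method of `statistics.quantiles`; quartile conventions differ between tools, so Q1 and Q3 may come out slightly different elsewhere. The helper names and the data are hypothetical.

```python
import statistics

def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)

def interquartile_range(values):
    """IQR = Q3 − Q1, using the default quartile method of statistics.quantiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical data, invented for illustration; the second list replaces
# the largest value with an extreme one.
base = [12, 15, 18, 21, 24, 27, 30, 33]
with_outlier = [12, 15, 18, 21, 24, 27, 30, 60]

print(value_range(base), value_range(with_outlier))
print(interquartile_range(base), interquartile_range(with_outlier))
```

Replacing the largest value with 60 more than doubles the range (21 to 48), while the IQR stays the same because the quartiles here do not depend on the extreme value.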

💡 In practice

In lottery analysis, standard deviation and IQR help show whether a set of draws is unusually "regular" (tightly clustered) or displays the variability one would expect. Combining central tendency and dispersion gives a more complete view of the data.
