📋 Non-Parametric Statistics
Introduction
Non-parametric statistics are statistical methods that make few assumptions about the underlying distribution of the data. While common parametric methods (such as t-tests and ANOVA) assume that the data are approximately normally distributed, non-parametric methods are more flexible and can be applied when the distribution is non-normal or unknown.
What are Non-Parametric Statistics?
Non-parametric statistics are so called because they do not assume that the population follows a distribution described by a fixed set of parameters (such as the mean and variance of a normal distribution).
Main Characteristics
- • Do not assume normal distribution: Work with any distribution
- • Rank-based: Use the order of data, not absolute values
- • Less sensitive to outliers: More robust to extreme values
- • Work with small samples: Can be used when n is small
- • Work with ordinal data: Do not require continuous numeric data
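To make the rank-based idea concrete, here is a minimal sketch of how values can be turned into ranks, with tied values sharing the average of their positions. The helper name `average_ranks` and the sample numbers are illustrative assumptions, not part of any library; in practice a function such as SciPy's `scipy.stats.rankdata` does the same job.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # 1-based position of the first occurrence
        ties = ordered.count(v)        # how many times the value appears
        ranks.append(first + (ties - 1) / 2)
    return ranks


# Hypothetical data: 400 is an extreme outlier
with_outlier = [12, 15, 15, 18, 400]
without_outlier = [12, 15, 15, 18, 19]

print(average_ranks(with_outlier))     # [1.0, 2.5, 2.5, 4.0, 5.0]
print(average_ranks(without_outlier))  # [1.0, 2.5, 2.5, 4.0, 5.0]
```

Both lists produce the same ranks, which is why rank-based methods are less sensitive to how extreme the largest value is.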
When to Use Non-Parametric Methods?
Non-parametric methods are preferable when:
✅ Use When:
- • Data are not normally distributed
- • Sample is small (a common rule of thumb is n < 30) and normality cannot be verified
- • There are extreme outliers
- • Data are ordinal (ranked categories or scores)
- • Variances are very different (rank tests then compare whole distributions, not only central tendency)
- • Distribution is unknown
⚠️ Consider Parametric When:
- • Data are normally distributed
- • Sample is large (n ≥ 30)
- • You need greater statistical power
- • Parametric assumptions are met
Main Non-Parametric Tests
There are various non-parametric tests, each suited for different situations:
Mann-Whitney U Test (Wilcoxon Rank-Sum Test)
Non-parametric counterpart of the independent two-sample t-test. Compares two independent samples using ranks, so it does not require normally distributed data.
When to Use
- • Compare two independent samples
- • Data are not normally distributed
- • Samples are small or of unequal size
- • Test if two samples come from the same distribution
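As a rough illustration of what the test computes, the sketch below counts, over all pairs, how often a value from one sample exceeds a value from the other (the U statistic) and then applies a normal approximation for the p-value. The function names and the data are hypothetical, and the approximation omits the tie and continuity corrections, so it is only a sketch; a library routine such as SciPy's `scipy.stats.mannwhitneyu` is the safer choice for real analyses.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Return (U for x, U for y) by direct pair counting; ties count 0.5."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


def approx_two_sided_p(u, n1, n2):
    """Normal approximation without tie or continuity correction (assumption)."""
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical samples of different sizes
x = [1.1, 2.3, 2.9, 3.4, 4.0]
y = [3.8, 4.5, 5.1, 6.2, 7.7, 8.0]

u_x, u_y = mann_whitney_u(x, y)
print(u_x, u_y)  # 1.0 29.0
print(approx_two_sided_p(u_x, len(x), len(y)))
```

A U value close to 0 or to n1 × n2 means that one sample sits almost entirely below the other; values near n1 × n2 / 2 are what the null hypothesis predicts.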
Wilcoxon Signed-Rank Test
Non-parametric counterpart of the paired t-test. Used to compare two related samples (for example, before and after measurements on the same subjects).
When to Use
- • Compare two dependent/paired samples
- • Data are not normally distributed
- • Test differences in paired data
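The mechanics of the signed-rank statistic can be sketched as follows: take the paired differences, drop zeros, rank the absolute differences, and sum the ranks separately for positive and negative differences. The function names and the before/after values are hypothetical; for p-values, a library routine such as SciPy's `scipy.stats.wilcoxon` would normally be used.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def wilcoxon_signed_rank(before, after):
    """Return (W+, W-): rank sums of the positive and negative differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Hypothetical paired measurements
before = [72, 68, 75, 80, 66, 71]
after = [70, 65, 76, 74, 66, 67]

w_plus, w_minus = wilcoxon_signed_rank(before, after)
print(w_plus, w_minus)  # 1.0 14.0
```

The test statistic is usually taken as the smaller of the two sums; a very small value suggests that the differences lean consistently in one direction.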
Kruskal-Wallis Test
Non-parametric equivalent to one-way ANOVA. Used to compare three or more independent groups when data do not follow a normal distribution.
When to Use
- • Compare three or more independent groups
- • Alternative to ANOVA when assumptions are not met
- • Data are not normally distributed
- • Test if at least one group differs from the others
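The H statistic behind the test can be sketched in a few lines: pool all observations, rank them, and compare each group's rank sum with what equal distributions would produce. The names and data below are hypothetical, and no tie correction is applied. The p-value helper relies on the fact that, for exactly three groups (2 degrees of freedom), the chi-square tail probability is exp(-H/2); for other group counts a routine such as SciPy's `scipy.stats.kruskal` would be needed.

```python
from math import exp


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def kruskal_wallis_h(*groups):
    """H statistic without tie correction (assumption: few or no ties)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # ranks belonging to this group
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


def p_value_three_groups(h):
    """Chi-square tail with 2 degrees of freedom; valid for three groups only."""
    return exp(-h / 2)


# Hypothetical measurements for three independent groups
group_a = [2.1, 3.4, 1.9, 4.0]
group_b = [5.2, 6.1, 4.8, 5.9]
group_c = [7.3, 8.0, 6.6, 9.1]

h = kruskal_wallis_h(group_a, group_b, group_c)
print(round(h, 3), p_value_three_groups(h))
```

A significant result only indicates that at least one group differs; identifying which one requires follow-up pairwise comparisons.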
Friedman Test
Non-parametric equivalent to repeated measures ANOVA. Used to compare three or more related groups when data do not follow a normal distribution.
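A minimal sketch of the Friedman statistic: rank the conditions within each subject (each row), sum the ranks per condition, and measure how far those sums are from what equal treatment effects would give. The function names and the data are hypothetical; SciPy's `scipy.stats.friedmanchisquare` provides a full implementation with p-values.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def friedman_statistic(rows):
    """rows: one list per subject, one value per condition (same length each)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        # Rank the k conditions within this subject only
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12 / (n * k * (k + 1)) * sum(s ** 2 for s in rank_sums) - 3 * n * (k + 1)


# Hypothetical scores: 4 subjects measured under 3 conditions
scores = [
    [10, 12, 15],
    [9, 11, 14],
    [11, 14, 13],
    [12, 10, 16],
]

print(friedman_statistic(scores))  # 4.5
```

Because ranking happens within each subject, differences between subjects (some scoring high on everything) do not affect the statistic.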
Spearman Correlation
Non-parametric method for measuring the correlation between two variables. Because it is based on the ranks of the data, it detects monotonic (not just linear) relationships.
Advantages over Pearson
- • Does not assume normal distribution
- • Less sensitive to outliers
- • Detects monotonic non-linear relationships
- • Works with ordinal data
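One way to see the difference from Pearson is that Spearman's coefficient is simply the Pearson correlation computed on ranks. The sketch below uses hypothetical helper names and a made-up cubic relationship that is perfectly monotonic but not linear; SciPy's `scipy.stats.spearmanr` offers the same calculation together with a p-value.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson applied to the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(round(pearson(x, y), 3))   # 0.943
print(round(spearman(x, y), 3))  # 1.0
```

Spearman reports a perfect association because the ordering of y matches the ordering of x exactly, while Pearson is pulled below 1 by the curvature.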
Comparison: Parametric vs Non-Parametric
Let's compare the main methods:
Comparative Table
| Parametric Test | Non-Parametric Test |
|---|---|
| t-test (2 samples) | Mann-Whitney U |
| Paired t-test | Wilcoxon Signed-Rank |
| ANOVA (1 factor) | Kruskal-Wallis |
| Repeated measures ANOVA | Friedman |
| Pearson correlation | Spearman correlation |
Advantages and Disadvantages
✅ Advantages
- • Do not require normal distribution
- • More robust to outliers
- • Work with small samples
- • Can be used with ordinal data
- • Fewer statistical assumptions
⚠️ Disadvantages
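- • Generally have less statistical power than parametric tests when the parametric assumptions hold
- • May discard information, since ranks ignore the size of the differences between values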
- • Interpretation may be less intuitive
- • Confidence intervals may be less precise
Applications in Lottery Analysis
Non-parametric methods are useful in several lottery data analysis tasks:
Compare Periods
Use Mann-Whitney U to compare the sums of the drawn numbers between different periods without assuming normality.
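The sketch below shows what such a comparison could look like on simulated data. Everything here is an assumption made for illustration: the 6-of-49 format, the 60 draws per period, the fixed seed, and the variable names. The p-value uses a normal approximation without tie correction, so with real draw histories a library routine such as SciPy's `scipy.stats.mannwhitneyu` would be preferable.

```python
import random
from math import sqrt
from statistics import NormalDist

rng = random.Random(42)  # fixed seed so the simulation is repeatable


def simulate_draw_sums(n_draws):
    """Simulated (not real) draws: 6 distinct numbers from 1..49, summed per draw."""
    return [sum(rng.sample(range(1, 50), 6)) for _ in range(n_draws)]


period_a = simulate_draw_sums(60)
period_b = simulate_draw_sums(60)

# U statistic for period A by pair counting; ties count 0.5
u_a = sum(1.0 if a > b else 0.5 if a == b else 0.0
          for a in period_a for b in period_b)

n1, n2 = len(period_a), len(period_b)
z = (u_a - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"U = {u_a}, approximate p-value = {p_value:.3f}")
```

Since both periods come from the same random generator, a large p-value is the expected outcome in most runs; a small one with real data would only suggest that the distributions of sums differ, not explain why.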
Correlation Analysis
Use Spearman correlation to identify relationships between variables when data are not normally distributed.
Compare Multiple Lotteries
Use Kruskal-Wallis to compare patterns between different lottery types when ANOVA assumptions are not met.
Randomness Validation
When validating randomness, non-parametric tests may be more appropriate because the data distribution is often unknown or not normal.
Interpreting Results
The interpretation of non-parametric tests is similar to parametric ones:
Important Points
- • p-value: Same interpretation as in parametric tests (p < 0.05 is conventionally treated as significant)
- • Null hypothesis: Generally tests whether groups come from the same distribution (not whether means are equal)
- • Effect size: May be less intuitive than in parametric tests
- • Confidence intervals: May be less precise than with parametric methods
Conclusions
Non-parametric statistics are valuable tools when the assumptions of parametric methods are not met. They offer greater flexibility and robustness, especially with non-normal data, small samples, or in the presence of outliers.
In lottery data analysis, non-parametric methods can be especially useful when we are uncertain about the data distribution or when we want more robust analyses that are not affected by extreme values or assumption violations.
💡 Recommendation
When in doubt between parametric and non-parametric methods, start by checking the assumptions of the parametric method (normality, homogeneity of variances). If the assumptions are not met, use non-parametric methods. In many cases, you can apply both and compare the results for greater confidence.