Non-Parametric Statistics

Introduction

Non-parametric statistics are statistical methods that make few assumptions about the underlying distribution of the data. While parametric methods (such as t-tests and ANOVA) assume that the data follow a specific distribution, usually the normal, non-parametric methods are more flexible and can be applied when the distribution is non-normal or unknown.

What are Non-Parametric Statistics?

Non-parametric statistics are so called because they do not assume that the population follows a distribution described by a fixed set of parameters (for example, a normal distribution defined by its mean and variance). Their main characteristics are listed below.

Main Characteristics

• Do not assume normal distribution: Work with many distribution shapes
• Rank-based: Use the order of the data, not the absolute values
• Less sensitive to outliers: More robust to extreme values
• Work with small samples: Can be used when n is small
• Work with ordinal data: Do not require continuous numeric data
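
To make the rank-based idea concrete, here is a minimal Python sketch. The helper name `average_ranks` and the sample values are illustrative assumptions, not part of the original material. It converts values to ranks and gives tied values the mean of the positions they share. Replacing the extreme value 950 with 19 leaves every rank unchanged, which is why rank-based methods are less sensitive to outliers.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

with_outlier = [12, 15, 15, 18, 950]    # illustrative data with one extreme value
without_outlier = [12, 15, 15, 18, 19]  # same data with the extreme value tamed

print(average_ranks(with_outlier))     # [1.0, 2.5, 2.5, 4.0, 5.0]
print(average_ranks(without_outlier))  # [1.0, 2.5, 2.5, 4.0, 5.0]
```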

When to Use Non-Parametric Methods?

Non-parametric methods are preferable when:

✅ Use When:

• Data are not normally distributed
• Sample is very small (n < 30)
• There are extreme outliers
• Data are ordinal or categorical
• Variances are very different
• Distribution is unknown

⚠️ Consider Parametric When:

• Data are normally distributed
• Sample is large (n ≥ 30)
• You need greater statistical power
• Parametric assumptions are met

Main Non-Parametric Tests

There are various non-parametric tests, each suited for different situations:

Mann-Whitney U Test (Wilcoxon Rank-Sum)

Non-parametric equivalent to the t-test for two independent samples. Compares two independent samples when data do not follow a normal distribution.

When to Use

• Compare two independent samples
• Data are not normally distributed
• Small samples or different sizes
• Test if two samples come from the same distribution
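
As a rough illustration of how the U statistic is built, the sketch below counts, over all pairs, how often a value from the first sample exceeds a value from the second. The function names and the sample sums are hypothetical. The p-value uses the large-sample normal approximation without tie or continuity correction, so treat it as an approximation only.

```python
from math import sqrt
from statistics import NormalDist

def mann_whitney_u(x, y):
    """U statistic for x: pairs where x beats y, with ties counting one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def mann_whitney_p_approx(x, y):
    """Two-sided p-value from the normal approximation (no tie/continuity correction)."""
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (mann_whitney_u(x, y) - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical number sums from two periods (made-up values for illustration)
period_a = [98, 105, 110, 121, 130, 134, 142, 150]
period_b = [101, 108, 115, 119, 127, 138, 145, 155]

print(mann_whitney_u(period_a, period_b))
print(mann_whitney_p_approx(period_a, period_b))
```

For small samples or data with many ties, an exact or tie-corrected p-value is preferable; SciPy's `scipy.stats.mannwhitneyu` is one commonly used implementation.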

Wilcoxon Signed-Rank Test

Non-parametric equivalent to the paired t-test. Used to compare two related samples (before and after measurements, for example).

When to Use

• Compare two dependent/paired samples
• Data are not normally distributed
• Test differences in paired data
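
The sketch below shows one way the signed-rank sums can be computed by hand, assuming the usual convention of dropping zero differences and averaging tied ranks. The function name and the before/after values are illustrative assumptions.

```python
def signed_rank_sums(before, after):
    """Return (W_plus, W_minus): rank sums of the positive and negative differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # zero differences are dropped
    abs_diffs = [abs(d) for d in diffs]
    # Rank the absolute differences, averaging ranks for ties
    ranks = [
        sum(v < x for v in abs_diffs) + (sum(v == x for v in abs_diffs) + 1) / 2
        for x in abs_diffs
    ]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

before = [10, 12, 14, 16, 18]  # made-up paired measurements
after = [11, 14, 13, 20, 23]

print(signed_rank_sums(before, after))  # (13.5, 1.5)
```

The smaller of the two sums is commonly used as the test statistic. A p-value requires the signed-rank distribution or a normal approximation, which `scipy.stats.wilcoxon` provides.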

Kruskal-Wallis Test

Non-parametric equivalent to one-way ANOVA. Used to compare three or more independent groups when data do not follow a normal distribution.

When to Use

• Compare three or more independent groups
• Alternative to ANOVA when assumptions are not met
• Data are not normally distributed
• Test if at least one group differs from the others
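
A minimal sketch of the Kruskal-Wallis H statistic follows, assuming no tie correction. All values are pooled and ranked, and the rank sums of the groups are compared. The function name and the data are hypothetical. For exactly three groups, H is compared against a chi-square distribution with 2 degrees of freedom, whose upper tail is exp(-H/2). That approximation is rough for groups this small.

```python
from math import exp

def kruskal_wallis_h(*groups):
    """H statistic without tie correction, computed from pooled ranks."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    # Average rank of each distinct value within the pooled data
    rank_of = {
        x: sum(v < x for v in pooled) + (sum(v == x for v in pooled) + 1) / 2
        for x in set(pooled)
    }
    weighted = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

group_a = [1, 2, 3]  # made-up, fully separated groups
group_b = [4, 5, 6]
group_c = [7, 8, 9]

h = kruskal_wallis_h(group_a, group_b, group_c)
p_approx = exp(-h / 2)  # valid only for three groups (chi-square, 2 degrees of freedom)
print(round(h, 3), round(p_approx, 4))  # 7.2 0.0273
```

For other numbers of groups, or when ties are frequent, a library routine such as `scipy.stats.kruskal` is the safer choice.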

Friedman Test

Non-parametric equivalent to repeated measures ANOVA. Used to compare three or more related groups when data do not follow a normal distribution.
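
The Friedman statistic can be sketched in the same spirit. Values are ranked within each subject (row), and the rank sums of the conditions (columns) are compared. The function name and the scores are illustrative assumptions, and the formula below omits the tie correction.

```python
def friedman_statistic(rows):
    """Friedman chi-square statistic (no tie correction); each row is one subject or block."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, x in enumerate(row):
            # Rank of x within its own row, averaging ties
            rank_sums[j] += sum(v < x for v in row) + (sum(v == x for v in row) + 1) / 2
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

# Four subjects measured under three conditions (made-up values)
scores = [
    [1, 2, 3],
    [1, 3, 2],
    [1, 2, 3],
    [2, 1, 3],
]

print(friedman_statistic(scores))  # 4.5
```

The statistic is usually compared against a chi-square distribution with k - 1 degrees of freedom; `scipy.stats.friedmanchisquare` handles that step.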

Spearman Correlation

Non-parametric method to measure correlation between two variables. Based on data ranks, detects monotonic (not just linear) relationships.

Advantages over Pearson

• Does not assume normal distribution
• Less sensitive to outliers
• Detects monotonic non-linear relationships
• Works with ordinal data
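
Spearman correlation can be viewed as Pearson correlation applied to ranks. The sketch below (helper names and data are illustrative assumptions) uses a cubic relationship, which is perfectly monotonic but not linear, to show the difference between the two coefficients.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but non-linear

print(round(pearson(x, y), 3))   # 0.943
print(round(spearman(x, y), 3))  # 1.0
```

In practice, `scipy.stats.spearmanr` returns both the coefficient and a p-value.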

Comparison: Parametric vs Non-Parametric

Let's compare the main methods:

Comparative Table

Parametric Test	Non-Parametric Test
t-test (2 samples)	Mann-Whitney U
Paired t-test	Wilcoxon Signed-Rank
ANOVA (1 factor)	Kruskal-Wallis
Repeated measures ANOVA	Friedman
Pearson correlation	Spearman correlation

Advantages and Disadvantages

✅ Advantages

• Do not require normal distribution
• More robust to outliers
• Work with small samples
• Can be used with ordinal data
• Fewer statistical assumptions

⚠️ Disadvantages

• Generally have less statistical power
• May discard information from the data (ranks ignore the size of the differences)
• Interpretation may be less intuitive
• Confidence intervals may be less precise

Applications in Lottery Analysis

Non-parametric methods are useful in lottery data analysis in situations such as:

Compare Periods

Use Mann-Whitney U to compare number sums between different periods without assuming normality.

Correlation Analysis

Use Spearman correlation to identify relationships between variables when data are not normally distributed.

Compare Multiple Lotteries

Use Kruskal-Wallis to compare patterns between different lottery types when ANOVA assumptions are not met.

Randomness Validation

Non-parametric tests may be more appropriate when the data distribution is unknown or not normal.

Interpreting Results

The interpretation of non-parametric tests is similar to parametric ones:

Important Points

• p-value: Same interpretation as parametric tests (p < 0.05 = significant)
• Null hypothesis: Generally tests if groups come from the same distribution (not if means are equal)
• Effect size: May be less intuitive than in parametric tests
• Confidence intervals: May be less precise than parametric methods
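
One effect size that is fairly easy to read for two independent groups is the probability of superiority: the share of cross-group pairs in which a value from the first group is larger. The sketch below also derives the rank-biserial correlation from it. Function names and data are hypothetical; this is an illustration of one option, not the only way to report effect size.

```python
def probability_of_superiority(x, y):
    """Share of (x, y) pairs in which x is larger; ties count one half."""
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return wins / (len(x) * len(y))

def rank_biserial(x, y):
    """Rank-biserial correlation, from -1 (y always larger) to +1 (x always larger)."""
    return 2 * probability_of_superiority(x, y) - 1

group_x = [3, 5, 7, 9]  # made-up values
group_y = [2, 4, 6, 8]

print(probability_of_superiority(group_x, group_y))  # 0.625
print(rank_biserial(group_x, group_y))               # 0.25
```

A value of 0.5 for the probability of superiority (0 for the rank-biserial correlation) indicates no tendency for either group to produce larger values.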

Conclusions

Non-parametric statistics are valuable tools when parametric method assumptions are not met. They offer greater flexibility and robustness, especially with non-normal data, small samples, or in the presence of outliers.

In lottery data analysis, non-parametric methods can be especially useful when we are uncertain about the data distribution or when we want more robust analyses that are not affected by extreme values or assumption violations.

💡 Recommendation

When in doubt between parametric and non-parametric methods, start by testing the parametric method assumptions (normality, homogeneity of variances). If assumptions are not met, use non-parametric methods. In many cases, you can apply both and compare results for greater confidence.
