Summary of Statistical Tests
by Philip Guo (philip@pgbovine.net)
As I attempt to learn statistics to assist my research, I've found it useful to document the various tests that I've learned. My goal in creating this page is to provide a quick summary of what each test can be used for and when it can be applied. I'm not a statistics expert by any means, so please email me if there are errors.
Testing whether the population mean is equal to some value
You collect a simple random sample from a population of ratio-level or interval-level values.
Any population distribution (more realistic scenario)
You must have a large sample size (N >= 30) to proceed. The sample mean is then approximately normally distributed (by the central limit theorem), so you can use the z-test to get an approximate solution.
Normally-distributed population
- Population standard deviation unknown
  - Large sample size (N >= 30)
    - z-test (approximate but easier to calculate)
    - t-test (exact but might be harder to calculate using tables)
  - Small sample size (N < 30)
    - t-test (exact)
- Population standard deviation known (unlikely)
  - z-test (exact) - this is the ideal case but unlikely to actually occur
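As a rough illustration of the z-test cases above, here is a minimal Python sketch that uses only the standard library. The function name one_sample_z_test and the sample data are made up for this example. When sigma is not supplied, the sample standard deviation stands in for it, which is only a reasonable approximation for large samples.

```python
import math
from statistics import NormalDist, mean, stdev

def one_sample_z_test(sample, mu0, sigma=None):
    """Return (z, two-sided p-value) for H0: population mean == mu0.

    sigma is the population standard deviation, if known. If it is
    omitted, the sample standard deviation is used instead (an
    approximation that is only sensible for N >= 30).
    """
    n = len(sample)
    sd = sigma if sigma is not None else stdev(sample)
    z = (mean(sample) - mu0) / (sd / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical data: 36 measurements, testing H0: mean == 50.
sample = [52, 48, 55, 51, 49, 53] * 6
z, p = one_sample_z_test(sample, mu0=50)
print(f"z = {z:.3f}, p = {p:.4f}")
```

For small samples from a normal population with unknown standard deviation, the same statistic would be compared against a t distribution with N - 1 degrees of freedom instead of the normal distribution.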
Testing whether the means of two populations are identical, given two sets of independent samples
You collect two sets of independent simple random samples from populations of ratio-level or interval-level values (e.g., each subject is randomly assigned to receive one of two experimental treatments).
Any population distribution (more realistic scenario)
- Large sample size (N >= 30)
  - two-sample z-test
  - Mann-Whitney U test a.k.a. Wilcoxon rank-sum test (non-parametric, and will even work on ordinal-level measurements) - tests whether two samples are drawn from the same population (and, by implication, their distributions and means are equal)
- Small sample size (N < 30)
  - Mann-Whitney U test a.k.a. Wilcoxon rank-sum test
Normally-distributed population
- Population standard deviations unknown but assumed to be equal (called homogeneity of variances)
  - Large sample size (N1 >= 30, N2 >= 30)
    - two-sample z-test
  - Small sample size (N1 < 30, N2 < 30)
    - two-sample t-test
- Population standard deviations known (unlikely)
  - two-sample z-test
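To make the Mann-Whitney U test mentioned above a bit more concrete, here is a minimal Python sketch (standard library only). The function names and the data are hypothetical. The p-value uses the large-sample normal approximation with no tie or continuity correction, so for small samples you would look up U in an exact table instead.

```python
import math
from statistics import NormalDist

def average_ranks(values):
    """Rank values from 1..N; tied values share the mean of their ranks."""
    return [sum(w < v for w in values) + (sum(w == v for w in values) + 1) / 2
            for v in values]

def mann_whitney_u(x, y):
    """Return (U, approximate two-sided p-value) for two independent samples."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2   # number of (x, y) pairs with x > y
    u = min(u1, n1 * n2 - u1)
    # Normal approximation to the null distribution of U (assumption:
    # samples are large and there are few ties).
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = 2 * NormalDist().cdf((u - mu) / sigma)
    return u, min(1.0, p)

# Hypothetical measurements from two treatment groups.
group_a = [12.1, 14.3, 11.8, 15.0, 13.2, 12.7]
group_b = [15.4, 16.1, 14.9, 17.3, 15.8, 16.6]
u, p = mann_whitney_u(group_a, group_b)
print(f"U = {u}, approximate p = {p:.4f}")
```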
Testing whether one population has values that are consistently greater than or less than those of the other population, given two sets of paired samples
You collect a set of paired samples from two populations of ratio-level or interval-level values (e.g., each subject is given both experimental treatments).
Any population distribution (more realistic scenario)
The following non-parametric tests can even be used for ordinal-level values.
- Wilcoxon signed-rank test - tests whether the median difference between pairs of observations is zero
- Sign test - tests whether there are equal numbers of pairs of observations that exhibit increases and decreases in value
The population of differences between pairs is normally distributed
That's right, you read that correctly! The following test assumes that the differences between pairs in the population are normally distributed, which might make it difficult to apply.
- paired t-test - tests whether the mean difference between pairs of observations is zero
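The sign test is simple enough to sketch in a few lines of Python. This is only an illustration under my own assumptions: the function name sign_test and the data are hypothetical, pairs with no change are dropped, and the p-value is the exact two-sided binomial probability with success probability 0.5.

```python
from math import comb

def sign_test(before, after):
    """Return (increases, n, two-sided p-value) for paired observations.

    Pairs whose values are equal carry no sign and are dropped.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    increases = sum(d > 0 for d in diffs)
    k = min(increases, n - increases)
    # Under H0, the number of increases follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return increases, n, min(1.0, 2 * tail)

# Hypothetical scores for 8 subjects under treatments A and B.
treatment_a = [70, 65, 80, 72, 68, 75, 77, 69]
treatment_b = [74, 70, 85, 71, 73, 79, 80, 72]
increases, n, p = sign_test(treatment_a, treatment_b)
print(f"{increases} of {n} pairs increased, p = {p:.4f}")
```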
Testing whether the means of more than two populations are identical
You collect the same sets of ratio-level or interval-level measurements from several different groups (e.g., people's heights) and want to determine whether the means of all of the groups' respective populations are identical.
Any population distribution (more realistic scenario)
The following non-parametric test can even be used for ordinal-level values, but it assumes that the observations in each group come from distributions with the same shape (it does not require normality, though):
- Kruskal-Wallis test - tests whether the mean ranks of samples are identical (not exactly the same as testing whether the means are identical)
Normally-distributed and homoscedastic population
Homoscedastic means that the within-group variances for all groups are identical (e.g., the variance in heights is the same within each group of people).
- One-way ANOVA - tests whether the means of all populations are identical
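Here is a minimal sketch of how the one-way ANOVA F statistic is computed, assuming each group is given as a plain list of numbers. The function name and the data are hypothetical. It only returns F and its degrees of freedom; turning F into a p-value requires the F distribution, which is not in Python's standard library, so you would use a table or a statistics package for that step.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    k = len(groups)
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical heights (in cm) from three groups of people.
heights = [[170, 172, 168, 175], [165, 169, 171, 166], [178, 174, 180, 176]]
f, df1, df2 = one_way_anova_f(heights)
print(f"F({df1}, {df2}) = {f:.3f}")
```

A large F means the group means differ by more than the within-group scatter would explain.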
Testing whether two variables are correlated
You collect pairs of ratio-level or interval-level measurements from a population, where each of the two elements in each pair measures a different property (e.g., height and weight).
- Pearson correlation test - tests the degree of linear correlation
- Spearman rank correlation test (non-parametric, and will even work on ordinal-level measurements) - tests the degree of monotonic (not necessarily linear) correlation
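The difference between the two coefficients is easiest to see on data that is monotonic but not linear. The sketch below is my own illustration (hypothetical function names and data, standard library only): Spearman's coefficient is computed as Pearson's coefficient applied to the ranks of the values.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1..N; tied values share the mean of their ranks."""
    return [sum(w < v for w in values) + (sum(w == v for w in values) + 1) / 2
            for v in values]

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y grows with x, but as a cube rather than a line.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Here the rank correlation is perfect because y always increases with x, while the linear correlation falls short of 1.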
Testing whether the observed frequencies of categorical (nominal) variables deviate significantly from their expected frequencies
The following are goodness-of-fit tests on counts (frequencies) of observations of categorical (a.k.a. nominal) variables.
These tests can be computationally expensive, so they are not recommended for N > 1000:
- Exact binomial test - can only be used to test the frequencies of two categorical values such as male vs. female (use exact multinomial test for > 2 values)
- Randomization test - should give the same result as the exact test if run enough times, but is intuitively easier to explain
These tests require that the expected counts in each category not be too small (a common rule of thumb is that the smallest expected count should be at least 5):
- Pearson's chi-square test - can be used to test the frequencies of two or more categorical values
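To show how the exact binomial test and the randomization test relate, here is a minimal Python sketch using only the standard library. The function names, the trial count, and the seed are my own choices for illustration. Both functions are two-sided; for p0 = 0.5 their definitions of "at least as extreme" coincide, but for other values of p0 they can differ slightly.

```python
import random
from math import comb

def exact_binomial_test(successes, n, p0=0.5):
    """Two-sided exact p-value: the total probability of all outcomes
    that are no more likely than the observed one under H0."""
    pmf = [comb(n, k) * p0 ** k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # The small tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))

def randomization_test(successes, n, p0=0.5, trials=20000, seed=0):
    """Estimate the same p-value by simulating the null hypothesis."""
    rng = random.Random(seed)
    expected = n * p0
    observed_dev = abs(successes - expected)
    extreme = 0
    for _ in range(trials):
        simulated = sum(rng.random() < p0 for _ in range(n))
        if abs(simulated - expected) >= observed_dev:
            extreme += 1
    return extreme / trials

# Hypothetical observation: 7 of 10 subjects fall in the first category.
print(exact_binomial_test(7, 10))
print(randomization_test(7, 10))
```

The simulated value should land close to the exact one, and closer still as trials grows, which is also why both become expensive for large N.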
Testing whether the proportions in two different groups are identical
You have two categorical variables, each of which has two or more possible values.
- Chi-square test of independence - doesn't work well if the smallest expected count is too small, say less than 5
- Fisher's exact test of independence
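As a small illustration of these two tests, the Python sketch below computes the chi-square statistic for a contingency table (along with the smallest expected count, so you can check the rule of thumb) and Fisher's exact p-value for the 2x2 case. The function names and the table are hypothetical. The chi-square function stops at the statistic and degrees of freedom, since converting them to a p-value needs a chi-square table or a statistics package.

```python
from math import comb

def chi_square_independence(table):
    """Return (statistic, degrees of freedom, smallest expected count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((obs - exp) ** 2 / exp
               for row, exp_row in zip(table, expected)
               for obs, exp in zip(row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, min(min(row) for row in expected)

def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell,
        # with all row and column totals held fixed.
        return comb(r1, k) * comb(r2, c1 - k) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return min(1.0, sum(prob(k) for k in range(lo, hi + 1)
                        if prob(k) <= p_obs * (1 + 1e-9)))

# Hypothetical 2x2 table: rows are groups, columns are outcomes.
table = [[3, 1], [1, 3]]
print(chi_square_independence(table))
print(fisher_exact_2x2(table))
```

In this table every expected count is 2, which is below the rule-of-thumb threshold, so Fisher's exact test would be the safer choice.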
Sources
- The Cartoon Guide to Statistics by Larry Gonick and Woollcott Smith
- Schaum's Outline of Elements of Statistics II: Inferential Statistics by Stephen Bernstein and Ruth Bernstein
- HyperStat Online
- Handbook of Biological Statistics
- Choosing a statistical test - (way better than my lame attempt here!)
Last modified: 2008-03-30
