If the assumption of normality is violated, or outliers are present, then the one-sample t test may not be the most powerful test available, and this could mean the difference between detecting a true difference or not. A nonparametric test or employing a transformation may result in a more powerful test. For example, if the population distribution is not symmetric, a transformation may produce symmetry.
Often, the effect of an assumption violation on the one-sample t test result depends on the extent of the violation (such as how skewed the distribution of the population is). Some small violations may have little practical effect on the analysis, while other violations may render the one-sample t test result uselessly incorrect or uninterpretable. In particular, a small sample size may increase vulnerability to assumption violations.
The one-sample t statistic is based on the sample mean and the sample variance of the sample values, both of which are sensitive to outliers. (In other words, neither the sample mean nor the sample variance is resistant to outliers, and thus, neither is the t statistic.) In particular, a large outlier can inflate the sample variance, decreasing the t statistic and thus perhaps eliminating a significant difference. A nonparametric test may be a more powerful test in such a situation. If you find outliers in your data that are not due to correctable errors, you may wish to consult a statistician as to how to proceed.
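The effect of a single large outlier on the t statistic can be illustrated with a small sketch. The measurements and the hypothesized mean of 5.0 below are made up for illustration, and the helper name t_statistic is an assumption rather than part of any particular package.

```python
import math

def t_statistic(sample, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance with the usual n - 1 denominator.
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu0) / math.sqrt(variance / n)

# Hypothetical measurements; the hypothesized mean is 5.0.
clean = [5.1, 5.3, 4.9, 5.2, 5.4, 5.0, 5.2, 5.3]
with_outlier = clean + [9.0]  # one large outlier added

t_clean = t_statistic(clean, 5.0)
t_outlier = t_statistic(with_outlier, 5.0)
print(f"without outlier: t = {t_clean:.2f}")
print(f"with outlier:    t = {t_outlier:.2f}")
```

With these made-up values the outlier pulls the sample mean further from 5.0, yet the t statistic drops from about 2.97 to about 1.40, because the inflated sample variance more than offsets the shift in the mean.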
Whether or not the population is skewed can be assessed either informally (including graphically), or by examining the sample skewness statistic or conducting a test for skewness.
If outliers or skewness are present, employing a transformation may resolve both problems at once, and also promote normality. In this case, it may be preferable to perform a one-sample t test on the transformed data.
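As a rough sketch of how a transformation can reduce skewness, the example below applies a natural log to a made-up right-skewed sample and compares a simple moment-based skewness before and after. The data, the hypothesized value of 4.0, and the helper names are assumptions for illustration; the log is only one of several possible transformations and requires strictly positive data.

```python
import math

def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

def t_statistic(values, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return (mean - mu0) / math.sqrt(variance / n)

# Hypothetical right-skewed, strictly positive measurements.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]
logged = [math.log(x) for x in raw]

skew_raw = sample_skewness(raw)
skew_logged = sample_skewness(logged)

# The hypothesized value must be transformed as well. Note that the test on
# the log scale concerns the mean of the logs, not the mean of the raw data.
t_log = t_statistic(logged, math.log(4.0))

print(f"skewness before: {skew_raw:.2f}, after log: {skew_logged:.2f}")
print(f"t statistic on the log scale: {t_log:.2f}")
```

A t test on transformed data answers a question about the transformed scale, so the interpretation of the result should be stated in those terms.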
The usual measurement for skewness is not resistant to outliers, so one should consider the possibility that apparent skewness is in fact due to one or more outliers. A lack of power due to small sample sizes may also make it hard to detect skewness.
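The sensitivity of the skewness statistic to outliers can be seen in a minimal sketch. The data are made up, and the moment-based formula used here is one common definition of sample skewness, assumed for illustration.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# A perfectly symmetric hypothetical sample, then the same sample
# with a single large value appended.
symmetric = [2, 3, 4, 5, 6, 7, 8]
with_outlier = symmetric + [25]

skew_symmetric = sample_skewness(symmetric)
skew_outlier = sample_skewness(with_outlier)
print(f"symmetric sample: {skew_symmetric:.2f}")
print(f"with one outlier: {skew_outlier:.2f}")
```

Here one added value moves the statistic from zero to nearly 2, so a large skewness value alone does not establish that the population itself is skewed.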
For data sampled from a normal distribution, normal probability plots should approximate straight lines, and boxplots should be symmetric (median and mean together, in the middle of the box) with no outliers. If the sample size is not too small, then the t statistic will not be much affected even if the population distribution is skewed, although skewness will increase the chance that an incorrectly small P value will be reported (i.e., that the null hypothesis will be rejected when it is in fact true).
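One way to get a feel for how skewness affects the false-rejection rate is a small Monte Carlo sketch. Everything below is an assumption for illustration: the exponential distribution as the skewed population, the sample size of 10, the number of simulations, and the helper names. The critical value 2.262 is the standard two-sided 5% point of the t distribution with 9 degrees of freedom.

```python
import math
import random

def t_statistic(sample, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu0) / math.sqrt(variance / n)

def rejection_rate(draw, true_mean, n, t_crit, n_sims, rng):
    """Fraction of simulated samples with |t| > t_crit when the null is true."""
    rejections = 0
    for _ in range(n_sims):
        sample = [draw(rng) for _ in range(n)]
        if abs(t_statistic(sample, true_mean)) > t_crit:
            rejections += 1
    return rejections / n_sims

rng = random.Random(0)  # fixed seed so the run is repeatable
T_CRIT_DF9 = 2.262      # two-sided 5% critical value, 9 degrees of freedom

# Null hypothesis is true in both cases: the hypothesized mean equals
# the population mean (0 for the normal, 1 for the exponential).
normal_rate = rejection_rate(lambda r: r.gauss(0.0, 1.0), 0.0, 10,
                             T_CRIT_DF9, 10000, rng)
skewed_rate = rejection_rate(lambda r: r.expovariate(1.0), 1.0, 10,
                             T_CRIT_DF9, 10000, rng)

print(f"normal population:      {normal_rate:.3f}")
print(f"exponential population: {skewed_rate:.3f}")
```

For the normal population the rate should land near the nominal 0.05, while for the strongly skewed exponential population with only 10 observations it should come out noticeably higher. Rerunning with a larger n (and the matching critical value) shows how the discrepancy shrinks as the sample size grows.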
Unless the sample size is small (less than 10), light-tailedness or heavy-tailedness will have little effect on the t statistic. Light-tailedness will tend to increase the chance that an incorrectly small P value will be reported (i.e., that the null hypothesis will be rejected when it is in fact true). Heavy-tailedness will tend to increase the chance that an incorrectly large P value will be reported (i.e., that the null hypothesis will not be rejected when it is in fact false), making the test conservative.
Robust statistical tests operate well across a wide variety of distributions. A test can be robust for validity, meaning that it provides P values close to the true ones in the presence of (slight) departures from its assumptions. It may also be robust for efficiency, meaning that it maintains its statistical power (the probability that a true violation of the null hypothesis will be detected by the test) in the presence of those departures. The t test is fairly robust for validity against nonnormality, but it may not be the most powerful test available for a given nonnormal distribution, although it is the most powerful test available when its test assumptions are met. In the case of nonnormality, a nonparametric test or employing a transformation may result in a more powerful test.
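As one illustration of a nonparametric alternative, the sketch below uses the sign test, chosen here only as a simple example (the text above does not name a specific test). The sign test addresses the population median rather than the mean, and the data, hypothesized value, and helper names are all made up for illustration.

```python
import math

def t_statistic(sample, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n))."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu0) / math.sqrt(variance / n)

def sign_test_p_value(sample, hypothesized_median):
    """Two-sided exact sign test P value; values equal to the median are dropped."""
    above = sum(1 for x in sample if x > hypothesized_median)
    below = sum(1 for x in sample if x < hypothesized_median)
    n = above + below
    k = min(above, below)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for a two-sided test.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical sample: every value lies above 5.0, and one is an outlier.
sample = [5.1, 5.3, 5.2, 5.2, 5.4, 5.1, 5.2, 5.3, 5.6, 9.0]

t_value = t_statistic(sample, 5.0)
p_sign = sign_test_p_value(sample, 5.0)
print(f"t statistic: {t_value:.2f} (two-sided 5% critical value for 9 df is 2.262)")
print(f"sign test P value: {p_sign:.4f}")
```

In this made-up case the outlier inflates the variance enough that the t statistic stays below its critical value, while the sign test, which uses only the direction of each difference, gives a small P value. Which test is more powerful in practice depends on the actual population distribution.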
Even if none of the test assumptions are violated, a t test with a small sample size may not have sufficient power to detect a departure of the population mean from the hypothesized value, even when such a departure exists. The power curve presented in the results of the t test indicates how likely the test would be to detect an actual difference between the hypothesized mean and the population mean. The shallower the power curve, the bigger the actual difference would have to be before the t test would detect it. The power depends on the variance, the selected significance level (alpha) of the test, and the sample size. Power decreases as the variance increases, decreases as the significance level is decreased (i.e., as the test is made more stringent), and increases as the sample size increases.
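The way power moves with variance, significance level, and sample size can be sketched with a normal approximation. This is an approximation, not the exact power of the t test (it treats the standard deviation as known and is somewhat optimistic for small samples), and the function name and parameter values are assumptions for illustration.

```python
from statistics import NormalDist

def approx_power(delta, sigma, n, alpha=0.05):
    """Approximate two-sided power for detecting a true difference delta.

    delta: true mean minus hypothesized mean
    sigma: population standard deviation (treated as known)
    n:     sample size
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(delta) * n ** 0.5 / sigma
    # Probability that the standardized statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

base = approx_power(delta=0.5, sigma=1.0, n=20)
more_variance = approx_power(delta=0.5, sigma=2.0, n=20)
stricter_alpha = approx_power(delta=0.5, sigma=1.0, n=20, alpha=0.01)
larger_sample = approx_power(delta=0.5, sigma=1.0, n=40)

print(f"baseline:         {base:.2f}")
print(f"larger variance:  {more_variance:.2f}")
print(f"alpha = 0.01:     {stricter_alpha:.2f}")
print(f"n = 40:           {larger_sample:.2f}")
```

Evaluating approx_power over a range of delta values traces out an approximate power curve; a flatter curve means a larger true difference is needed before the test is likely to detect it.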
A very small sample from a population with a mean very different from the hypothesized value may not result in a significant t test statistic unless the sample variance is small.
If a statistical significance test with small sample sizes produces a surprisingly non-significant P value, then a lack of power may be the reason. The best time to avoid such problems is in the design stage of an experiment, when appropriate minimum sample sizes can be determined, perhaps in consultation with a statistician, before data collection begins.
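A rough minimum sample size for the design stage can be sketched with the usual normal-approximation formula n = ((z_alpha + z_power) * sigma / delta)^2. The function name, the default values, and the example inputs are assumptions for illustration; because the formula treats the standard deviation as known, it tends to give a slightly smaller n than a calculation based on the t distribution.

```python
import math
from statistics import NormalDist

def approx_sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n for a two-sided one-sample test (normal approximation).

    delta: smallest difference from the hypothesized mean worth detecting
    sigma: anticipated population standard deviation
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(((z_alpha + z_power) * sigma / delta) ** 2)

# Hypothetical design: detect a shift of half a standard deviation
# with 80% power at the 5% significance level.
n_needed = approx_sample_size(delta=0.5, sigma=1.0)
print(n_needed)
```

Under these assumed inputs the formula gives 32 observations. The anticipated standard deviation usually has to come from pilot data or earlier studies, so the result is best treated as a starting point.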
©1996 BBN Corporation All rights reserved.