PROPHET StatGuide: Possible alternatives if your data violate F test assumptions
If the data for one or both of the samples to be analyzed by an F test
come from a population whose distribution violates the assumption of
normality, or if outliers are present, then the F test on the original
data may give misleading results. A nonparametric or robust test may
provide a better analysis. Alternatively, simply examining the data
graphically may suffice.
Unless you have other reasons to transform the data to get to your
true variable of interest (such as being actually interested in
speed of performing a task instead of time to its completion, which
would suggest taking reciprocals of the collected time data),
transformation is generally not appropriate for dealing with nonnormality
in comparing two sample variances. A transformation that cures the
nonnormality problem often results in making the variances more equal,
defeating the purpose of the test!
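The effect described above can be seen in a small sketch. The data are
made up for illustration: the second sample is simply the first one
multiplied by 3, so its variance on the raw scale is about 9 times
larger, yet after a log transformation the two variances are
essentially identical.

```python
import math
import statistics

# Hypothetical data: sample_b is sample_a multiplied by 3, so its
# spread on the raw scale is three times larger.
sample_a = [1.2, 1.9, 2.4, 3.1, 4.6, 7.8]
sample_b = [3 * value for value in sample_a]

# Ratio of sample variances on the original scale (about 9).
raw_ratio = statistics.variance(sample_b) / statistics.variance(sample_a)

# Taking logs turns the factor of 3 into an additive shift, which
# leaves the variance unchanged, so the ratio drops to about 1.
log_a = [math.log(value) for value in sample_a]
log_b = [math.log(value) for value in sample_b]
log_ratio = statistics.variance(log_b) / statistics.variance(log_a)

print(f"variance ratio, raw scale: {raw_ratio:.2f}")
print(f"variance ratio, log scale: {log_ratio:.2f}")
```

A test run on the log-transformed values would therefore have nothing
left to detect, even though the spreads of the original measurements
differ markedly.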
Alternative procedures:
- Looking for variance differences without tests:
- The sample variance is sensitive to outliers. Other sample
statistics, such as the interquartile range, may give an idea of the
variation in either sample without being affected by outliers. If the
sample interquartile ranges are similar but the sample variances are
quite different, an outlier in one or both of the samples may be the
cause. It is also possible for outliers to make two sample variances
similar while the interquartile ranges differ. In general, when the
two measures of dispersion disagree, outliers in one or both of the
samples may be the reason.
Side-by-side boxplots of the two samples can reveal suspected
outliers, and can suggest a difference between the two variances if
one boxplot is much longer than the other.
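As a rough illustration of comparing the two dispersion measures, the
following sketch uses invented data in which the second sample
contains a single extreme value. The helper name iqr and both samples
are assumptions made for the example, not part of any PROPHET
procedure.

```python
import statistics

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical samples; the last value of sample_b is an outlier.
sample_a = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 6.6]
sample_b = [4.0, 4.7, 5.1, 5.3, 5.4, 6.0, 6.2, 19.0]

var_a = statistics.variance(sample_a)
var_b = statistics.variance(sample_b)
iqr_a = iqr(sample_a)
iqr_b = iqr(sample_b)

print(f"variances:            {var_a:.2f} vs {var_b:.2f}")
print(f"interquartile ranges: {iqr_a:.2f} vs {iqr_b:.2f}")

# Similar interquartile ranges combined with very different variances
# hint that an outlier, not a general difference in spread, is at work.
if var_b > 5 * var_a and 0.5 < iqr_b / iqr_a < 2:
    print("Dispersion measures disagree: check sample_b for outliers.")
```

The thresholds in the final check are arbitrary and serve only to make
the disagreement visible; in practice a boxplot of the two samples
would show the same thing at a glance.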
- Nonparametric tests:
- Nonparametric tests are tests that do not make the usual
distributional assumptions of the normal-theory-based tests.
For the F test, the most common nonparametric alternative is the
Ansari-Bradley test.
Although the Ansari-Bradley test does not assume normality of the two
sampled populations, it does assume either that the two populations
have the same unknown median, or that both population medians are
known, so that each population median can be subtracted beforehand
from the values in the corresponding sample. Otherwise, the test is no
longer distribution-free, even if the sample median is subtracted from
the values in each corresponding sample.
Also, as with the F test, it is assumed that the two samples are
independent of each other, and that there is independence within each
sample.
If the sampled values do indeed come from populations with normal
distributions, then the F test is the most powerful test of the
equality of the two variances, meaning that no other test is more
likely to detect an actual difference between the two variances.
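A minimal sketch of the Ansari-Bradley idea follows. Each value in the
pooled sample is scored by its rank counted from the nearer end, so
values near the middle get high scores, and the statistic is the sum
of the scores for the first sample. The P value here comes from a
Monte Carlo permutation of the group labels rather than from tables or
a normal approximation; that substitution, the function names, and the
data (constructed to have nearly equal medians) are all assumptions
made for illustration.

```python
import random

def ansari_bradley_statistic(x, y):
    """Sum of Ansari-Bradley scores for the values of x in the pooled sample."""
    pooled = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    n = len(pooled)
    # Score for rank r (1-based) is its distance from the nearer end.
    scores = [min(rank, n + 1 - rank) for rank in range(1, n + 1)]
    total = 0.0
    start = 0
    while start < n:
        # Tied values share the average of the scores they occupy.
        stop = start
        while stop < n and pooled[stop][0] == pooled[start][0]:
            stop += 1
        mean_score = sum(scores[start:stop]) / (stop - start)
        x_count = sum(1 for k in range(start, stop) if pooled[k][1] == 0)
        total += mean_score * x_count
        start = stop
    return total

def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-sided Monte Carlo P value obtained by shuffling group labels."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n, m = len(pooled), len(x)
    # Under the null hypothesis the expected statistic is m times the mean score.
    expected = m * sum(min(r, n + 1 - r) for r in range(1, n + 1)) / n
    observed_gap = abs(ansari_bradley_statistic(x, y) - expected)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = ansari_bradley_statistic(pooled[:m], pooled[m:])
        if abs(stat - expected) >= observed_gap:
            hits += 1
    return (hits + 1) / (n_permutations + 1)

# Hypothetical samples with nearly equal medians but different spread.
tight = [9.6, 9.8, 9.9, 10.0, 10.05, 10.1, 10.2, 10.4]
wide = [6.0, 7.5, 8.8, 9.95, 10.15, 11.5, 12.9, 14.2]

statistic = ansari_bradley_statistic(tight, wide)
p_value = permutation_p_value(tight, wide)
print(f"Ansari-Bradley statistic: {statistic}, permutation P value: {p_value:.4f}")
```

A statistic well above its null expectation indicates that the first
sample is concentrated near the center of the pooled data, that is,
that it is less dispersed than the second sample.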
- Robust tests:
- Robust statistical tests perform well across a wide variety of
distributions. A test can be robust for validity, meaning that it
provides P values close to the true ones in the presence of (slight)
departures from its assumptions. It may also be robust for efficiency,
meaning that it maintains its statistical power (the probability that
a true violation of the null hypothesis will be detected by the test)
in the presence of those departures.
Levene's test is reasonably robust for validity against nonnormality.
Another test, developed by Box and Andersen, may be more powerful than
Levene's test when the nonnormality is caused by heavy-tailedness.
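As a sketch of how Levene's test works: each observation is replaced
by its absolute deviation from the center of its own sample, and a
one-way analysis of variance F statistic is then computed on those
deviations. The function name levene_statistic and the data below are
assumptions for illustration. Centering on the median instead of the
mean gives the variant commonly called the Brown-Forsythe test.

```python
import statistics

def levene_statistic(samples, center=statistics.mean):
    """Levene's W: one-way ANOVA F computed on absolute deviations."""
    deviations = []
    for sample in samples:
        c = center(sample)
        deviations.append([abs(value - c) for value in sample])

    k = len(deviations)                       # number of groups
    n_total = sum(len(d) for d in deviations)
    grand_mean = sum(sum(d) for d in deviations) / n_total
    group_means = [sum(d) / len(d) for d in deviations]

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(d) * (gm - grand_mean) ** 2
                  for d, gm in zip(deviations, group_means))
    within = sum((z - gm) ** 2
                 for d, gm in zip(deviations, group_means)
                 for z in d)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical samples with similar centers but different spread.
tight = [9.6, 9.8, 9.9, 10.0, 10.05, 10.1, 10.2, 10.4]
wide = [6.0, 7.5, 8.8, 9.95, 10.15, 11.5, 12.9, 14.2]

w_mean = levene_statistic([tight, wide])
w_median = levene_statistic([tight, wide], center=statistics.median)
print(f"Levene W (mean-centered):   {w_mean:.2f}")
print(f"Levene W (median-centered): {w_median:.2f}")
```

Under the null hypothesis of equal variances, W is referred to an F
distribution with k - 1 and N - k degrees of freedom, here 1 and 14.
The upper 5% point of that distribution is about 4.60, so a value of W
above it would suggest unequal variances at the 5% level.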
Last modified: March 14, 1997
©1996 BBN Corporation. All rights reserved.