PROPHET StatGuide: Possible alternatives if your data violate rank sum test assumptions
If the data for one or both of the samples to be analyzed by a rank sum test
come from a population whose distribution
violates the assumption of similar
distributional shape, or if
outliers are present,
then the rank sum test on the original data may provide misleading
results, or may not be the most powerful test available.
In such cases, transforming
the data to promote normality and then performing a
two-sample t test,
or using another nonparametric test
may provide a better analysis.
- Transformations (a single function applied to each
data value) can be applied to correct problems of
nonnormality or unequal dispersions.
Transforming the two samples to remedy nonnormality often results
in correcting heteroscedasticity (unequal dispersions).
If such a transformation can be found, the transformed
data may be suitable for use with a
two-sample t test.
The resulting test may be more powerful than
the original rank sum test.
The same transformation
should be applied to both samples.
Unless theory suggests a specific transformation a priori,
transformations are usually chosen from the "power family"
of transformations, where each value is replaced by
x**p, where p is an integer or half-integer, usually one of:
- -2 (reciprocal square)
- -1 (reciprocal)
- -0.5 (reciprocal square root)
- 0 (log transformation)
- 0.5 (square root)
- 1 (leaving the data untransformed)
- 2 (square)
For p = -0.5 (reciprocal square root),
0, or 0.5 (square root), the data values must all be
positive. To use these transformations when there
are negative and positive values,
a constant can be added to all the data values
such that the smallest is greater than 0 (say,
such that the smallest value is 1). (If all
the data values are negative, the data can
instead be multiplied by -1, but note that
in this situation, data suggesting skewness
to the right
would now become data suggesting skewness to the left.)
To preserve the order of the original data
in the transformed data, if the value of p is
negative, the transformed data are
multiplied by -1.0; e.g., for p = -1,
the data are transformed as x --> -1.0/x.
Taking logs or square roots tends to "pull in"
values greater than 1 relative to values less
than 1, which is useful in correcting skewness
to the right. Transformation involves changing
the metric in which the data are analyzed, which
may make interpretation of the results difficult if the
transformation is complicated. If you are unfamiliar
with transformations, you may wish to consult a
statistician before proceeding.
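The power-family rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original StatGuide: the function names `shift_both` and `power_transform` are hypothetical, p = 0 is taken to mean the natural log as in the list above, and the shift constant is computed from both samples together so that the same transformation is applied to each.

```python
import math

def shift_both(sample_a, sample_b, target_min=1.0):
    """Add one shared constant to both samples so the smallest value
    overall equals target_min (hypothetical helper)."""
    offset = target_min - min(min(sample_a), min(sample_b))
    return [v + offset for v in sample_a], [v + offset for v in sample_b]

def power_transform(values, p):
    """Apply x -> x**p from the power family, preserving order.

    p == 0 is treated as the log transformation. For negative p the
    result is multiplied by -1 so that larger inputs stay larger.
    """
    if p in (-0.5, 0, 0.5) and any(v <= 0 for v in values):
        raise ValueError("values must all be positive for p = -0.5, 0, or 0.5")
    out = []
    for v in values:
        if p == 0:
            out.append(math.log(v))
        elif p < 0:
            out.append(-(v ** p))   # e.g. p = -1 gives x -> -1.0/x
        else:
            out.append(v ** p)
    return out

# Made-up data containing negative and positive values.
a, b = shift_both([-3.0, 0.0, 5.0], [2.0, 21.0])
a_sqrt = power_transform(a, 0.5)   # same p for both samples
b_sqrt = power_transform(b, 0.5)
print(a_sqrt, b_sqrt)
```

Because the shift and the power are identical for both samples, the ordering of all values across the two samples is unchanged, so a rank sum test on the transformed data would give the same result as on the original data; the transformation matters for the t test.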
- Other nonparametric tests:
- Although the rank sum test is the most commonly used
alternative to the unpaired two-sample t test,
it is not the only one. However, all of these tests assume that the
two samples are independent
of each other, and that there is
independence within each sample.
A median test can be calculated
by creating a 2x2 contingency table
of counts of the values in each sample that are greater or not greater
than the median of both samples together. Then
this contingency table can be tested by a
chi-square test or Fisher's exact test.
This test does not assume
equality of dispersions,
but is likely to be less powerful
than the rank sum test when the dispersions are in fact comparable.
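The median test described above can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the procedure PROPHET itself uses: the function names are hypothetical, values equal to the combined median are counted as "not greater", and the two-sided Fisher p-value is computed by summing the probabilities of all tables no more probable than the observed one, which is one common convention among several.

```python
from math import comb
from statistics import median

def median_test_table(sample_a, sample_b):
    """Build the 2x2 table of counts greater / not greater than the
    median of both samples taken together."""
    grand_median = median(list(sample_a) + list(sample_b))
    a_above = sum(1 for v in sample_a if v > grand_median)
    b_above = sum(1 for v in sample_b if v > grand_median)
    return [[a_above, len(sample_a) - a_above],
            [b_above, len(sample_b) - b_above]]

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table
    (sums hypergeometric probabilities <= that of the observed table)."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Probability that the top-left cell equals k with margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Made-up samples with no overlap at all.
table = median_test_table([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(table, fisher_exact_two_sided(table))
```

For larger samples, the chi-square test on the same table is the usual alternative to the exact computation.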
- Unpaired two-sample t test
- If the sampled values do indeed come from populations
with normal distributions,
then the unpaired
two-sample t test is the most
powerful test of the equality
of the two means, meaning that no other test is more likely
to detect an actual difference between the two means.
(If a distribution is symmetric, its mean and median
are both equal to the center of symmetry. Since the
normal distribution is symmetric, the t test can also
be viewed as testing whether the difference between the two population
medians is 0, if the normality assumption holds.)
If the population distributions are not normal, however,
the rank sum test may be more powerful at detecting
differences between the population medians.
If applying a transformation
promotes normality, the unpaired two-sample t test
may be a more powerful test than the rank sum test for
the transformed data.
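As a rough illustration of running the unpaired two-sample t test on transformed data, the sketch below log-transforms two made-up right-skewed samples and computes the pooled-variance t statistic and its degrees of freedom. The function name `pooled_t_statistic` and the data are assumptions for illustration only. The Python standard library has no t distribution, so the p-value is not computed here; the statistic would be compared against a t table or passed to a statistics package.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(sample_a, sample_b):
    """Return (t, df) for the unpaired two-sample t test with a pooled
    variance estimate, which assumes equal population variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df

# Made-up right-skewed samples; the same log transformation is
# applied to both before the t statistic is computed.
raw_a = [1.2, 1.9, 2.4, 3.1, 9.8]
raw_b = [2.5, 4.0, 5.7, 8.3, 30.1]
log_a = [math.log(v) for v in raw_a]
log_b = [math.log(v) for v in raw_b]

t_stat, df = pooled_t_statistic(log_a, log_b)
print(t_stat, df)
```

A test on log-transformed data compares means on the log scale, so the conclusion is about the transformed metric rather than the original one.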
Last modified: March 18, 1997
©1996 BBN Corporation. All rights reserved.